Analysis of Covariance
Linear Model
Hypotheses
Assumptions
Selecting a Covariate
Schematic with Example Data
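a1 | | a2 | | a3 | | a4 | |
Y | X | Y | X | Y | X | Y | X |
3 | 42 | 4 | 47 | 7 | 61 | 7 | 65 |
6 | 57 | 5 | 49 | 8 | 65 | 8 | 74 |
3 | 33 | 4 | 42 | 7 | 64 | 9 | 80 |
3 | 47 | 3 | 41 | 6 | 56 | 8 | 73 |
1 | 32 | 2 | 38 | 5 | 52 | 10 | 85 |
2 | 35 | 3 | 43 | 6 | 58 | 10 | 82 |
2 | 33 | 4 | 48 | 5 | 53 | 9 | 78 |
2 | 39 | 3 | 45 | 6 | 54 | 11 | 89 |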
ANCOVA Summary Table
Row | Source | SS | df | MS | F | Error Term |
1 | Covariate | 33.950 | 1 | 33.950 | 130.09 | [3] |
2 | A | 1.793 | 3 | 0.598 | 2.29 | [3] |
3 | Error | 7.047 | 27 | 0.261 | | |
 | Adj Total | 8.840 | 30 | | | |
 | Grand Total | 235.500 | 31 | | | |
For comparison, the one-way ANOVA of Y that ignores the covariate:
Source | SS | df | MS | F |
A | 194.5 | 3 | 64.833 | 44.28 |
Error | 41.0 | 28 | 1.464 | |
Total | 235.5 | 31 | | |
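The entries in the two summary tables are tied together by simple arithmetic, which can be checked with Stata's display command. This is only a sketch using the rounded values printed in the tables, so the results agree with the tabled F-ratios to about two decimals rather than exactly.

```stata
* Arithmetic check of the summary tables (rounded table values)
display 1.793 + 7.047             // adjusted total SS = SS(A) + SS(Error)
display (1.793/3) / (7.047/27)    // F for A = MS(A) / MS(Error)
display 33.950 / (7.047/27)       // F for the covariate = MS(Covariate) / MS(Error)
display (7.047/27) / (41.0/28)    // ANCOVA error MS relative to ANOVA error MS
```

The last ratio compares the two error mean squares (0.261 versus 1.464): with the covariate in the model the error mean square is less than a fifth of its ANOVA value.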
Comparing ANCOVA with Randomized Block Designs
Some Stata Tricks
One Factor Design with One Covariate:
anova y a | analysis of variance
anova y x a | analysis of covariance
anova y x a x*a | tests homogeneity of slopes
Two Factor Design with One Covariate:
anova y a b a*b | analysis of variance
anova y x a b a*b | analysis of covariance
anova y x a b a*b x*a*b | tests homogeneity of slopes
One Factor Design with Two Covariates:
anova y a | analysis of variance
anova y x z a | analysis of covariance
anova y x a x*a | homogeneity of x slopes
anova y z a z*a | homogeneity of z slopes
Two Factor Design with Two Covariates:
anova y a b a*b | analysis of variance
anova y x z a b a*b | analysis of covariance
anova y x a b a*b x*a*b | homogeneity of x slopes
anova y z a b a*b z*a*b | homogeneity of z slopes
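Note: The commands above are written in the older anova syntax, so don't forget the cont option on the ANCOVA commands to declare the covariates, e.g. anova y x a, cont(x). The Stata examples that follow instead mark each covariate with the c. prefix (c.x) and write interactions with #.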
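The same models can be written in the factor-variable notation used in the Stata examples later in these notes. The following is a sketch, assuming Stata 11 or later and the variable names y, x, a and b from the table above; it is a translation of the table, not additional output from the course.

```stata
* One factor design with one covariate (factor-variable notation)
anova y a                        // analysis of variance
anova y a c.x                    // analysis of covariance
anova y a c.x a#c.x              // tests homogeneity of slopes

* Two factor design with one covariate
anova y a b a#b                  // analysis of variance
anova y a b a#b c.x              // analysis of covariance
anova y a b a#b c.x c.x#a#b      // tests homogeneity of slopes
```

With the c. prefix the covariate is declared inside the model specification itself, so no separate option is needed to identify it as continuous.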
Stata Example
input x y a x1 x2 x3
42 3 1 1 1 1
57 6 1 1 1 1
33 3 1 1 1 1
47 3 1 1 1 1
32 1 1 1 1 1
35 2 1 1 1 1
33 2 1 1 1 1
39 2 1 1 1 1
47 4 2 -1 1 1
49 5 2 -1 1 1
42 4 2 -1 1 1
41 3 2 -1 1 1
38 2 2 -1 1 1
43 3 2 -1 1 1
48 4 2 -1 1 1
45 3 2 -1 1 1
61 7 3 0 -2 1
65 8 3 0 -2 1
64 7 3 0 -2 1
56 6 3 0 -2 1
52 5 3 0 -2 1
58 6 3 0 -2 1
53 5 3 0 -2 1
54 6 3 0 -2 1
65 7 4 0 0 -3
74 8 4 0 0 -3
80 9 4 0 0 -3
73 8 4 0 0 -3
85 10 4 0 0 -3
82 10 4 0 0 -3
78 9 4 0 0 -3
89 11 4 0 0 -3
end

anova y a c.x

                  Number of obs =      32     R-squared     =  0.9701
                  Root MSE      = .510876     Adj R-squared =  0.9656

     Source |  Partial SS    df       MS           F     Prob > F
 -----------+----------------------------------------------------
      Model |  228.453154     4  57.1132885     218.83     0.0000
            |
          a |  1.79283521     3  .597611737       2.29     0.1010
          x |  33.9531542     1  33.9531542     130.09     0.0000
            |
   Residual |  7.04684582    27   .26099429
 -----------+----------------------------------------------------
      Total |       235.5    31  7.59677419

margins a, asbalanced

Predictive margins                                Number of obs   =         32

Expression   : Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           a |
          1  |   5.310127   .2881078    18.43   0.000     4.745446    5.874807
          2  |   5.325664   .2413402    22.07   0.000     4.852646    5.798682
          3  |   5.767353   .1855126    31.09   0.000     5.403755    6.130951
          4  |   5.096856   .3869503    13.17   0.000     4.338448    5.855265
------------------------------------------------------------------------------

/* pairwise comparisons using anovalator */
/* these tests have not been adjusted for multiplicity */

anovalator a, pair quietly

anovalator pairwise comparisons for a

Comparison       Coef.    Std. Err.       z     P>|z|     [95% Conf. Interval]
1 vs 2       -.0155375     .26343      -.059    0.953     -.5318595   .5007846
1 vs 3        -.457227    .369347      -1.24    0.216     -1.181147   .2666942
1 vs 4          .21327    .621578       .343    0.732     -1.005023   1.431564
2 vs 3        -.441689    .325894      -1.36    0.175     -1.080441   .1970623
2 vs 4         .228808    .563495       .406    0.685     -.8756423   1.333258
3 vs 4         .670497    .393934        1.7    0.089     -.1016129   1.442607

/* test for homogeneity of regression slopes */

anova y a c.x a#c.x

                  Number of obs =      32     R-squared     =  0.9719
                  Root MSE      = .525009     Adj R-squared =  0.9637

     Source |  Partial SS    df       MS           F     Prob > F
 -----------+----------------------------------------------------
      Model |  228.884782     7  32.6978259     118.63     0.0000
            |
          a |  .355072259     3   .11835742       0.43     0.7338
          x |  25.8488494     1  25.8488494      93.78     0.0000
        a#x |  .431627333     3  .143875778       0.52     0.6713
            |
   Residual |  6.61521849    24  .275634104
 -----------+----------------------------------------------------
      Total |       235.5    31  7.59677419
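The pairwise comparisons in the example are flagged as unadjusted for multiplicity. One possible way to obtain adjusted comparisons is Stata's official pwcompare command; this is a sketch that assumes Stata 12 or later and is not part of the original analysis. The Bonferroni method is chosen here only for illustration.

```stata
* Refit the ANCOVA so that it is the active estimation result
quietly anova y a c.x

* Pairwise comparisons among the levels of a from the fitted ANCOVA,
* with Bonferroni-adjusted p-values and confidence intervals
pwcompare a, effects mcompare(bonferroni)
```

The model is refit first because the homogeneity-of-slopes model was the last one estimated, and pwcompare works from whatever estimation results are currently active.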
Stata Example Continued
regress y x x1 x2 x3

  Source |       SS       df       MS              Number of obs =      32
---------+------------------------------           F(  4,    27) =  218.83
   Model |  228.453154     4  57.1132885           Prob > F      =  0.0000
Residual |  7.04684582    27   .26099429           R-squared     =  0.9701
---------+------------------------------           Adj R-squared =  0.9656
   Total |      235.50    31  7.59677419           Root MSE      =  .51088

[remainder of output omitted]

regress y x

  Source |       SS       df       MS              Number of obs =      32
---------+------------------------------           F(  1,    30) =  769.24
   Model |  226.660319     1  226.660319           Prob > F      =  0.0000
Residual |  8.83968103    30  .294656034           R-squared     =  0.9625
---------+------------------------------           Adj R-squared =  0.9612
   Total |      235.50    31  7.59677419           Root MSE      =  .54282

[remainder of output omitted]

regress y x1 x2 x3

  Source |       SS       df       MS              Number of obs =      32
---------+------------------------------           F(  3,    28) =   44.28
   Model |      194.50     3  64.8333333           Prob > F      =  0.0000
Residual |       41.00    28  1.46428571           R-squared     =  0.8259
---------+------------------------------           Adj R-squared =  0.8072
   Total |      235.50    31  7.59677419           Root MSE      =  1.2101

[remainder of output omitted]

Regression Results Summarized
Model: M0 (x x1 x2 x3) | R-square 0.9701
Model: M1 (x) | R-square 0.9625
Model: M2 (x1 x2 x3) | R-square 0.8259
F-ratios Using Regression
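Covariate: compare the full model M0 (x x1 x2 x3) with M2 (x1 x2 x3), the model that omits x:

\[ F = \frac{(R^2_{M0} - R^2_{M2})/1}{(1 - R^2_{M0})/27} = \frac{(0.9701 - 0.8259)/1}{(1 - 0.9701)/27} \approx 130.2 \]

with 1 and 27 degrees of freedom (130.09 when computed from the unrounded R-square values).

Factor A: compare the full model M0 with M1 (x), the model that omits x1, x2 and x3:

\[ F = \frac{(R^2_{M0} - R^2_{M1})/3}{(1 - R^2_{M0})/27} = \frac{(0.9701 - 0.9625)/3}{(1 - 0.9701)/27} \approx 2.29 \]

with 3 and 27 degrees of freedom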
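The two F-ratios can also be computed in Stata from the results that regress stores, which avoids the rounding in the printed R-square values. This is a minimal sketch, assuming the data from the Stata Example are in memory; the scalar names (r2_m0, dfe_m0, r2_m1, r2_m2) are illustrative and not from the original notes.

```stata
* Full model M0: covariate plus the three coded group vectors
quietly regress y x x1 x2 x3
scalar r2_m0  = e(r2)       // R-square of the full model
scalar dfe_m0 = e(df_r)     // residual degrees of freedom of the full model (27)

* M1: covariate only
quietly regress y x
scalar r2_m1 = e(r2)

* M2: group vectors only
quietly regress y x1 x2 x3
scalar r2_m2 = e(r2)

* F for the covariate (1 numerator df): M0 versus M2
display "F(covariate) = " ((r2_m0 - r2_m2)/1) / ((1 - r2_m0)/dfe_m0)

* F for factor A (3 numerator df): M0 versus M1
display "F(A)         = " ((r2_m0 - r2_m1)/3) / ((1 - r2_m0)/dfe_m0)
```

Both ratios use the error term of the full model, which is why the denominator degrees of freedom are 27 in each case.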
Example with Two Covariates
input id y c1 c2 grp
1 6 1 6 1
2 9 1 7 1
3 8 2 15 1
4 8 3 13 1
5 12 3 18 1
6 12 4 9 1
7 10 4 16 1
8 8 5 10 1
9 12 5 16 1
10 13 6 18 1
11 13 4 12 2
12 16 4 12 2
13 15 5 17 2
14 16 6 9 2
15 19 6 20 2
16 17 8 18 2
17 19 8 16 2
18 23 9 20 2
19 19 10 10 2
20 22 10 17 2
21 20 7 8 3
22 22 7 14 3
23 24 9 11 3
24 26 9 11 3
25 24 10 16 3
26 25 11 20 3
27 28 11 19 3
28 27 12 19 3
29 29 13 12 3
30 26 13 16 3
31 27 7 16 4
32 28 8 10 4
33 25 8 13 4
34 27 9 7 4
35 31 9 15 4
36 29 10 20 4
37 32 10 16 4
38 30 12 21 4
39 32 12 15 4
40 33 14 21 4
end

tabstat y c1 c2, by(grp) stat(n mean sd) col(stat)

Summary for variables: y c1 c2
     by categories of: grp

     grp |         N      mean        sd
---------+------------------------------
       1 |        10       9.8  2.347576
         |        10       3.4  1.712698
         |        10      12.8  4.491968
---------+------------------------------
       2 |        10      17.9  3.107339
         |        10         7  2.309401
         |        10      15.1  4.040077
---------+------------------------------
       3 |        10      25.1  2.726414
         |        10      10.2   2.20101
         |        10      14.6  4.060651
---------+------------------------------
       4 |        10      29.4  2.633122
         |        10       9.9   2.18327
         |        10      15.4  4.599517
---------+------------------------------
   Total |        40     20.55  7.977372
         |        40     7.625  3.439495
         |        40    14.475  4.260658
----------------------------------------

anova y grp /* 0 covariates */

                  Number of obs =      40     R-squared     =  0.8929
                  Root MSE      = 2.71723     Adj R-squared =  0.8840

     Source |  Partial SS    df       MS           F     Prob > F
 -----------+----------------------------------------------------
      Model |      2216.1     3       738.7     100.05     0.0000
            |
        grp |      2216.1     3       738.7     100.05     0.0000
            |
   Residual |       265.8    36  7.38333333
 -----------+----------------------------------------------------
      Total |      2481.9    39  63.6384615

anova y grp c.c1 /* 1 covariate */

                  Number of obs =      40     R-squared     =  0.9594
                  Root MSE      = 1.69598     Adj R-squared =  0.9548

     Source |  Partial SS    df       MS           F     Prob > F
 -----------+----------------------------------------------------
      Model |  2381.22741     4  595.306852     206.97     0.0000
            |
        grp |  415.841199     3  138.613733      48.19     0.0000
         c1 |  165.127408     1  165.127408      57.41     0.0000
            |
   Residual |  100.672592    35  2.87635976
 -----------+----------------------------------------------------
      Total |      2481.9    39  63.6384615

anova y grp c.c1 c.c2 /* 2 covariates */

                  Number of obs =      40     R-squared     =  0.9624
                  Root MSE      = 1.65656     Adj R-squared =  0.9569

     Source |  Partial SS    df       MS           F     Prob > F
 -----------+----------------------------------------------------
      Model |  2388.59757     5  477.719513     174.08     0.0000
            |
        grp |  420.189396     3  140.063132      51.04     0.0000
         c1 |   98.974038     1   98.974038      36.07     0.0000
         c2 |  7.37015734     1  7.37015734       2.69     0.1105
            |
   Residual |  93.3024343    34  2.74418925
 -----------+----------------------------------------------------
      Total |      2481.9    39  63.6384615

margins grp, asbalanced

Predictive margins                                Number of obs   =         40

Expression   : Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         grp |
          1  |   13.78338   .7820854    17.62   0.000     12.25052    15.31624
          2  |   18.38456    .537869    34.18   0.000     17.33035    19.43876
          3  |   22.77973    .646886    35.21   0.000     21.51186     24.0476
          4  |   27.25234   .6098128    44.69   0.000     26.05713    28.44755
------------------------------------------------------------------------------

/* pairwise comparisons using anovalator */
/* these tests have not been adjusted for multiplicity */

anovalator grp, pair quietly

anovalator pairwise comparisons for grp

Comparison       Coef.    Std. Err.       z     P>|z|     [95% Conf. Interval]
1 vs 2        -4.60118     .88207      -5.22    0.000     -6.330038  -2.872323
1 vs 3        -8.99635    1.21035      -7.43    0.000     -11.36863  -6.624072
1 vs 4         -13.469    1.16021      -11.6    0.000     -15.74298  -11.19494
2 vs 3        -4.39517    .891386      -4.93    0.000     -6.142288  -2.648057
2 vs 4        -8.86778    .852674      -10.4    0.000     -10.53902  -7.196539
3 vs 4        -4.47261    .746185      -5.99    0.000      -5.93513  -3.010083

anova y grp c.c1 c.c1#grp /* check homogeneity of regression for c1 */

                  Number of obs =      40     R-squared     =  0.9598
                  Root MSE      = 1.76482     Adj R-squared =  0.9511

     Source |  Partial SS    df       MS           F     Prob > F
 -----------+----------------------------------------------------
      Model |  2382.23359     7  340.319084     109.27     0.0000
            |
        grp |  70.1635717     3  23.3878572       7.51     0.0006
         c1 |  152.279387     1  152.279387      48.89     0.0000
     grp#c1 |  1.00618243     3  .335394144       0.11     0.9550
            |
   Residual |  99.6664092    32  3.11457529
 -----------+----------------------------------------------------
      Total |      2481.9    39  63.6384615

anova y grp c.c2 c.c2#grp /* check homogeneity of regression for c2 */

                  Number of obs =      40     R-squared     =  0.9228
                  Root MSE      = 2.44624     Adj R-squared =  0.9060

     Source |  Partial SS    df       MS           F     Prob > F
 -----------+----------------------------------------------------
      Model |  2290.40886     7  327.201265      54.68     0.0000
            |
        grp |  182.057287     3  60.6857623      10.14     0.0001
         c2 |  73.6130056     1  73.6130056      12.30     0.0014
     grp#c2 |  .785330753     3  .261776918       0.04     0.9876
            |
   Residual |  191.491142    32  5.98409817
 -----------+----------------------------------------------------
      Total |      2481.9    39  63.6384615
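The example checks homogeneity of regression one covariate at a time, each in a model that leaves the other covariate out. An alternative, shown here only as a sketch and not as the approach taken in these notes, is to fit one model containing both covariates and both sets of slope-by-group terms and test those terms jointly. It assumes Stata 11 or later and the two-covariate data above in memory.

```stata
* Model with both covariates and separate slopes for each group
quietly regress y i.grp c.c1 c.c2 i.grp#c.c1 i.grp#c.c2

* Joint test that all slope-by-group coefficients are zero
* (3 terms for c1 plus 3 terms for c2)
testparm i.grp#c.c1 i.grp#c.c2
```

A nonsignificant joint test is consistent with the common-slopes model used in the analysis of covariance with two covariates.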
Linear Statistical Models Course
Phil Ender, 17sep10, 13may06, 11apr06, 25May00