It's a multivariate world after all...
There are many situations in which you will have access to more than one outcome variable. In those situations you have three options:
The Manova Linear Model
Hypotheses
Assumptions
1. The sets of observations are independent of one another.
2. The variables within each group come from multivariate normal populations.
3. The variance-covariance matrices for each group are equal in the population.
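Assumptions 2 and 3 can be examined empirically; assumption 1 is a matter of study design. The following is a minimal sketch, not part of the original notes. It assumes Stata 11 or later (where the mvtest commands are available) and uses the hsb2 data set that appears later in these notes. Keep in mind that Box's M test is itself sensitive to departures from normality, so treat its result as a guide rather than a verdict.

```stata
* Sketch only: check MANOVA assumptions on the hsb2 data used later in these notes
use http://www.philender.com/courses/data/hsb2, clear

* Assumption 3: Box's M test of equal variance-covariance matrices across groups
mvtest covariances read write math science, by(prog)

* Assumption 2: multivariate normality tests, run separately within each group
levelsof prog, local(groups)
foreach g of local groups {
    display _newline "prog = `g'"
    mvtest normality read write math science if prog == `g', stats(all)
}
```

The same two commands can be applied to y1, y2 and y3 by grp once the example data below have been entered.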
Schematic with Example Data
             Level
Group 1    Group 2    Group 3

  y11        y21        y31
  ...        ...        ...
  y1n        y2n        y3n

  y11        y21        y31
  ...        ...        ...
  y1n        y2n        y3n

  y11        y21        y31
  ...        ...        ...
  y1n        y2n        y3n
Stata Computer Example
input y1 y2 y3 grp
19.6  5.15  9.5 1
15.4  5.75  9.1 1
22.3  4.35  3.3 1
24.3  7.55  5.0 1
22.5  8.50  6.0 1
20.5 10.25  5.0 1
14.1  5.95 18.8 1
13.0  6.30 16.5 1
14.1  5.45  8.9 1
16.7  3.75  6.0 1
16.8  5.10  7.4 1
17.1  9.00  7.5 2
15.7  5.30  8.5 2
14.9  9.85  6.0 2
19.7  3.60  2.9 2
17.2  4.05  0.2 2
16.0  4.40  2.6 2
12.8  7.15  7.0 2
13.6  7.25  3.2 2
14.2  5.30  6.2 2
13.1  3.10  5.5 2
16.5  2.40  6.6 2
16.0  4.55  2.9 3
12.5  2.65  0.7 3
18.5  6.50  5.3 3
19.2  4.85  8.3 3
12.0  8.75  9.0 3
13.0  5.20 10.3 3
11.9  4.75  8.5 3
12.0  5.85  9.5 3
19.8  2.85  2.3 3
16.5  6.55  3.3 3
17.4  6.60  1.9 3
end

sort grp

by grp: summarize y1 y2 y3

-> grp= 1

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    18.11818   3.903797         13       24.3
      y2 |      11    6.190909   1.899713       3.75      10.25
      y3 |      11    8.681818   4.863089        3.3       18.8

-> grp= 2

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    15.52727   2.075616       12.8       19.7
      y2 |      11    5.581818   2.434263        2.4       9.85
      y3 |      11    5.109091   2.531187         .2        8.5

-> grp= 3

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    15.34545   3.138268       11.9       19.8
      y2 |      11    5.372727   1.759029       2.65       8.75
      y3 |      11    5.636364   3.546907         .7       10.3

manova y1 y2 y3 = grp

               Number of obs =      33

               W = Wilks' lambda      L = Lawley-Hotelling trace
               P = Pillai's trace     R = Roy's largest root

    Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
-----------+--------------------------------------------------
       grp | W   0.5258      2     6.0    56.0     3.54 0.0049 e
           | P   0.4767            6.0    58.0     3.02 0.0122 a
           | L   0.8972            6.0    54.0     4.04 0.0021 a
           | R   0.8920            3.0    29.0     8.62 0.0003 u
           |--------------------------------------------------
  Residual |                30
-----------+--------------------------------------------------
     Total |                32
--------------------------------------------------------------
               e = exact, a = approximate, u = upper bound on F

forvalues i=1/3 {
  display
  display "anova for y`i'"
  display
  anova y`i' grp
}

anova for y1

               Number of obs =      33     R-squared     =  0.1526
               Root MSE      = 3.13031     Adj R-squared =  0.0961

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  52.9242378     2  26.4621189       2.70     0.0835
           |
       grp |  52.9242378     2  26.4621189       2.70     0.0835
           |
  Residual |  293.965442    30  9.79884808
-----------+----------------------------------------------------
     Total |   346.88968    32  10.8403025

anova for y2

               Number of obs =      33     R-squared     =  0.0305
               Root MSE      = 2.05173     Adj R-squared = -0.0341

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  3.97515121     2   1.9875756       0.47     0.6282
           |
       grp |  3.97515121     2   1.9875756       0.47     0.6282
           |
  Residual |  126.287277    30  4.20957589
-----------+----------------------------------------------------
     Total |  130.262428    32  4.07070087

anova for y3

               Number of obs =      33     R-squared     =  0.1610
               Root MSE      = 3.76993     Adj R-squared =  0.1051

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  81.8296936     2  40.9148468       2.88     0.0718
           |
       grp |  81.8296936     2  40.9148468       2.88     0.0718
           |
  Residual |  426.370896    30  14.2123632
-----------+----------------------------------------------------
     Total |   508.20059    32  15.8812684
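The four statistics in the grp rows of the manova table are all functions of the same eigenvalues of the hypothesis matrix relative to the error matrix. With three outcome variables and three groups there are at most two nonzero eigenvalues. The sketch below is an illustration added to these notes, not part of the original session. It takes the rounded values printed in the table, treats Roy's largest root as the first eigenvalue, and recovers the second from the Lawley-Hotelling trace. The scalar names are made up for the illustration.

```stata
* Values copied from the grp rows of the manova table (rounded to 4 places)
scalar ev_roy    = 0.8920
scalar ev_lawley = 0.8972

* Roy's largest root is the first eigenvalue; the trace is the sum of both
scalar ev_lam1 = ev_roy
scalar ev_lam2 = ev_lawley - ev_roy

* Wilks' lambda is the product of 1/(1 + eigenvalue)
scalar ev_wilks  = 1/((1 + ev_lam1)*(1 + ev_lam2))

* Pillai's trace is the sum of eigenvalue/(1 + eigenvalue)
scalar ev_pillai = ev_lam1/(1 + ev_lam1) + ev_lam2/(1 + ev_lam2)

display "Wilks  = " %6.4f ev_wilks
display "Pillai = " %6.4f ev_pillai
```

Because the inputs are rounded, the results should agree with the table's 0.5258 and 0.4767 to about three decimal places. The first eigenvalue carries nearly all of the group separation here, which is why Roy's largest root gives the smallest p-value of the four tests.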
Interestingly, all four multivariate tests were significant (p < .05), but none of the univariate F-ratios were (p = .0835, .6282 and .0718). Simultaneous confidence intervals are a better tool for locating the multivariate effects.
simulci y1 y2 y3, by(grp) cv(.31)

s=2  m=0  n=13  cv= .31
group variable: grp

pairwise                        simultaneous
comparison        difference    confidence intervals

dv: y1
grp 1 vs grp 2      2.591*       0.292     4.889
grp 1 vs grp 3      2.773*       0.474     5.071
grp 2 vs grp 3      0.182       -2.117     2.480

dv: y2
grp 1 vs grp 2      0.609       -0.897     2.116
grp 1 vs grp 3      0.818       -0.688     2.325
grp 2 vs grp 3      0.209       -1.297     1.716

dv: y3
grp 1 vs grp 2      3.573*       0.805     6.341
grp 1 vs grp 3      3.045*       0.277     5.814
grp 2 vs grp 3     -0.527       -3.295     2.241
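These results show that y1 and y3 differ significantly between groups 1 and 2 and between groups 1 and 3; the starred differences are the ones whose intervals exclude zero. Groups 2 and 3 do not differ significantly on any variable, and y2 shows no significant differences at all.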
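The s, m and n values that simulci prints match the conventional parameters of the greatest characteristic root distribution, computed from the number of outcome variables, the number of groups and the total sample size. The sketch below reproduces them for this example. It is an added illustration that assumes simulci follows this convention, which is consistent with the values shown for both this example and the HSB2 example in these notes. The scalar names are hypothetical, and the cv() value is supplied by the user rather than computed here.

```stata
* Design constants for the first example (names are illustrative only)
scalar gcr_p = 3
scalar gcr_k = 3
scalar gcr_N = 33

* gcr_p = number of outcome variables, gcr_k = number of groups,
* gcr_N = total number of observations
scalar gcr_s = min(gcr_p, gcr_k - 1)
scalar gcr_m = (abs(gcr_p - (gcr_k - 1)) - 1)/2
scalar gcr_n = (gcr_N - gcr_k - gcr_p - 1)/2

display "s = " gcr_s "  m = " gcr_m "  n = " gcr_n
```

Setting gcr_p to 4 and gcr_N to 200 gives s = 2, m = .5 and n = 96, the values reported for the HSB2 example.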
Example Using HSB2
use http://www.philender.com/courses/data/hsb2, clear

manova read write math science = prog

               Number of obs =     200

               W = Wilks' lambda      L = Lawley-Hotelling trace
               P = Pillai's trace     R = Roy's largest root

    Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
-----------+--------------------------------------------------
      prog | W   0.6942      2     8.0   388.0     9.71 0.0000 e
           | P   0.3134            8.0   390.0     9.06 0.0000 a
           | L   0.4296            8.0   386.0    10.36 0.0000 a
           | R   0.4023            4.0   195.0    19.61 0.0000 u
           |--------------------------------------------------
  Residual |               197
-----------+--------------------------------------------------
     Total |               199
--------------------------------------------------------------
               e = exact, a = approximate, u = upper bound on F

simulci read write math science, by(prog) cv(.075)

s=2  m=.5  n=96  cv= .075
group variable: prog

pairwise                          simultaneous
comparison          difference    confidence intervals

dv: read
prog 1 vs prog 2     -6.406      -13.876     1.063
prog 1 vs prog 3      3.556       -3.914    11.025
prog 2 vs prog 3      9.962*       2.492    17.431

dv: write
prog 1 vs prog 2     -4.924      -11.829     1.982
prog 1 vs prog 3      4.573       -2.332    11.479
prog 2 vs prog 3      9.497*       2.592    16.403

dv: math
prog 1 vs prog 2     -6.711*     -13.319    -0.103
prog 1 vs prog 3      3.602       -3.006    10.210
prog 2 vs prog 3     10.313*       3.705    16.921

dv: science
prog 1 vs prog 2     -1.356       -9.000     6.289
prog 1 vs prog 3      5.224       -2.420    12.869
prog 2 vs prog 3      6.580       -1.065    14.225

Factorial Manova Example
use http://www.philender.com/courses/data/hsb2, clear

manova read math science = female prog female#prog

                Number of obs =     200

                W = Wilks' lambda      L = Lawley-Hotelling trace
                P = Pillai's trace     R = Roy's largest root

     Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
------------+--------------------------------------------------
      Model | W   0.6719      5    15.0   530.4     5.48 0.0000 a
            | P   0.3516           15.0   582.0     5.15 0.0000 a
            | L   0.4541           15.0   572.0     5.77 0.0000 a
            | R   0.3665            5.0   194.0    14.22 0.0000 u
            |--------------------------------------------------
   Residual |               194
------------+--------------------------------------------------
     female | W   0.9823      1     3.0   192.0     1.15 0.3283 e
            | P   0.0177            3.0   192.0     1.15 0.3283 e
            | L   0.0180            3.0   192.0     1.15 0.3283 e
            | R   0.0180            3.0   192.0     1.15 0.3283 e
            |--------------------------------------------------
       prog | W   0.7177      2     6.0   384.0    11.55 0.0000 e
            | P   0.2892            6.0   386.0    10.87 0.0000 a
            | L   0.3839            6.0   382.0    12.22 0.0000 a
            | R   0.3573            3.0   193.0    22.99 0.0000 u
            |--------------------------------------------------
female#prog | W   0.9586      2     6.0   384.0     1.37 0.2273 e
            | P   0.0416            6.0   386.0     1.37 0.2268 a
            | L   0.0429            6.0   382.0     1.36 0.2278 a
            | R   0.0353            3.0   193.0     2.27 0.0819 u
            |--------------------------------------------------
   Residual |               194
------------+--------------------------------------------------
      Total |               199
---------------------------------------------------------------
                e = exact, a = approximate, u = upper bound on F

Only the multivariate test of the prog main effect was statistically significant.
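To see which outcomes drive the prog effect, the factorial model can be followed up one outcome at a time, in the same way the forvalues loop was used in the first example. The sketch below is an added illustration whose output is not shown here. It assumes the hsb2 file is still reachable at the URL used above and a Stata version that accepts the female#prog notation.

```stata
* Sketch only: univariate factorial anovas as a follow-up to the manova
use http://www.philender.com/courses/data/hsb2, clear

foreach v of varlist read math science {
    display
    display "anova for `v'"
    display
    * same factorial design as the manova, one outcome at a time
    anova `v' female prog female#prog
}
```

As the first example showed, a significant multivariate test does not guarantee significant univariate tests, and these three anovas do not control the overall error rate across outcomes.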
Linear Statistical Models Course
Phil Ender, 17sep00, 26apr00