MANOVA

Linear Statistical Models

Completely Randomized MANOVA Design

Updated for Stata 11

It's a multivariate world after all...

There are many situations in which you will have access to more than one outcome variable. In those situations you have three options:

1. Run a separate univariate analysis of variance (ANOVA) for each dependent variable
2. Create a linear combination of the variables and run a single analysis of variance
3. Run a multivariate analysis of variance (MANOVA)
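
As a rough sketch of the second option, assuming the example data used later on this page (y1, y2, y3, and grp) are already in memory, one could build a composite score and analyze it with a single ANOVA. The variable name ysum and the equal weights are illustrative assumptions only; any other set of weights could be substituted.

```stata
* Option 2 sketch: collapse the three outcomes into one composite score.
* ysum and the unit weights are hypothetical choices for illustration.
generate ysum = y1 + y2 + y3

* A single univariate ANOVA on the composite
anova ysum grp
```

Equal weights are only sensible when the variables share a common scale; otherwise each variable could be standardized first (for example with egen's std() function) before summing.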

The MANOVA Linear Model

y1 y2 y3 = constant + grp + error
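
Written out for a single observation, the model might be expressed as follows. The symbols are notation introduced here for illustration and do not come from Stata: y_ij is the vector of the three outcomes for observation i in group j, mu is the vector of constants, alpha_j is the vector of effects for group j, and epsilon_ij is the error vector with covariance matrix Sigma.

```latex
\mathbf{y}_{ij} = \boldsymbol{\mu} + \boldsymbol{\alpha}_j + \boldsymbol{\varepsilon}_{ij},
\qquad
\mathbf{y}_{ij} = \begin{pmatrix} y_{1ij} \\ y_{2ij} \\ y_{3ij} \end{pmatrix},
\qquad
\boldsymbol{\varepsilon}_{ij} \sim N_3(\mathbf{0}, \boldsymbol{\Sigma})
```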

What is different in this model from the univariate model? What elements are the same?

Hypotheses

Treatment Effect
H₀: the population centroids are equal for all groups
H₁: the population centroids are not equal for at least two groups
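
In symbols, with mu_j denoting the population centroid of group j (its vector of means on y1, y2, and y3), the hypotheses for a three-group design could be written as below. The notation is an assumption added for illustration.

```latex
H_0:\ \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \boldsymbol{\mu}_3
\qquad
H_1:\ \boldsymbol{\mu}_j \neq \boldsymbol{\mu}_{j'} \ \text{for at least one pair } j \neq j'
```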

Assumptions

1. The sets of observations are independent of one another.
2. The variables within each group come from multivariate normal populations.
3. The variance-covariance matrices for each group are equal in the population.
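
Assumptions 2 and 3 can be examined informally with Stata's mvtest commands (available from Stata 11). The sketch below assumes the example data from this page (y1, y2, y3, grp) are in memory; it is one possible set of checks, not a complete diagnostic procedure.

```stata
* Assumption 3: Box's M test of equal covariance matrices across groups
mvtest covariances y1 y2 y3, by(grp)

* Assumption 2: multivariate normality tests, run separately within each group
forvalues g = 1/3 {
  display
  display "normality tests for grp `g'"
  mvtest normality y1 y2 y3 if grp == `g', stats(all)
}
```

With only 11 observations per group these tests have little power, and Box's M is itself sensitive to non-normality, so the results are best read alongside the group summaries rather than as a strict pass or fail.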

Schematic with Example Data

                        Level
     Group 1          Group 2          Group 3
  y11  y21  y31    y11  y21  y31    y11  y21  y31
  ...  ...  ...    ...  ...  ...    ...  ...  ...
  y1n  y2n  y3n    y1n  y2n  y3n    y1n  y2n  y3n

Stata Computer Example

input y1 y2 y3 grp
19.6  5.15  9.5 1
15.4  5.75  9.1 1
22.3  4.35  3.3 1
24.3  7.55  5.0 1
22.5  8.50  6.0 1
20.5 10.25  5.0 1
14.1  5.95 18.8 1
13.0  6.30 16.5 1
14.1  5.45  8.9 1
16.7  3.75  6.0 1
16.8  5.10  7.4 1
17.1  9.00  7.5 2
15.7  5.30  8.5 2
14.9  9.85  6.0 2
19.7  3.60  2.9 2
17.2  4.05  0.2 2
16.0  4.40  2.6 2
12.8  7.15  7.0 2
13.6  7.25  3.2 2
14.2  5.30  6.2 2
13.1  3.10  5.5 2
16.5  2.40  6.6 2
16.0  4.55  2.9 3
12.5  2.65  0.7 3
18.5  6.50  5.3 3
19.2  4.85  8.3 3
12.0  8.75  9.0 3
13.0  5.20 10.3 3
11.9  4.75  8.5 3
12.0  5.85  9.5 3
19.8  2.85  2.3 3
16.5  6.55  3.3 3
17.4  6.60  1.9 3
end
 
sort grp
 
by grp: summarize y1 y2 y3

-> grp=        1  
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    18.11818   3.903797         13       24.3  
      y2 |      11    6.190909   1.899713       3.75      10.25  
      y3 |      11    8.681818   4.863089        3.3       18.8  

-> grp=        2  
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    15.52727   2.075616       12.8       19.7  
      y2 |      11    5.581818   2.434263        2.4       9.85  
      y3 |      11    5.109091   2.531187         .2        8.5  

-> grp=        3  
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    15.34545   3.138268       11.9       19.8  
      y2 |      11    5.372727   1.759029       2.65       8.75  
      y3 |      11    5.636364   3.546907         .7       10.3 

 
manova y1 y2 y3 = grp

                           Number of obs =      33

                           W = Wilks' lambda      L = Lawley-Hotelling trace
                           P = Pillai's trace     R = Roy's largest root

                  Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
              -----------+--------------------------------------------------
                     grp | W   0.5258      2     6.0    56.0     3.54 0.0049 e
                         | P   0.4767            6.0    58.0     3.02 0.0122 a
                         | L   0.8972            6.0    54.0     4.04 0.0021 a
                         | R   0.8920            3.0    29.0     8.62 0.0003 u
                         |--------------------------------------------------
                Residual |                30
              -----------+--------------------------------------------------
                   Total |                32
              --------------------------------------------------------------
                           e = exact, a = approximate, u = upper bound on F
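
The four statistics in this table are all functions of the eigenvalues lambda_1 >= lambda_2 >= ... >= lambda_s of E^{-1}H, where H is the hypothesis (between-groups) sums-of-squares-and-cross-products matrix and E is the residual one. The notation is added here for illustration; note that Stata reports Roy's largest root as the largest eigenvalue itself.

```latex
W = \prod_{i=1}^{s} \frac{1}{1+\lambda_i}, \qquad
P = \sum_{i=1}^{s} \frac{\lambda_i}{1+\lambda_i}, \qquad
L = \sum_{i=1}^{s} \lambda_i, \qquad
R = \lambda_1
```

In the output above s = 2, so lambda_1 = R = 0.8920 and lambda_2 = L - R = 0.0052. These values give W = 1/(1.8920 x 1.0052), or about 0.526, and P = 0.8920/1.8920 + 0.0052/1.0052, or about 0.477, which agree with the reported values to within rounding.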
 
forvalues i=1/3 {
  display
  display "anova for y`i'"
  display
  anova y`i' grp
}

anova for y1


                           Number of obs =      33     R-squared     =  0.1526
                           Root MSE      = 3.13031     Adj R-squared =  0.0961

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  52.9242378     2  26.4621189       2.70     0.0835
                         |
                     grp |  52.9242378     2  26.4621189       2.70     0.0835
                         |
                Residual |  293.965442    30  9.79884808   
              -----------+----------------------------------------------------
                   Total |   346.88968    32  10.8403025   

anova for y2


                           Number of obs =      33     R-squared     =  0.0305
                           Root MSE      = 2.05173     Adj R-squared = -0.0341

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  3.97515121     2   1.9875756       0.47     0.6282
                         |
                     grp |  3.97515121     2   1.9875756       0.47     0.6282
                         |
                Residual |  126.287277    30  4.20957589   
              -----------+----------------------------------------------------
                   Total |  130.262428    32  4.07070087   

anova for y3


                           Number of obs =      33     R-squared     =  0.1610
                           Root MSE      = 3.76993     Adj R-squared =  0.1051

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  81.8296936     2  40.9148468       2.88     0.0718
                         |
                     grp |  81.8296936     2  40.9148468       2.88     0.0718
                         |
                Residual |  426.370896    30  14.2123632   
              -----------+----------------------------------------------------
                   Total |   508.20059    32  15.8812684

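Interestingly, all four multivariate tests were significant (for example, Wilks' lambda = 0.5258, p = 0.0049), but none of the univariate F-ratios were (p = 0.0835, 0.6282, and 0.0718 for y1, y2, and y3). A better tool for examining the multivariate effect is a set of simultaneous confidence intervals.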

simulci y1 y2 y3, by(grp) cv(.31)

s=2  m=0  n=13 cv= .31

group variable:   grp

                                    pairwise simultaneous
comparison           difference      confidence intervals
dv: y1
grp 1 vs grp 2        2.591*        0.292         4.889
grp 1 vs grp 3        2.773*        0.474         5.071
grp 2 vs grp 3        0.182        -2.117         2.480

dv: y2
grp 1 vs grp 2        0.609        -0.897         2.116
grp 1 vs grp 3        0.818        -0.688         2.325
grp 2 vs grp 3        0.209        -1.297         1.716

dv: y3
grp 1 vs grp 2        3.573*        0.805         6.341
grp 1 vs grp 3        3.045*        0.277         5.814
grp 2 vs grp 3       -0.527        -3.295         2.241

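These results show that y1 and y3 differ significantly between groups 1 and 2 and between groups 1 and 3 (the starred differences, whose intervals exclude zero). None of the y2 comparisons, and none of the group 2 versus group 3 comparisons, is significant.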
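
The line "s=2  m=0  n=13" in the simulci output matches the standard parameters that index tables of the greatest-characteristic-root distribution. A minimal sketch of that arithmetic follows, assuming p dependent variables, k groups, and N total observations; the local macro names are chosen here for illustration.

```stata
* Parameters that index the greatest characteristic root distribution
* p = number of dependent variables, k = number of groups, N = total sample size
local p = 3
local k = 3
local N = 33

local s = min(`p', `k' - 1)
local m = (abs(`p' - (`k' - 1)) - 1) / 2
local n = (`N' - `k' - `p' - 1) / 2

display "s = `s'  m = `m'  n = `n'"
```

With p = 4, k = 3, and N = 200 the same formulas give s = 2, m = .5, and n = 96, the values shown in the HSB2 example below. The cv() value is presumably a critical value read from such a table for the chosen alpha level; that reading is an assumption, since simulci is not documented on this page.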

Example Using HSB2

use http://www.philender.com/courses/data/hsb2, clear

manova read write math science = prog

                           Number of obs =     200

                           W = Wilks' lambda      L = Lawley-Hotelling trace
                           P = Pillai's trace     R = Roy's largest root

                  Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
              -----------+--------------------------------------------------
                    prog | W   0.6942      2     8.0   388.0     9.71 0.0000 e
                         | P   0.3134            8.0   390.0     9.06 0.0000 a
                         | L   0.4296            8.0   386.0    10.36 0.0000 a
                         | R   0.4023            4.0   195.0    19.61 0.0000 u
                         |--------------------------------------------------
                Residual |               197
              -----------+--------------------------------------------------
                   Total |               199
              --------------------------------------------------------------
                           e = exact, a = approximate, u = upper bound on F
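
To see where these statistics come from, the stored results of manova can be inspected directly. The sketch below assumes it is run immediately after the manova command above and relies on the stored matrices e(E) (residual SSCP) and e(H_m) (hypothesis SSCP for the overall model); the matrix names E, H, T, P, and L are arbitrary.

```stata
* Run immediately after: manova read write math science = prog
matrix E = e(E)
matrix H = e(H_m)
matrix T = H + E

* Wilks' lambda is the ratio of determinants |E| / |H + E|
display "Wilks' lambda          = " %6.4f det(E)/det(T)

* Pillai's trace is trace[H (H + E)^-1]
matrix P = H * inv(T)
display "Pillai's trace         = " %6.4f trace(P)

* Lawley-Hotelling trace is trace[E^-1 H]
matrix L = inv(E) * H
display "Lawley-Hotelling trace = " %6.4f trace(L)
```

The three displayed values should reproduce the W, P, and L rows of the table above.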
 
simulci read write math science, by(prog) cv(.075)

s=2  m=.5  n=96 cv= .075

group variable:   prog

                                    pairwise simultaneous
comparison           difference      confidence intervals
dv: read
prog 1 vs prog 2       -6.406       -13.876         1.063
prog 1 vs prog 3        3.556        -3.914        11.025
prog 2 vs prog 3        9.962*        2.492        17.431

dv: write
prog 1 vs prog 2       -4.924       -11.829         1.982
prog 1 vs prog 3        4.573        -2.332        11.479
prog 2 vs prog 3        9.497*        2.592        16.403

dv: math
prog 1 vs prog 2       -6.711*      -13.319        -0.103
prog 1 vs prog 3        3.602        -3.006        10.210
prog 2 vs prog 3       10.313*        3.705        16.921

dv: science
prog 1 vs prog 2       -1.356        -9.000         6.289
prog 1 vs prog 3        5.224        -2.420        12.869
prog 2 vs prog 3        6.580        -1.065        14.225

Factorial MANOVA Example

use http://www.philender.com/courses/data/hsb2, clear

manova read math science = female prog female#prog

                           Number of obs =     200

                           W = Wilks' lambda      L = Lawley-Hotelling trace
                           P = Pillai's trace     R = Roy's largest root

                  Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
             ------------+--------------------------------------------------
                   Model | W   0.6719      5    15.0   530.4     5.48 0.0000 a
                         | P   0.3516           15.0   582.0     5.15 0.0000 a
                         | L   0.4541           15.0   572.0     5.77 0.0000 a
                         | R   0.3665            5.0   194.0    14.22 0.0000 u
                         |--------------------------------------------------
                Residual |               194
             ------------+--------------------------------------------------
                  female | W   0.9823      1     3.0   192.0     1.15 0.3283 e
                         | P   0.0177            3.0   192.0     1.15 0.3283 e
                         | L   0.0180            3.0   192.0     1.15 0.3283 e
                         | R   0.0180            3.0   192.0     1.15 0.3283 e
                         |--------------------------------------------------
                    prog | W   0.7177      2     6.0   384.0    11.55 0.0000 e
                         | P   0.2892            6.0   386.0    10.87 0.0000 a
                         | L   0.3839            6.0   382.0    12.22 0.0000 a
                         | R   0.3573            3.0   193.0    22.99 0.0000 u
                         |--------------------------------------------------
             female#prog | W   0.9586      2     6.0   384.0     1.37 0.2273 e
                         | P   0.0416            6.0   386.0     1.37 0.2268 a
                         | L   0.0429            6.0   382.0     1.36 0.2278 a
                         | R   0.0353            3.0   193.0     2.27 0.0819 u
                         |--------------------------------------------------
                Residual |               194
             ------------+--------------------------------------------------
                   Total |               199
             ---------------------------------------------------------------
                           e = exact, a = approximate, u = upper bound on F

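Of the three effects, only the prog main effect was statistically significant on the multivariate tests (Wilks' lambda = 0.7177, p < 0.0001). Neither the female main effect (p = 0.3283) nor the female#prog interaction (Wilks' lambda p = 0.2273) was significant.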
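
Two small additions may help when reading the factorial model. The sketch below, assuming the hsb2 data are still in memory, first tabulates the cell sizes and the cell means on read for the 2 x 3 design, and then refits the model with the ## factor-variable shorthand, which expands to both main effects plus their interaction and so should reproduce the table above.

```stata
* Cell means, standard deviations, and counts of read in the female-by-prog design
tabulate female prog, summarize(read)

* Same factorial model written with the ## shorthand
manova read math science = female##prog
```

Swapping read for math or science in the summarize() option gives the corresponding cell means for the other two dependent variables.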

Linear Statistical Models Course

Phil Ender, 17sep00, 26apr00
