Linear Statistical Models

The Anova Model Versus the Regression Model

Update for Stata 11

The linear models for anova and regression look a little bit different from each other. How might they be related? Let's begin by looking at each model.

Linear Model for Anova

Y_ij = μ + α_j + ε_i(j)

Linear Model for Regression

Y_i = b₀ + b₁X + e_i

First off, since both models describe the same outcome Y, their right-hand sides must be equal to each other. Thus,

b₀ + b₁X + e_i = μ + α_j + ε_i(j)

Now, let's look at a specific example using the hsb2 dataset. We will use write as the outcome variable and female as the predictor variable, giving a two-group anova.

use http://www.philender.com/courses/data/hsb2, clear

tabstat write, by(female) stat(n mean sd)

Summary for variables: write
     by categories of: female 

female |         N      mean        sd
-------+------------------------------
  male |        91  50.12088  10.30516
female |       109  54.99083  8.133715
-------+------------------------------
 Total |       200    52.775  9.478586
--------------------------------------

anova write female

                           Number of obs =     200     R-squared     =  0.0658
                           Root MSE      =  9.1846     Adj R-squared =  0.0611

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  1176.21384     1  1176.21384      13.94     0.0002
                         |
                  female |  1176.21384     1  1176.21384      13.94     0.0002
                         |
                Residual |  16702.6612   198  84.3568745   
              -----------+----------------------------------------------------
                   Total |   17878.875   199   89.843593 

predict e1, resid

In the anova model, μ is the grand mean, which equals 52.775. The α_j's are the treatment effects for being in the jth group: the mean of group j minus the grand mean. In this example the treatment effects are:

50.12088 - 52.775 = -2.65412 for males

54.99083 - 52.775 = 2.21583 for females.
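The same subtraction can be done inside Stata from the stored means. This is only a sketch: it assumes hsb2 is still in memory, and the scalar names grand, alpha_male and alpha_female are made up for illustration.

```stata
* Grand mean of write (mu in the anova model)
quietly summarize write
scalar grand = r(mean)

* Treatment effect for males: group mean minus grand mean
quietly summarize write if female == 0
scalar alpha_male = r(mean) - scalar(grand)

* Treatment effect for females
quietly summarize write if female == 1
scalar alpha_female = r(mean) - scalar(grand)

display "alpha (males)   = " scalar(alpha_male)
display "alpha (females) = " scalar(alpha_female)

* Weighted by group size, the effects sum to zero (up to rounding error)
display "91*alpha_m + 109*alpha_f = " 91*scalar(alpha_male) + 109*scalar(alpha_female)
```

Working from r(mean) rather than the printed table avoids retyping rounded values.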

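Next, let's run a regression using a manually generated orthogonal coding, in which each group is coded with the sample size of the other group: 91 for the 109 females and -109 for the 91 males.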

generate oc = 91 if female==1
replace oc=-109 if female==0

tab oc

         oc |      Freq.     Percent        Cum.
------------+-----------------------------------
       -109 |         91       45.50       45.50
         91 |        109       54.50      100.00
------------+-----------------------------------
      Total |        200      100.00
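A quick check that may help here: with these codes the predictor sums to zero over the 200 observations (109 × 91 + 91 × (-109) = 0). In a one-predictor regression the constant is the mean of Y minus b₁ times the mean of X, so a predictor with mean zero forces the constant to equal the mean of Y. The sketch below assumes oc has been generated as above.

```stata
* The weighted codes should have mean (and sum) zero
quietly summarize oc
display "mean of oc = " r(mean)
display "sum of oc  = " r(sum)
```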

regress write oc

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   198) =   13.94
       Model |  1176.21384     1  1176.21384           Prob > F      =  0.0002
    Residual |  16702.6612   198  84.3568745           R-squared     =  0.0658
-------------+------------------------------           Adj R-squared =  0.0611
       Total |   17878.875   199   89.843593           Root MSE      =  9.1846

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          oc |   .0243497    .006521     3.73   0.000     .0114903    .0372092
       _cons |     52.775   .6494493    81.26   0.000     51.49427    54.05573
------------------------------------------------------------------------------

You can see that the constant in this model is equal to the grand mean; therefore,

μ = b₀

predict e2, resid

Because b₀ in the regression model is equal to μ in the anova model, those two terms cancel, which lets us simplify the equation above.

b₁X + e_i = α_j + ε_i(j)

Next, let's compare the residuals from each model.

summarize e1 e2

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          e1 |       200    3.90e-08    9.161494  -19.99083   16.87912
          e2 |       200    3.90e-08    9.161494  -19.99083   16.87912

compare e1 e2

                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
e1=e2                         200
                       ----------
jointly defined               200             0            0           0
                       ----------
total                         200

Since e_i equals ε_i(j), we can again simplify the equation, leaving us with

b₁X = α_j

We can now obtain the treatment effects from b₁ and X; they match the values computed from the group means, up to rounding of the displayed coefficient:

.0243497 * -109 = -2.65412 for males

.0243497 * 91 = 2.215823 for females
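The products can also be computed from the stored coefficient rather than the rounded value in the table. This sketch assumes that regress write oc is still the most recent estimation command, so that _b[oc] refers to its slope.

```stata
* Treatment effects as b1 * X, using the full-precision slope
display "males:   " _b[oc] * (-109)
display "females: " _b[oc] * 91
```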

Please note that the equivalence between b₁X and α_j does not hold for dummy coding. For the dummy coded model it is true that,

b₀ + b₁X = μ + α_j

However, b₀ does not equal μ; it equals the mean of the group coded zero (males). Likewise, b₁X does not equal the treatment effect: b₁ is the difference between the mean of the group coded one (females) and the mean of the group coded zero, so b₁X is zero for males and equals that full difference for females.
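To see this with the same data, the dummy coded model can be fit directly on female, which is already coded 0/1. This run is not part of the original session, so treat it as a sketch. From the tabstat means, the constant should come out near 50.12088 (the male mean) and the slope near 54.99083 - 50.12088 = 4.86995.

```stata
* Dummy coded regression: female is 0 for males, 1 for females
regress write female

* b0 is the mean of the group coded zero, not the grand mean
display "b0      = " _b[_cons]

* b1 is the female mean minus the male mean
display "b1      = " _b[female]

* b0 + b1 reproduces the female mean
display "b0 + b1 = " _b[_cons] + _b[female]
```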

Phil Ender, 17sep10, 31dec04