A Very Small Example
Level | b1 | b2 |
a1 | 3 6 | 4 5 |
a2 | 1 2 | 2 |
A  B  Y  X1  X2  X3
1  1  3   1   1   1
1  1  6   1   1   1
1  2  4   1  -1  -1
1  2  5   1  -1  -1
2  1  1  -1   1  -1
2  1  2  -1   1  -1
2  2  2  -1  -1   1
Check the orthogonality of the coded vectors X1, X2 and X3 in this very small example. As you can see, because the cell sizes are unequal, our usual algorithm for orthogonal coding does not yield a pairwise orthogonal system for the A main effect, the B main effect and the A*B interaction. This implies that the sums of squares for the three effects overlap and cannot be estimated independently of one another.
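One way to carry out this check is to form the column sums and the pairwise cross-products of the coded vectors. The sketch below does this in plain Python rather than Stata. The helper name dot and the dictionaries sums and cross are illustrative names introduced here, not part of the notes.

```python
# Coded vectors for the seven observations in the very small example.
# X1 codes A, X2 codes B, and X3 = X1 * X2 codes the A*B interaction.
x1 = [1, 1, 1, 1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1]
x3 = [a * b for a, b in zip(x1, x2)]


def dot(u, v):
    """Sum of cross-products of two equal-length columns."""
    return sum(p * q for p, q in zip(u, v))


columns = {"X1": x1, "X2": x2, "X3": x3}
names = list(columns)

# A column sum of zero means the vector is orthogonal to the constant.
sums = {name: sum(col) for name, col in columns.items()}

# A cross-product of zero means the two vectors are orthogonal.
cross = {
    (names[i], names[j]): dot(columns[names[i]], columns[names[j]])
    for i in range(len(names))
    for j in range(i + 1, len(names))
}

print("column sums:   ", sums)
print("cross-products:", cross)
```

For these seven observations every column sum and every cross-product is nonzero, so no pair of vectors is orthogonal.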
Estimation of Sums of Squares: Two Factor Design
anova y a b a*b
Type 1 | Type 2 | Type 3 |
SS(A) | SS(A|B) | SS(A|B, A*B) |
SS(B|A) | SS(B|A) | SS(B|A, A*B) |
SS(A*B|A, B) | SS(A*B|A, B) | SS(A*B|A, B) |
Estimation of Sums of Squares: Three Factor Design
anova y a b c a*b a*c b*c a*b*c
Type 1 | Type 2 | Type 3 |
SS(A) | SS(A|B,C) | SS(A|B,C,A*B,A*C,B*C,A*B*C) |
SS(B|A) | SS(B|A,C) | SS(B|A,C,A*B,A*C,B*C,A*B*C) |
SS(C|A,B) | SS(C|A,B) | SS(C|A,B,A*B,A*C,B*C,A*B*C) |
SS(A*B|A,B,C) | SS(A*B|A,B,C,A*C,B*C) | SS(A*B|A,B,C,A*C,B*C,A*B*C) |
SS(A*C|A,B,C,A*B) | SS(A*C|A,B,C,A*B,B*C) | SS(A*C|A,B,C,A*B,B*C,A*B*C) |
SS(B*C|A,B,C,A*B,A*C) | SS(B*C|A,B,C,A*B,A*C) | SS(B*C|A,B,C,A*B,A*C,A*B*C) |
SS(A*B*C|A,B,C,A*B,A*C,B*C) | SS(A*B*C|A,B,C,A*B,A*C,B*C) | SS(A*B*C|A,B,C,A*B,A*C,B*C) |
Schematic with Example Data
Level | b1 | b2 | b3 | b4 |
a1 | 3 6 3 | 4 5 4 3 3 | 7 8 7 6 | 7 8 9 8 |
a2 | 1 2 2 2 | 2 3 4 3 | 5 6 5 6 | 10 10 9 11 |
Three ANOVA Summary Tables
Type 1     |      SS   df       MS      F
A          |   3.125    1    3.125   4.04
B|A        | 193.931    3   64.644  83.64
A*B|A, B   |  19.894    3    6.631   8.58
Error      |  18.550   24     0.77
Total      | 235.500   31

Type 2     |      SS   df       MS      F
A|B        |   2.707    1    2.707   3.50
B|A        | 193.931    3   64.644  83.64
A*B|A, B   |  19.894    3    6.631   8.58
Error      |  18.550   24     0.77
Total      | 235.500   31

Type 3     |      SS   df       MS      F
A|B, A*B   |   3.199    1    3.199   4.14
B|A, A*B   | 188.726    3   62.909  81.39
A*B|A, B   |  19.894    3    6.631   8.58
Error      |  18.550   24     0.77
Total      | 235.500   31

Using Stata
input y a b x1 x2 x3 x4
 3 1 1  1  1  1  1
 6 1 1  1  1  1  1
 3 1 1  1  1  1  1
 1 2 1 -1  1  1  1
 2 2 1 -1  1  1  1
 2 2 1 -1  1  1  1
 2 2 1 -1  1  1  1
 4 1 2  1 -1  1  1
 5 1 2  1 -1  1  1
 4 1 2  1 -1  1  1
 3 1 2  1 -1  1  1
 3 1 2  1 -1  1  1
 2 2 2 -1 -1  1  1
 3 2 2 -1 -1  1  1
 4 2 2 -1 -1  1  1
 3 2 2 -1 -1  1  1
 7 1 3  1  0 -2  1
 8 1 3  1  0 -2  1
 7 1 3  1  0 -2  1
 6 1 3  1  0 -2  1
 5 2 3 -1  0 -2  1
 6 2 3 -1  0 -2  1
 5 2 3 -1  0 -2  1
 6 2 3 -1  0 -2  1
 7 1 4  1  0  0 -3
 8 1 4  1  0  0 -3
 9 1 4  1  0  0 -3
 8 1 4  1  0  0 -3
10 2 4 -1  0  0 -3
10 2 4 -1  0  0 -3
 9 2 4 -1  0  0 -3
11 2 4 -1  0  0 -3
end

/* Type 1 SS; order a b a#b */

anova y a b a#b, sequential

                 Number of obs =      32     R-squared     =  0.9212
                 Root MSE      = .879157     Adj R-squared =  0.8983

     Source |    Seq. SS    df        MS          F    Prob > F
 -----------+----------------------------------------------------
      Model |     216.95     7   30.9928571    40.10     0.0000
            |
          a |      3.125     1        3.125     4.04     0.0557
          b |    193.931     3   64.6436667    83.64     0.0000
        a#b |     19.894     3   6.63133333     8.58     0.0005
            |
   Residual |      18.55    24   .772916667
 -----------+----------------------------------------------------
      Total |      235.5    31   7.59677419

/* Type 1 SS; order b a a#b */

anova y b a a#b, sequential

                 Number of obs =      32     R-squared     =  0.9212
                 Root MSE      = .879157     Adj R-squared =  0.8983

     Source |    Seq. SS    df        MS          F    Prob > F
 -----------+----------------------------------------------------
      Model |     216.95     7   30.9928571    40.10     0.0000
            |
          b | 194.349206     3   64.7830688    83.82     0.0000
          a | 2.70679365     1   2.70679365     3.50     0.0735
        a#b |     19.894     3   6.63133333     8.58     0.0005
            |
   Residual |      18.55    24   .772916667
 -----------+----------------------------------------------------
      Total |      235.5    31   7.59677419

/* Type 2 SS; constructed from two previous analyses */

                 Number of obs =      32     R-squared     =  0.9212
                 Root MSE      = .879157     Adj R-squared =  0.8983

     Source |    Seq. SS    df        MS          F    Prob > F
 -----------+----------------------------------------------------
      Model |     216.95     7   30.9928571    40.10     0.0000
            |
          a | 2.70679365     1   2.70679365     3.50     0.0735
          b |    193.931     3   64.6436667    83.64     0.0000
        a*b |     19.894     3   6.63133333     8.58     0.0005
            |
   Residual |      18.55    24   .772916667
 -----------+----------------------------------------------------
      Total |     235.50    31   7.59677419

/* Type 3 SS */

anova y a b a#b

                 Number of obs =      32     R-squared     =  0.9212
                 Root MSE      = .879157     Adj R-squared =  0.8983

     Source | Partial SS    df        MS          F    Prob > F
 -----------+----------------------------------------------------
      Model |     216.95     7   30.9928571    40.10     0.0000
            |
          a | 3.19795082     1   3.19795082     4.14     0.0531
          b |    188.726     3   62.9086667    81.39     0.0000
        a#b |     19.894     3   6.63133333     8.58     0.0005
            |
   Residual |      18.55    24   .772916667
 -----------+----------------------------------------------------
      Total |      235.5    31   7.59677419

generate x5=x1*x2
generate x6=x1*x3
generate x7=x1*x4

regress y x1 x2 x3 x4 x5 x6 x7  /* Model: M0 */

  Source |       SS       df       MS            Number of obs =      32
---------+------------------------------         F(  7,    24) =   40.10
   Model |     216.95      7  30.9928571         Prob > F      =  0.0000
Residual |      18.55     24  .772916667         R-squared     =  0.9212
---------+------------------------------         Adj R-squared =  0.8983
   Total |     235.50     31  7.59677419         Root MSE      =  .87916

[remainder of output omitted]

regress y x1  /* Model: M1 */

  Source |       SS       df       MS            Number of obs =      32
---------+------------------------------         F(  1,    30) =    0.40
   Model |      3.125      1       3.125         Prob > F      =  0.5301
Residual |    232.375     30  7.74583333         R-squared     =  0.0133
---------+------------------------------         Adj R-squared = -0.0196
   Total |     235.50     31  7.59677419         Root MSE      =  2.7831

[remainder of output omitted]

regress y x2 x3 x4  /* Model: M2 */

  Source |       SS       df       MS            Number of obs =      32
---------+------------------------------         F(  3,    28) =   44.08
   Model | 194.349206      3  64.7830688         Prob > F      =  0.0000
Residual | 41.1507937     28   1.4696712         R-squared     =  0.8253
---------+------------------------------         Adj R-squared =  0.8065
   Total |     235.50     31  7.59677419         Root MSE      =  1.2123

[remainder of output omitted]

regress y x5 x6 x7  /* Model: M3 */

  Source |       SS       df       MS            Number of obs =      32
---------+------------------------------         F(  3,    28) =    1.07
   Model | 24.2569444      3  8.08564815         Prob > F      =  0.3770
Residual | 211.243056     28  7.54439484         R-squared     =  0.1030
---------+------------------------------         Adj R-squared =  0.0069
   Total |     235.50     31  7.59677419         Root MSE      =  2.7467

[remainder of output omitted]

regress y x1 x2 x3 x4  /* Model: M4 */

  Source |       SS       df       MS            Number of obs =      32
---------+------------------------------         F(  4,    27) =   34.60
   Model |    197.056      4      49.264         Prob > F      =  0.0000
Residual |     38.444     27  1.42385185         R-squared     =  0.8368
---------+------------------------------         Adj R-squared =  0.8126
   Total |     235.50     31  7.59677419         Root MSE      =  1.1933

[remainder of output omitted]

regress y x1 x5 x6 x7  /* Model: M5 */

      Source |       SS       df       MS        Number of obs =      32
-------------+------------------------------     F(  4,    27) =    0.92
       Model |     28.224      4       7.056     Prob > F      =  0.4672
    Residual |    207.276     27  7.67688889     R-squared     =  0.1198
-------------+------------------------------     Adj R-squared = -0.0105
       Total |     235.50     31  7.59677419     Root MSE      =  2.7707

[remainder of output omitted]

regress y x2 x3 x4 x5 x6 x7  /* Model: M6 */

  Source |       SS       df       MS            Number of obs =      32
---------+------------------------------         F(  6,    25) =   40.95
   Model | 213.752049      6  35.6253415         Prob > F      =  0.0000
Residual | 21.7479508     25  .869918033         R-squared     =  0.9077
---------+------------------------------         Adj R-squared =  0.8855
   Total |     235.50     31  7.59677419         Root MSE      =  .93269

[remainder of output omitted]

Sums of Squares Summary

Model: M0   SSA:B:A*B   216.950   (Full Model)
Model: M1   SSA           3.125   (A Main Effect)
Model: M2   SSB         194.349   (B Main Effect)
Model: M3   SSA*B        24.257   (A*B Interaction)
Model: M4   SSA:B       197.056   (A, B Main Effects)
Model: M5   SSA:A*B      28.224   (A, A*B)
Model: M6   SSB:A*B     213.752   (B, A*B)

Computing the Three Types of Sums of Squares

Type 1 Sums of Squares (sequential)
SSA       = 3.125                                (from M1)   [SSA]
SSB|A     = 197.056 - 3.125   = 193.931          (M4 - M1)   [SSA:B - SSA]
SSA*B|A,B = 216.950 - 197.056 = 19.894           (M0 - M4)   [SSA:B:A*B - SSA:B]

Type 2 Sums of Squares
SSA|B     = 197.056 - 194.349 = 2.707            (M4 - M2)   [SSA:B - SSB]
SSB|A     = 197.056 - 3.125   = 193.931          (M4 - M1)   [SSA:B - SSA]
SSA*B|A,B = 216.950 - 197.056 = 19.894           (M0 - M4)   [SSA:B:A*B - SSA:B]

Type 3 Sums of Squares
SSA|B,A*B = 216.950 - 213.752 = 3.198            (M0 - M6)   [SSA:B:A*B - SSB:A*B]
SSB|A,A*B = 216.950 - 28.224  = 188.726          (M0 - M5)   [SSA:B:A*B - SSA:A*B]
SSA*B|A,B = 216.950 - 197.056 = 19.894           (M0 - M4)   [SSA:B:A*B - SSA:B]

Using Stata: Continued
m0: regress y x1 x2 x3 x4 x5 x6 x7
m1: regress y x1
m2: regress y x2 x3 x4
m3: regress y x5 x6 x7
m4: regress y x1 x2 x3 x4
m5: regress y x1 x5 x6 x7
m6: regress y x2 x3 x4 x5 x6 x7

Summary of Regression Results

Model: M0   R-square 0.9212   (Full Model)
Model: M1   R-square 0.0133   (A Main Effect)
Model: M2   R-square 0.8253   (B Main Effect)
Model: M3   R-square 0.1030   (A*B Interaction)
Model: M4   R-square 0.8368   (A, B Main Effects)
Model: M5   R-square 0.1198   (A, A*B)
Model: M6   R-square 0.9077   (B, A*B)
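The seven regressions can be cross-checked outside Stata. What follows is a minimal sketch in plain Python (standard library only), assuming the same coding of x1 through x7 as in the input data. The names cells, residual_ss, model_ss, type1, type2 and type3 are illustrative and do not come from the notes, and the exact-fraction least-squares solver is just one convenient way to fit the models.

```python
from fractions import Fraction

# Example data keyed by (a, b) cell, as entered in the Stata input block.
cells = {
    (1, 1): [3, 6, 3],    (1, 2): [4, 5, 4, 3, 3],
    (1, 3): [7, 8, 7, 6], (1, 4): [7, 8, 9, 8],
    (2, 1): [1, 2, 2, 2], (2, 2): [2, 3, 4, 3],
    (2, 3): [5, 6, 5, 6], (2, 4): [10, 10, 9, 11],
}
a_code = {1: [1], 2: [-1]}                                            # x1
b_code = {1: [1, 1, 1], 2: [-1, 1, 1], 3: [0, -2, 1], 4: [0, 0, -3]}  # x2-x4

y, rows = [], []
for (a, b), values in cells.items():
    x = a_code[a] + b_code[b]
    x = x + [x[0] * xb for xb in b_code[b]]   # x5-x7 = x1 * (x2, x3, x4)
    for value in values:
        y.append(Fraction(value))
        rows.append(x)


def residual_ss(cols):
    """Residual SS from regressing y on a constant plus the listed x columns."""
    X = [[Fraction(1)] + [Fraction(r[c - 1]) for c in cols] for r in rows]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    m = [[sum(xi[i] * xi[j] for xi in X) for j in range(p)]
         + [sum(xi[i] * yi for xi, yi in zip(X, y))] for i in range(p)]
    for i in range(p):
        piv = next(r for r in range(i, p) if m[r][i] != 0)
        m[i], m[piv] = m[piv], m[i]
        m[i] = [v / m[i][i] for v in m[i]]
        for r in range(p):
            if r != i:
                m[r] = [vr - m[r][i] * vi for vr, vi in zip(m[r], m[i])]
    beta = [m[i][p] for i in range(p)]
    return sum((yi - sum(bk * xk for bk, xk in zip(beta, xi))) ** 2
               for xi, yi in zip(X, y))


ss_total = residual_ss([])   # constant-only model gives the total SS
models = {
    "M0": [1, 2, 3, 4, 5, 6, 7], "M1": [1], "M2": [2, 3, 4], "M3": [5, 6, 7],
    "M4": [1, 2, 3, 4], "M5": [1, 5, 6, 7], "M6": [2, 3, 4, 5, 6, 7],
}
model_ss = {name: float(ss_total - residual_ss(cols))
            for name, cols in models.items()}
r_squared = {name: ss / float(ss_total) for name, ss in model_ss.items()}

ab = model_ss["M0"] - model_ss["M4"]          # A*B|A,B is the same in all types
type1 = {"A": model_ss["M1"],
         "B|A": model_ss["M4"] - model_ss["M1"], "A*B|A,B": ab}
type2 = {"A|B": model_ss["M4"] - model_ss["M2"],
         "B|A": model_ss["M4"] - model_ss["M1"], "A*B|A,B": ab}
type3 = {"A|B,A*B": model_ss["M0"] - model_ss["M6"],
         "B|A,A*B": model_ss["M0"] - model_ss["M5"], "A*B|A,B": ab}

for name in models:
    print(f"{name}: model SS = {model_ss[name]:8.3f}  R-square = {r_squared[name]:.4f}")
for label, table in (("Type 1", type1), ("Type 2", type2), ("Type 3", type3)):
    print(label, {effect: round(ss, 3) for effect, ss in table.items()})
```

Exact fractions are used so that rounding in the solver cannot be mistaken for a difference between the three types of sums of squares.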
Computing F-ratios from Regression
F-ratio numerator for A*B|A, B = (R2y.x1-x7 - R2y.x1-x4)/(k1 - k2)
F-ratio denominator for all fixed effects = (1 - R2y.x1-x7)/(N - k1 - 1)
    (.9212 - .8368)/(7 - 4)
F = ------------------------ = 8.57
    (1 - .9212)/(32 - 7 - 1)
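The R-squared form of the F-ratio is the familiar sums-of-squares form with numerator and denominator each divided by the total sum of squares. The sketch below writes this out; the subscripts "full" and "reduced" are labels introduced here for the larger and smaller of the two models being compared, and the numbers are the model and residual sums of squares reported for M0 and M4.

```latex
F = \frac{(R^2_{\text{full}} - R^2_{\text{reduced}})/(k_{\text{full}} - k_{\text{reduced}})}
         {(1 - R^2_{\text{full}})/(N - k_{\text{full}} - 1)}
  = \frac{(SS_{\text{full}} - SS_{\text{reduced}})/(k_{\text{full}} - k_{\text{reduced}})}
         {SS_{\text{residual}}/(N - k_{\text{full}} - 1)},
\qquad R^2 = \frac{SS_{\text{model}}}{SS_{\text{total}}}

F_{A*B|A,B} = \frac{(216.950 - 197.056)/(7 - 4)}{18.550/(32 - 7 - 1)}
            = \frac{6.631}{0.773} \approx 8.58
```

The sums-of-squares version gives 8.58, as in the ANOVA summary tables. The R-squared version gives 8.57 only because the R-squared values are rounded to four decimal places before the division.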
F-ratio numerator for A = R2y.x1/k
    .0133/1
F = ------------------------ = 4.05
    (1 - .9212)/(32 - 7 - 1)
F-ratio numerator for B|A = (R2y.x1-x4 - R2y.x1)/(k2 - k)
    (.8368 - .0133)/(4 - 1)
F = ------------------------ = 83.60
    (1 - .9212)/(32 - 7 - 1)
F-ratio numerator for A|B = (R2y.x1-x4 - R2y.x2-x4)/(k2 - k)
    (.8368 - .8253)/(4 - 3)
F = ------------------------ = 3.50
    (1 - .9212)/(32 - 7 - 1)
F-ratio numerator for B|A = (R2y.x1-x4 - R2y.x1)/(k2 - k)
    (.8368 - .0133)/(4 - 1)
F = ------------------------ = 83.60
    (1 - .9212)/(32 - 7 - 1)
F-ratio numerator for A|B, A*B = (R2y.x1-x7 - R2y.x2-x7)/(k1 - k)
    (.9212 - .9077)/(7 - 6)
F = ------------------------ = 4.11
    (1 - .9212)/(32 - 7 - 1)
F-ratio numerator for B|A, A*B = (R2y.x1-x7 - R2y.x1,x5-x7)/(k1 - k)
    (.9212 - .1198)/(7 - 4)
F = ------------------------ = 81.36
    (1 - .9212)/(32 - 7 - 1)
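The hand calculations above all follow one pattern, so they can be collected into a single helper. The following is a small Python sketch, assuming the four-decimal R-square values from the regression summary. The function name f_ratio and the dictionaries r2, k and f_values are illustrative names introduced here.

```python
# R-square values as reported (rounded to four decimals) for each model,
# and the number of predictors k in each model.
r2 = {"M0": 0.9212, "M1": 0.0133, "M2": 0.8253,
      "M4": 0.8368, "M5": 0.1198, "M6": 0.9077}
k = {"M0": 7, "M1": 1, "M2": 3, "M4": 4, "M5": 4, "M6": 6}
N = 32


def f_ratio(full, reduced=None):
    """F for the predictors in `full` that are not in `reduced`.

    The denominator always comes from the full model M0, as in the notes.
    With no reduced model, the numerator is simply R-square / k.
    """
    r2_reduced = r2[reduced] if reduced else 0.0
    k_reduced = k[reduced] if reduced else 0
    numerator = (r2[full] - r2_reduced) / (k[full] - k_reduced)
    denominator = (1 - r2["M0"]) / (N - k["M0"] - 1)
    return numerator / denominator


f_values = {
    "A*B|A,B": f_ratio("M0", "M4"),   # all three types
    "A": f_ratio("M1"),               # Type 1
    "B|A": f_ratio("M4", "M1"),       # Type 1 and Type 2
    "A|B": f_ratio("M4", "M2"),       # Type 2
    "A|B,A*B": f_ratio("M0", "M6"),   # Type 3
    "B|A,A*B": f_ratio("M0", "M5"),   # Type 3
}

for effect, f in f_values.items():
    print(f"{effect:10s} F = {f:.2f}")
```

Because the inputs are rounded R-square values, these F-ratios can differ in the second decimal place from the F values in the ANOVA summary tables, which Stata computes from unrounded sums of squares.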
Linear Statistical Models Course
Phil Ender, 17sep10, 30May00