Chi-Square Formula
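The formula itself does not appear in this copy of the notes. As a hedged reconstruction, the standard Pearson chi-square statistic, which matches the O, E, and (O-E)^2/E columns of the worked examples that follow, is shown here. The symbols r and c (numbers of rows and columns) are notation introduced for this sketch.

```latex
\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E},
\qquad
df = (r - 1)(c - 1)
```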
Marginals
Observed frequencies:
Monday Night Football | Males | Females | Total |
---|---|---|---|
Watch | 65 | 30 | 95 |
Don't Watch | 55 | 50 | 105 |
Total | 120 | 80 | 200 |
Percentages of the grand total:
Monday Night Football | Males | Females | Total |
---|---|---|---|
Watch | 32.5% | 15% | 47.5% |
Don't Watch | 27.5% | 25% | 52.5% |
Total | 60% | 40% | 100% |
Row percentages:
Monday Night Football | Males | Females | Total |
---|---|---|---|
Watch | 68.4% | 31.6% | 100% |
Don't Watch | 52.4% | 47.6% | 100% |
Column percentages:
Monday Night Football | Males | Females |
---|---|---|
Watch | 54.2% | 37.5% |
Don't Watch | 45.8% | 62.5% |
Total | 100% | 100% |
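The marginal totals and the three kinds of percentages shown so far can be recomputed from the four observed counts. The following is a minimal Python sketch, not part of the original notes; the variable and function names are illustrative.

```python
# Observed counts from the Monday Night Football table
# (rows: Watch, Don't Watch; columns: Males, Females).
observed = [[65, 30],
            [55, 50]]

row_totals = [sum(row) for row in observed]        # row marginals
col_totals = [sum(col) for col in zip(*observed)]  # column marginals
n = sum(row_totals)                                # grand total


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Each cell as a percentage of the grand total, its row total, or its column total.
total_pct = [[pct(x, n) for x in row] for row in observed]
row_pct = [[pct(x, rt) for x in row] for row, rt in zip(observed, row_totals)]
col_pct = [[pct(x, ct) for x, ct in zip(row, col_totals)] for row in observed]

print(row_totals, col_totals, n)
print(total_pct)
print(row_pct)
print(col_pct)
```

Row percentages answer "of those who watch, what share are male?", while column percentages answer "of the males, what share watch?", which is why the two tables differ.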
Expected Frequencies Under Independence
Monday Night Football | Males | Females | Total |
---|---|---|---|
Watch | 57 | 38 | 95 |
Don't Watch | 63 | 42 | 105 |
Total | 120 | 80 | 200 |
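Under independence, each expected frequency is (row total × column total) / grand total; for example, 95 × 120 / 200 = 57. A small Python sketch of that rule follows. The helper name `expected_counts` is an assumption made for this illustration.

```python
def expected_counts(observed):
    """Expected cell counts under independence for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # E_ij = (row i total) * (column j total) / (grand total)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


expected = expected_counts([[65, 30], [55, 50]])
print(expected)
```

The expected table has the same marginal totals as the observed table; only the interior cells change.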
Hypotheses
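The statement of the hypotheses does not appear in this copy of the notes. For the chi-square test of independence used in these examples, the usual null and alternative hypotheses can be written as below. The notation (cell probabilities and their marginals) is an assumption made for this sketch, not the author's.

```latex
H_0:\ \pi_{ij} = \pi_{i+}\,\pi_{+j} \ \text{ for all } i, j
\quad \text{(row and column variables are independent)}
\\
H_1:\ \pi_{ij} \neq \pi_{i+}\,\pi_{+j} \ \text{ for at least one cell}
\quad \text{(the variables are associated)}
```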
Example Continued
Observed frequencies, with expected frequencies in parentheses:
Monday Night Football | Males | Females | Total |
---|---|---|---|
Watch | 65 (57) | 30 (38) | 95 |
Don't Watch | 55 (63) | 50 (42) | 105 |
Total | 120 | 80 | 200 |
O | E | O-E | (O-E)^2 | (O-E)^2/E | |
---|---|---|---|---|---|
65 | 57 | 8 | 64 | 1.12 | |
55 | 63 | -8 | 64 | 1.02 | |
30 | 38 | -8 | 64 | 1.68 | |
50 | 42 | 8 | 64 | 1.52 | |
200 | 200 | | Totals | 5.35 | = Chi-square |
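The arithmetic in the table above can be sketched in Python. The function name `pearson_chi2` is illustrative. The p-value line relies on the fact that, for 1 degree of freedom only, the upper-tail chi-square probability equals erfc(sqrt(x/2)).

```python
import math


def pearson_chi2(observed):
    """Return (chi2, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            chi2 += (o - e) ** 2 / e               # (O-E)^2 / E
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = pearson_chi2([[65, 30], [55, 50]])
# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 4), df, round(p_value, 3))
```

Summing the unrounded terms gives about 5.3467; the hand table's 5.35 is the same value rounded to two decimals.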
Using Stata
    tabi 65 30 \ 55 50, exp row col chi2

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

               |          col
           row |         1          2 |     Total
    -----------+----------------------+----------
             1 |        65         30 |        95
               |      57.0       38.0 |      95.0
               |     68.42      31.58 |    100.00
               |     54.17      37.50 |     47.50
    -----------+----------------------+----------
             2 |        55         50 |       105
               |      63.0       42.0 |     105.0
               |     52.38      47.62 |    100.00
               |     45.83      62.50 |     52.50
    -----------+----------------------+----------
         Total |       120         80 |       200
               |     120.0       80.0 |     200.0
               |     60.00      40.00 |    100.00
               |    100.00     100.00 |    100.00

              Pearson chi2(1) =   5.3467   Pr = 0.021

Example 2:
Observed frequencies:
Smoking | SES: High | SES: Middle | SES: Low | Total |
---|---|---|---|---|
Current | 51 | 22 | 43 | 116 |
Former | 92 | 21 | 28 | 141 |
Never | 68 | 9 | 22 | 99 |
Total | 211 | 52 | 93 | 356 |
Column percentages:
Smoking | SES: High | SES: Middle | SES: Low |
---|---|---|---|
Current | 24.2 | 42.3 | 46.2 |
Former | 43.6 | 40.4 | 30.1 |
Never | 32.2 | 17.3 | 23.7 |
Total | 100 | 100 | 100 |
Row percentages:
Smoking | SES: High | SES: Middle | SES: Low | Total |
---|---|---|---|---|
Current | 44 | 19 | 37 | 100 |
Former | 65.3 | 14.9 | 19.9 | 100 |
Never | 68.7 | 9.1 | 22.2 | 100 |
Expected frequencies under independence:
Smoking | SES: High | SES: Middle | SES: Low | Total |
---|---|---|---|---|
Current | 68.75 | 16.94 | 30.30 | 116 |
Former | 83.57 | 20.60 | 36.83 | 141 |
Never | 58.68 | 14.46 | 25.86 | 99 |
Total | 211 | 52 | 93 | 356 |
O | E | O-E | (O-E)^2 | (O-E)^2/E | |
---|---|---|---|---|---|
51 | 68.75 | -17.75 | 315.06 | 4.58 | |
22 | 16.94 | 5.06 | 25.60 | 1.51 | |
43 | 30.30 | 12.70 | 161.29 | 5.32 | |
92 | 83.57 | 8.43 | 71.06 | 0.85 | |
21 | 20.60 | 0.40 | 0.16 | 0.01 | |
28 | 36.83 | -8.83 | 77.97 | 2.12 | |
68 | 58.68 | 9.32 | 86.86 | 1.48 | |
9 | 14.46 | -5.46 | 29.81 | 2.06 | |
22 | 25.86 | -3.86 | 14.90 | 0.58 | |
356 | 356.0 | | Totals | 18.51 | = Chi-square |
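For a larger table the same computation applies cell by cell, with (3-1)(3-1) = 4 degrees of freedom here. The sketch below is a Python cross-check, not part of the original notes. The helper `chi2_upper_tail_even_df` is a hypothetical name, and it uses a closed form that holds only when the degrees of freedom are even.

```python
import math


def pearson_chi2(observed):
    """Return (chi2, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = sum(
        (o - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(observed, row_totals)
        for o, ct in zip(row, col_totals)
    )
    return chi2, (len(row_totals) - 1) * (len(col_totals) - 1)


def chi2_upper_tail_even_df(x, df):
    """P(chi-square > x) for even df, via exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Smoking status (rows: Current, Former, Never) by SES (columns: High, Middle, Low).
smoking_by_ses = [[51, 22, 43],
                  [92, 21, 28],
                  [68, 9, 22]]

chi2, df = pearson_chi2(smoking_by_ses)
p_value = chi2_upper_tail_even_df(chi2, df)
print(round(chi2, 4), df, round(p_value, 3))
```

Because the code keeps full precision in the expected counts, its result can differ in the second decimal from a hand table built from rounded values.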
Using Stata
    tabi 51 22 43 \ 92 21 28 \ 68 9 22, exp row col chi2

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

               |               col
           row |         1          2          3 |     Total
    -----------+---------------------------------+----------
             1 |        51         22         43 |       116
               |      68.8       16.9       30.3 |     116.0
               |     43.97      18.97      37.07 |    100.00
               |     24.17      42.31      46.24 |     32.58
    -----------+---------------------------------+----------
             2 |        92         21         28 |       141
               |      83.6       20.6       36.8 |     141.0
               |     65.25      14.89      19.86 |    100.00
               |     43.60      40.38      30.11 |     39.61
    -----------+---------------------------------+----------
             3 |        68          9         22 |        99
               |      58.7       14.5       25.9 |      99.0
               |     68.69       9.09      22.22 |    100.00
               |     32.23      17.31      23.66 |     27.81
    -----------+---------------------------------+----------
         Total |       211         52         93 |       356
               |     211.0       52.0       93.0 |     356.0
               |     59.27      14.61      26.12 |    100.00
               |    100.00     100.00     100.00 |    100.00

              Pearson chi2(4) =  18.5097   Pr = 0.001
More Using Stata
    use http://www.philender.com/courses/data/hsb2, clear

    list female race ses prog, nolabel

            female   race   ses   prog
      1.         0      4     1      1
      2.         1      4     2      3
      3.         0      4     3      1
      4.         0      4     3      3
      5.         0      4     2      2
      6.         0      4     2      2
      7.         0      3     2      1
      8.         0      1     2      2
      9.         0      4     2      1
     10.         0      3     2      2
     11.         0      4     2      3
     12.         0      4     2      2
     13.         0      4     3      2
     14.         0      4     3      2
     15.         0      3     1      2
     16.         0      4     1      1
     17.         0      4     3      2
     18.         0      4     2      1
     19.         0      4     3      2
     20.         0      4     2      1
     21.         0      4     2      1
     22.         0      4     2      3
     23.         0      3     2      2
     24.         0      1     3      2
     25.         0      1     2      3
     26.         0      3     2      3
     [remainder of output omitted]

    tab1 female race ses prog

    -> tabulation of female

         female |      Freq.     Percent        Cum.
    ------------+-----------------------------------
           male |         91       45.50       45.50
         female |        109       54.50      100.00
    ------------+-----------------------------------
          Total |        200      100.00

    -> tabulation of race

             race |      Freq.     Percent        Cum.
    -------------+-----------------------------------
        hispanic |         24       12.00       12.00
           asian |         11        5.50       17.50
    african-amer |         20       10.00       27.50
           white |        145       72.50      100.00
    -------------+-----------------------------------
           Total |        200      100.00

    -> tabulation of ses

            ses |      Freq.     Percent        Cum.
    ------------+-----------------------------------
            low |         47       23.50       23.50
         middle |         95       47.50       71.00
           high |         58       29.00      100.00
    ------------+-----------------------------------
          Total |        200      100.00

    -> tabulation of prog

        type of |
        program |      Freq.     Percent        Cum.
    ------------+-----------------------------------
        general |         45       22.50       22.50
       academic |        105       52.50       75.00
       vocation |         50       25.00      100.00
    ------------+-----------------------------------
          Total |        200      100.00

    tabulate race ses, exp row col chi2

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

                 |               ses
            race |       low     middle       high |     Total
    -------------+---------------------------------+----------
        hispanic |         9         11          4 |        24
                 |       5.6       11.4        7.0 |      24.0
                 |     37.50      45.83      16.67 |    100.00
                 |     19.15      11.58       6.90 |     12.00
    -------------+---------------------------------+----------
           asian |         3          5          3 |        11
                 |       2.6        5.2        3.2 |      11.0
                 |     27.27      45.45      27.27 |    100.00
                 |      6.38       5.26       5.17 |      5.50
    -------------+---------------------------------+----------
    african-amer |        11          6          3 |        20
                 |       4.7        9.5        5.8 |      20.0
                 |     55.00      30.00      15.00 |    100.00
                 |     23.40       6.32       5.17 |     10.00
    -------------+---------------------------------+----------
           white |        24         73         48 |       145
                 |      34.1       68.9       42.0 |     145.0
                 |     16.55      50.34      33.10 |    100.00
                 |     51.06      76.84      82.76 |     72.50
    -------------+---------------------------------+----------
           Total |        47         95         58 |       200
                 |      47.0       95.0       58.0 |     200.0
                 |     23.50      47.50      29.00 |    100.00
                 |    100.00     100.00     100.00 |    100.00

              Pearson chi2(6) =  18.5160   Pr = 0.005

Note: This analysis has a problem with low expected frequencies: three cells have expected counts below 5 (2.6 and 3.2 in the asian row, 4.7 in the african-amer row). The computed statistic may therefore not be distributed as chi-squared with six degrees of freedom. We can try collapsing the race categories to see whether this raises the expected frequencies.
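One way to screen a table for this problem is to list the cells whose expected frequency falls below some cutoff. The sketch below uses a cutoff of 5, which is a widely used rule of thumb and an assumption of this example rather than something stated in the notes. The counts are copied from the race by ses table.

```python
# Observed counts from the race by ses table (columns: low, middle, high).
race_by_ses = {
    "hispanic":     [9, 11, 4],
    "asian":        [3, 5, 3],
    "african-amer": [11, 6, 3],
    "white":        [24, 73, 48],
}
ses_levels = ["low", "middle", "high"]
CUTOFF = 5  # assumed rule-of-thumb threshold for "low" expected counts

rows = list(race_by_ses.values())
col_totals = [sum(col) for col in zip(*rows)]
n = sum(col_totals)

# Collect every cell whose expected count under independence is below the cutoff.
small_cells = []
for race, counts in race_by_ses.items():
    for level, ct in zip(ses_levels, col_totals):
        e = sum(counts) * ct / n
        if e < CUTOFF:
            small_cells.append((race, level, round(e, 1)))

share_small = len(small_cells) / (len(rows) * len(ses_levels))
print(small_cells)
print(share_small)
```

Here three of the twelve cells are flagged, all in the two smallest race categories, which is what motivates collapsing those categories.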
    generate nonwhite = race~=4

    tab nonwhite

       nonwhite |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |        145       72.50       72.50
              1 |         55       27.50      100.00
    ------------+-----------------------------------
          Total |        200      100.00

    tabulate nonwhite ses, exp row col chi2

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

               |               ses
      nonwhite |       low     middle       high |     Total
    -----------+---------------------------------+----------
             0 |        24         73         48 |       145
               |      34.1       68.9       42.0 |     145.0
               |     16.55      50.34      33.10 |    100.00
               |     51.06      76.84      82.76 |     72.50
    -----------+---------------------------------+----------
             1 |        23         22         10 |        55
               |      12.9       26.1       15.9 |      55.0
               |     41.82      40.00      18.18 |    100.00
               |     48.94      23.16      17.24 |     27.50
    -----------+---------------------------------+----------
         Total |        47         95         58 |       200
               |      47.0       95.0       58.0 |     200.0
               |     23.50      47.50      29.00 |    100.00
               |    100.00     100.00     100.00 |    100.00

              Pearson chi2(2) =  14.7922   Pr = 0.001
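The effect of collapsing can be sketched in Python by adding the three non-white rows of the race by ses table and recomputing the expected counts and the statistic. This is an illustrative cross-check with hypothetical variable names. The p-value line uses the fact that for 2 degrees of freedom the upper-tail chi-square probability is exp(-x/2).

```python
import math

# Rows of the race by ses table (columns: low, middle, high).
hispanic, asian, african_amer, white = [9, 11, 4], [3, 5, 3], [11, 6, 3], [24, 73, 48]

# Collapse the three non-white categories into one row by summing column-wise.
nonwhite = [sum(col) for col in zip(hispanic, asian, african_amer)]
collapsed = [white, nonwhite]

row_totals = [sum(row) for row in collapsed]
col_totals = [sum(col) for col in zip(*collapsed)]
n = sum(row_totals)

expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
min_expected = min(min(row) for row in expected)

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(collapsed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
p_value = math.exp(-chi2 / 2)  # valid for df = 2 only

print(nonwhite, round(min_expected, 1))
print(round(chi2, 4), df, round(p_value, 3))
```

After collapsing, the smallest expected count is about 12.9, so the low-expected-frequency concern no longer applies, at the cost of no longer distinguishing among the non-white groups.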
    tabulate prog ses, exp row col chi2

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

       type of |               ses
       program |       low     middle       high |     Total
    -----------+---------------------------------+----------
       general |        16         20          9 |        45
               |      10.6       21.4       13.1 |      45.0
               |     35.56      44.44      20.00 |    100.00
               |     34.04      21.05      15.52 |     22.50
    -----------+---------------------------------+----------
      academic |        19         44         42 |       105
               |      24.7       49.9       30.4 |     105.0
               |     18.10      41.90      40.00 |    100.00
               |     40.43      46.32      72.41 |     52.50
    -----------+---------------------------------+----------
      vocation |        12         31          7 |        50
               |      11.8       23.8       14.5 |      50.0
               |     24.00      62.00      14.00 |    100.00
               |     25.53      32.63      12.07 |     25.00
    -----------+---------------------------------+----------
         Total |        47         95         58 |       200
               |      47.0       95.0       58.0 |     200.0
               |     23.50      47.50      29.00 |    100.00
               |    100.00     100.00     100.00 |    100.00

              Pearson chi2(4) =  16.6044   Pr = 0.002

    tab2 female ses prog, exp row col chi2

    -> tabulation of female by ses

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

               |               ses
        female |       low     middle       high |     Total
    -----------+---------------------------------+----------
          male |        15         47         29 |        91
               |      21.4       43.2       26.4 |      91.0
               |     16.48      51.65      31.87 |    100.00
               |     31.91      49.47      50.00 |     45.50
    -----------+---------------------------------+----------
        female |        32         48         29 |       109
               |      25.6       51.8       31.6 |     109.0
               |     29.36      44.04      26.61 |    100.00
               |     68.09      50.53      50.00 |     54.50
    -----------+---------------------------------+----------
         Total |        47         95         58 |       200
               |      47.0       95.0       58.0 |     200.0
               |     23.50      47.50      29.00 |    100.00
               |    100.00     100.00     100.00 |    100.00

              Pearson chi2(2) =   4.5765   Pr = 0.101

    -> tabulation of female by prog

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

               |         type of program
        female |   general   academic   vocation |     Total
    -----------+---------------------------------+----------
          male |        21         47         23 |        91
               |      20.5       47.8       22.8 |      91.0
               |     23.08      51.65      25.27 |    100.00
               |     46.67      44.76      46.00 |     45.50
    -----------+---------------------------------+----------
        female |        24         58         27 |       109
               |      24.5       57.2       27.2 |     109.0
               |     22.02      53.21      24.77 |    100.00
               |     53.33      55.24      54.00 |     54.50
    -----------+---------------------------------+----------
         Total |        45        105         50 |       200
               |      45.0      105.0       50.0 |     200.0
               |     22.50      52.50      25.00 |    100.00
               |    100.00     100.00     100.00 |    100.00

              Pearson chi2(2) =   0.0528   Pr = 0.974

    -> tabulation of ses by prog

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    |   row percentage   |
    | column percentage  |
    +--------------------+

               |         type of program
           ses |   general   academic   vocation |     Total
    -----------+---------------------------------+----------
           low |        16         19         12 |        47
               |      10.6       24.7       11.8 |      47.0
               |     34.04      40.43      25.53 |    100.00
               |     35.56      18.10      24.00 |     23.50
    -----------+---------------------------------+----------
        middle |        20         44         31 |        95
               |      21.4       49.9       23.8 |      95.0
               |     21.05      46.32      32.63 |    100.00
               |     44.44      41.90      62.00 |     47.50
    -----------+---------------------------------+----------
          high |         9         42          7 |        58
               |      13.1       30.4       14.5 |      58.0
               |     15.52      72.41      12.07 |    100.00
               |     20.00      40.00      14.00 |     29.00
    -----------+---------------------------------+----------
         Total |        45        105         50 |       200
               |      45.0      105.0       50.0 |     200.0
               |     22.50      52.50      25.00 |    100.00
               |    100.00     100.00     100.00 |    100.00

              Pearson chi2(4) =  16.6044   Pr = 0.002
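The prog by ses and ses by prog tables report the same statistic (16.6044) because the Pearson chi-square does not depend on which variable forms the rows. A short Python sketch, with illustrative names, makes the point by transposing the table.

```python
def pearson_chi2(observed):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (o - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(observed, row_totals)
        for o, ct in zip(row, col_totals)
    )


# Program type (rows: general, academic, vocation) by ses (columns: low, middle, high).
prog_by_ses = [[16, 20, 9],
               [19, 44, 42],
               [12, 31, 7]]

# Transposing swaps the roles of rows and columns.
ses_by_prog = [list(col) for col in zip(*prog_by_ses)]

chi2_prog_by_ses = pearson_chi2(prog_by_ses)
chi2_ses_by_prog = pearson_chi2(ses_by_prog)
print(round(chi2_prog_by_ses, 4), round(chi2_ses_by_prog, 4))
```

Only the row and column percentages change with the orientation; the observed counts, expected counts, statistic, and degrees of freedom are the same either way.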
Small Sample Sizes
It is possible to use Fisher's Exact Test with small samples. Fisher's Exact computes the p-value for the test of independence of the row and column variables. Contrary to the belief of some students, it does not compute an exact value of chi-squared. Here is a small example in Stata using the tabi command.
 | men | women |
---|---|---|
dieting | 1 | 9 |
not dieting | 11 | 3 |
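For a 2x2 table such as this one, Fisher's exact p-values come from the hypergeometric distribution of the top-left cell with all margins held fixed. The following Python sketch illustrates that idea; the function name and the way the two-sided p-value is formed (summing every table no more probable than the observed one) are assumptions of this example, not a description of Stata's internals.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Return (two_sided, lower_tail) p-values for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # given the fixed row and column totals.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)  # smallest feasible top-left count
    hi = min(row1, col1)      # largest feasible top-left count
    p_obs = prob(a)
    # Two-sided: total probability of tables no more likely than the observed one.
    two_sided = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    # One-sided: probability of a top-left count as small as or smaller than observed.
    lower_tail = sum(prob(k) for k in range(lo, a + 1))
    return two_sided, lower_tail


two_sided, lower_tail = fisher_exact_2x2(1, 9, 11, 3)
print(round(two_sided, 3), round(lower_tail, 3))
```

The lower tail is the relevant one-sided direction here because the observed count of dieting men (1) is below its expected value (5). No chi-squared value is computed at any point, which is why the test remains usable when expected frequencies are small.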
    tabi 1 9 \ 11 3, expected exact

    +--------------------+
    | Key                |
    |--------------------|
    |     frequency      |
    | expected frequency |
    +--------------------+

               |          col
           row |         1          2 |     Total
    -----------+----------------------+----------
             1 |         1          9 |        10
               |       5.0        5.0 |      10.0
    -----------+----------------------+----------
             2 |        11          3 |        14
               |       7.0        7.0 |      14.0
    -----------+----------------------+----------
         Total |        12         12 |        24
               |      12.0       12.0 |      24.0

               Fisher's exact =                 0.003
       1-sided Fisher's exact =                 0.001

Other Issues
Phil Ender, 9dec05, 22nov00