Having laid all of the theoretical foundation for logistic regression, we must admit that for many models there is very little practical difference between the OLS results and the logistic regression results. Here is a small example in which OLS and logit lead to essentially the same conclusions.
First Example
use http://www.gseis.ucla.edu/courses/data/honors

regress honors lang female

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   197) =   35.85
       Model |  10.3957196     2  5.19785982           Prob > F      =  0.0000
    Residual |  28.5592804   197  .144970966           R-squared     =  0.2669
-------------+------------------------------           Adj R-squared =  0.2594
       Total |      38.955   199  .195753769           Root MSE      =  .38075

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lang |   .0214989   .0026362     8.16   0.000     .0163001    .0266977
      female |   .1467375    .054142     2.71   0.007     .0399652    .2535098
       _cons |  -.9378584   .1448623    -6.47   0.000    -1.223538   -.6521786
------------------------------------------------------------------------------

predict p1

logit honors lang female, nolog

Logit estimates                                   Number of obs   =        200
                                                  LR chi2(2)      =      60.40
                                                  Prob > chi2     =     0.0000
Log likelihood = -85.44372                        Pseudo R2       =     0.2612

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lang |   .1443657   .0233337     6.19   0.000     .0986325    .1900989
      female |   1.120926   .4081028     2.75   0.006      .321059    1.920793
       _cons |  -9.603365   1.426404    -6.73   0.000    -12.39906   -6.807665
------------------------------------------------------------------------------

predict p2

summarize p1 p2

    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
          p1 |     200        .265   .2285603  -.2713931   .8427939
          p2 |     200        .265   .2408362   .0058933   .9233922

corr p1 p2
(obs=200)

             |       p1       p2
-------------+------------------
          p1 |   1.0000
          p2 |   0.9490   1.0000

list p1 p2 in 1/20

             p1          p2
  1.   .0473354    .0545689
  2.  -.1423999    .0139005
  3.  -.1638987    .0120544
  4.    .283823    .2202598
  5.   .5633085    .6485339
  6.   .0725889    .0563498
  7.  -.0386602    .0313819
  8.   .1763286    .1206826
  9.   .0473354    .0545689
 10.  -.1891523    .0116561
 11.   .3268208    .2738011
 12.   .1800833    .1094523
 13.   .0295912    .0428232
 14.   -.060159    .0272784
 15.   .1548298     .106182
 16.   .2193264    .1548247
 17.     .24458    .1593262
 18.   .2193264    .1548247
 19.   .4343152    .4369393
 20.   .1548298     .106182

/* classification tables */

generate c1 = p1>.5
generate c2 = p2>.5

tabulate c1 c2

           |          c2
        c1 |         0          1 |     Total
-----------+----------------------+----------
         0 |       164          4 |       168
         1 |         0         32 |        32
-----------+----------------------+----------
     Total |       164         36 |       200
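One quick way to see how the two sets of predicted values relate, beyond the 0.9490 correlation, is to plot them against each other. This is only a sketch; the scatter syntax below assumes Stata 8 or later.

scatter p2 p1

Because the logit predictions are a sigmoid transformation of a linear predictor very much like the one underlying p1, the points should track a straight line through the middle of the range and bend toward 0 and 1 at the extremes, which is exactly where p1 wanders out of range.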
Note the out-of-range predictions (negative values) produced by OLS in the example above. The logit predicted probabilities, by contrast, stay within the 0 to 1 range.
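A minimal check, reusing the p1 and p2 variables generated above, is to count the predictions that fall outside the 0 to 1 range (output not shown):

count if p1 < 0 | p1 > 1
count if p2 < 0 | p2 > 1

The first count picks up the negative OLS predictions; the second should be zero, since logit predicted probabilities are bounded between 0 and 1 by construction.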
Next, let's look at a counter example in which OLS and logistic regression produce substantively different results.
Counter Example
use http://www.gseis.ucla.edu/courses/data/apilog, clear

regress hiqual enroll meals avg_ed

      Source |       SS       df       MS              Number of obs =    1149
-------------+------------------------------           F(  3,  1145) =  522.92
       Model |  145.625156     3  48.5417186           Prob > F      =  0.0000
    Residual |  106.287812  1145  .092827783           R-squared     =  0.5781
-------------+------------------------------           Adj R-squared =  0.5770
       Total |  251.912968  1148  .219436383           Root MSE      =  .30468

------------------------------------------------------------------------------
      hiqual |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      enroll |   .0000233   .0000521     0.45   0.654    -.0000789    .0001255
       meals |  -.0077637   .0005286   -14.69   0.000    -.0088009   -.0067266
      avg_ed |   .1697195   .0210965     8.04   0.000     .1283274    .2111116
       _cons |   .2513082    .083063     3.03   0.003     .0883353     .414281
------------------------------------------------------------------------------

predict p1
(51 missing values generated)

logit hiqual enroll meals avg_ed, nolog

Logit estimates                                   Number of obs   =       1149
                                                  LR chi2(3)      =     917.65
                                                  Prob > chi2     =     0.0000
Log likelihood = -265.40191                       Pseudo R2       =     0.6335

------------------------------------------------------------------------------
      hiqual |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      enroll |  -.0019593    .000735    -2.67   0.008    -.0033999   -.0005187
       meals |  -.0785112   .0076189   -10.30   0.000     -.093444   -.0635784
      avg_ed |   2.148565    .299792     7.17   0.000     1.560984    2.736147
       _cons |  -3.302163   1.030206    -3.21   0.001     -5.32133   -1.282996
------------------------------------------------------------------------------

predict p2
(51 missing values generated)

summarize p1 p2

    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
          p1 |    1149    .3246301   .3561617  -.3461998   1.101522
          p2 |    1149    .3246301   .3848081    .000036   .9986064

corr p1 p2
(obs=1149)

             |       p1       p2
-------------+------------------
          p1 |   1.0000
          p2 |   0.9256   1.0000

list p1 p2 in 1/20

             p1          p2
  1.   .0686489    .0087564
  2.  -.2424376    .0002072
  3.   .4535227     .348036
  4.   .5269313    .4345559
  5.   .9753819    .9948776
  6.  -.0621362    .0028704
  7.  -.0452434    .0019718
  8.   .4313931    .2918692
  9.   .2042937    .0150685
 10.   .7003989    .8609163
 11.          .           .
 12.  -.0974959    .0009283
 13.   .5642942    .6845942
 14.   .1766382    .0204579
 15.   .4774918    .3685073
 16.  -.1551605    .0002245
 17.   .8235874    .9654739
 18.   .0041572    .0047186
 19.  -.0109633    .0019114
 20.  -.1417836    .0005978

/* classification tables */

generate c1 = p1>.5 if p1~=.
generate c2 = p2>.5 if p1~=.

tabulate c1 c2

           |          c2
        c1 |         0          1 |     Total
-----------+----------------------+----------
         0 |       750          0 |       750
         1 |        30        369 |       399
-----------+----------------------+----------
     Total |       780        369 |      1149
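The lower-left cell of the table holds the 30 schools that OLS classifies as high quality but logit does not. A quick sketch for inspecting those cases, reusing the variables created above (output not shown):

count if c1==1 & c2==0
list p1 p2 if c1==1 & c2==0

By construction, these are cases where the OLS prediction exceeds .5 while the logit probability does not, so here the two models disagree on classification, not just on the scale of the coefficients.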
Categorical Data Analysis Course
Phil Ender