Applied Categorical & Nonnormal Data Analysis

Logistic Perfect Prediction

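You might think that predicting a binary response variable perfectly would be a "good thing," but it creates problems in estimating logistic models, as these examples demonstrate. When a predictor (or a combination of predictors) separates the outcomes perfectly, the maximum likelihood estimate of its coefficient is not finite, so the model cannot be estimated in the usual way.

Examples
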
use http://www.gseis.ucla.edu/courses/data/honors, clear



logit honors female, nolog

Logit estimates                                   Number of obs   =        200
                                                  LR chi2(1)      =       3.94
                                                  Prob > chi2     =     0.0473
Log likelihood =  -113.6769                       Pseudo R2       =     0.0170

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .6513707   .3336752     1.95   0.051    -.0026207    1.305362
       _cons |  -1.400088   .2631619    -5.32   0.000    -1.915876   -.8842998
------------------------------------------------------------------------------
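
As a reading aid, the coefficients above can be turned into predicted probabilities with the inverse logit. This is a minimal sketch in Python rather than Stata, so that it runs on its own without the dataset. The helper name `invlogit` is introduced here for illustration only, and the two coefficient values are copied from the output above.

```python
import math

def invlogit(eta):
    """Inverse logit: convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

b_female = 0.6513707   # coefficient on female, copied from the output
b_cons = -1.400088     # _cons, copied from the output

p_male = invlogit(b_cons)                # female == 0
p_female = invlogit(b_cons + b_female)   # female == 1
odds_ratio = math.exp(b_female)          # odds of honors, females vs. males

print(f"P(honors | male)   = {p_male:.4f}")
print(f"P(honors | female) = {p_female:.4f}")
print(f"odds ratio         = {odds_ratio:.4f}")
```

Under this model roughly 20% of males and 32% of females are predicted to be in honors. Neither group is predicted perfectly, so the model estimates without difficulty.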

logit honors female lang, nolog

Logit estimates                                   Number of obs   =        200
                                                  LR chi2(2)      =      60.40
                                                  Prob > chi2     =     0.0000
Log likelihood =  -85.44372                       Pseudo R2       =     0.2612

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   1.120926   .4081028     2.75   0.006      .321059    1.920793
        lang |   .1443657   .0233337     6.19   0.000     .0986325    .1900989
       _cons |  -9.603365   1.426404    -6.73   0.000    -12.39906   -6.807665
------------------------------------------------------------------------------

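* h2 copies honors but is forced to 0 for every male,
* so female == 0 now predicts h2 == 0 (failure) perfectly
generate h2 = honors
replace h2 = 0 if female == 0

logit h2 female, nolog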

note: female != 1 predicts failure perfectly
      female dropped and 91 obs not used


Logit estimates                                   Number of obs   =        109
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood =  -68.41892                       Pseudo R2       =     0.0000

------------------------------------------------------------------------------
          h2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   -.748717   .2051461    -3.65   0.000    -1.150796    -.346638
------------------------------------------------------------------------------    
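
The note above says the 91 observations with female != 1 were not used, which leaves an intercept-only model fitted to the remaining 109. For such a model the constant is simply the log odds of the outcome in the retained sample. The sketch below checks that arithmetic in Python, chosen so that it is self-contained. The count of 35 honors students among the 109 is an assumption inferred from the reported constant, not a figure stated in the output.

```python
import math

n_kept = 109      # observations Stata retained (from the output above)
n_honors = 35     # ASSUMPTION: inferred from _cons, not reported in the output
n_other = n_kept - n_honors

# Intercept-only logit: the constant is the sample log odds.
log_odds = math.log(n_honors / n_other)

# Standard error of a log odds estimate.
std_err = math.sqrt(1 / n_honors + 1 / n_other)

# Log likelihood evaluated at the sample proportion.
p = n_honors / n_kept
log_lik = n_honors * math.log(p) + n_other * math.log(1 - p)

print(f"_cons          = {log_odds:.6f}")
print(f"Std. Err.      = {std_err:.7f}")
print(f"Log likelihood = {log_lik:.5f}")
```

If the assumed count is right, the three printed values should agree with _cons, its standard error, and the log likelihood in the table above to the displayed precision.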

logit h2 female lang, nolog

note: female != 1 predicts failure perfectly
      female dropped and 91 obs not used


Logit estimates                                   Number of obs   =        109
                                                  LR chi2(1)      =      40.59
                                                  Prob > chi2     =     0.0000
Log likelihood = -48.121483                       Pseudo R2       =     0.2967

------------------------------------------------------------------------------
          h2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lang |   .1625339   .0319694     5.08   0.000      .099875    .2251928
       _cons |  -9.476008   1.768424    -5.36   0.000    -12.94206    -6.00996
------------------------------------------------------------------------------


* h3 is determined entirely by lang: 0 below 50, 1 at 50 or above
generate h3 = honors
replace h3 = 0 if lang < 50
replace h3 = 1 if lang >= 50

logit h3 lang, nolog

lang > 48 predicts data perfectly

logit h3 female lang, nolog

lang > 48 predicts data perfectly
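
Here Stata reports no estimates at all, because a single cutoff on lang separates the zeros from the ones completely. One way to see why no maximum likelihood estimate exists is to watch the log likelihood keep rising toward 0 as the slope grows. The sketch below does this in Python with hypothetical toy data (not the honors data). The function name `log_likelihood` and the choice to put the boundary at x = 49 are assumptions made for illustration.

```python
import math

# Hypothetical toy data (NOT the honors data): y is 1 exactly when x >= 50.
x = [35, 42, 47, 48, 50, 55, 61, 68]
y = [0, 0, 0, 0, 1, 1, 1, 1]

def log_likelihood(b0, b1):
    """Bernoulli log likelihood of a logit model with intercept b0 and slope b1."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = b0 + b1 * xi
        # log(p) = -log(1 + exp(-eta)) and log(1 - p) = -log(1 + exp(eta)),
        # so flip the sign of eta when the observed outcome is 0.
        signed = eta if yi == 1 else -eta
        total += -math.log1p(math.exp(-signed))
    return total

# Keep the decision boundary at x = 49 and make the slope steeper each time.
slopes = [0.5, 1, 2, 4, 8]
values = [log_likelihood(-49 * b1, b1) for b1 in slopes]

for b1, ll in zip(slopes, values):
    print(f"slope = {b1:>4}: log likelihood = {ll:.6f}")
```

Each steeper slope gives a log likelihood closer to 0, so there is always a better-fitting coefficient and the iterations would never settle on a finite value.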

Categorical Data Analysis Course
Phil Ender