Education 231C

Applied Categorical & Nonnormal Data Analysis

Ordered Logit & Probit Models


When the response variable is ordinal and has more than two levels, researchers have a choice between ordered logistic regression (ordered logit) and ordered probit models. The latent variable approach to an ordered variable can be represented like this, with the cut points τ1, τ2 and τ3 dividing the latent variable y* into the four observed categories of y.

      -inf                                                +inf
      <-----+-----------+--------------------------+--------->  y*
      <  1  |     2     |           3              |    4    >  y      
            τ1         τ2                         τ3
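
The diagram can be written out in the usual latent-variable notation. The block below is a sketch in conventional textbook notation, not a reproduction of the course's own equations: the symbols x_i, β, ε_i, F, N and J are assumptions, as is the sign convention τ_j - x_iβ (the one that goes with a model that has no constant and estimates all the cut points). In the diagram J = 4.

```latex
\begin{align*}
% Measurement rule: y is the category whose interval contains y*
y_i &= j \quad \text{if } \tau_{j-1} \le y_i^{*} < \tau_j,
      \qquad j = 1,\dots,J,\quad \tau_0 = -\infty,\ \tau_J = +\infty \\[4pt]
% Structural model for the latent variable
y_i^{*} &= x_i\beta + \varepsilon_i \\[4pt]
% Probabilities, with F the cdf of the error (logistic for ologit, normal for oprobit)
\Pr(y_i \le j \mid x_i) &= F(\tau_j - x_i\beta) \\
\Pr(y_i = j \mid x_i) &= F(\tau_j - x_i\beta) - F(\tau_{j-1} - x_i\beta) \\[4pt]
% Odds (ordered logit only)
\frac{\Pr(y_i \le j \mid x_i)}{\Pr(y_i > j \mid x_i)} &= \exp(\tau_j - x_i\beta) \\[4pt]
% Log likelihood
\ln L &= \sum_{i=1}^{N}\sum_{j=1}^{J} 1[y_i = j]\,
         \ln\!\bigl[F(\tau_j - x_i\beta) - F(\tau_{j-1} - x_i\beta)\bigr]
\end{align*}
```
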
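
A rule relates the latent observations y* to our ordinal response variable y: y takes the value j whenever y* falls between the cut points τ(j-1) and τ(j). The structural model expresses y* as a linear function of the predictors plus an error term. From these two pieces the model can be expressed in terms of probabilities and then in terms of odds, and the probabilities in turn define the log likelihood function for ordered logistic regression.

Example 1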

Let's begin our examination of ordered logistic regression using the honors dataset with the binary response variable honors composition (honors). We begin with an ordinary logistic regression.

Next, we will run the ordered logistic regression command, ologit, for the same model. We see that the values of the coefficients are the same, except that the sign for _cut1 is reversed relative to the constant. We will explain shortly what _cut1 is, although it is already clear that it is related to the constant found in the logistic regression model.

Example 2

For our next example we will select ses as the response variable from the dataset hsb2. Ses has three ordered categories. Here are the frequencies for each of the categories.

We can also obtain much of the same information using the codebook command.

For a predictor variable we will create a dummy variable, academic, which indicates whether or not students are in an academic program.

Here is the ordered logistic model predicting ses using academic.

The format of these results may seem confusing at first. What isn't clear from the output is that ordered logistic regression is a multiequation model. In this example, there are two equations, each with the same logistic coefficients. This is known as the proportional odds model. Other logistic regression models, which do not assume proportional odds, will have k-1 equations, each with its own constant and coefficients.

In our example, the results are formatted like a single equation model when, in fact, there are two equations in the model because there are three levels of ses. In ordered logistic regression, Stata sets the constant to zero and estimates the cut points for separating the various levels of the response variable. Other programs may parameterize the model differently by estimating the constant and setting the first cut point to zero.
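
The two parameterizations describe the same model, which a small numerical sketch can make concrete. The code below is an illustration only: the function names, the cut point values, and the linear predictor value are hypothetical and are not taken from any output in these notes. It shifts the cut points so that the first one is zero, moves the shift into a constant, and compares the category probabilities.

```python
import math

def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(xb, constant, cuts):
    """P(y = j) for an ordered logit with linear predictor constant + xb."""
    eta = constant + xb
    # Cumulative probabilities P(y <= j); the last category closes at 1.
    cumulative = [logistic_cdf(c - eta) for c in cuts] + [1.0]
    probs, previous = [], 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs

# Parameterization A: constant fixed at zero, all cut points estimated.
# (Hypothetical values for a three-category response.)
const_a = 0.0
cuts_a = [-0.76, 1.41]

# Parameterization B: first cut point fixed at zero, constant estimated.
const_b = -cuts_a[0]
cuts_b = [c - cuts_a[0] for c in cuts_a]

xb = 0.93  # hypothetical value of the predictor part x*beta

probs_a = category_probs(xb, const_a, cuts_a)
probs_b = category_probs(xb, const_b, cuts_b)
print(probs_a)
print(probs_b)
```

Only the differences between the cut points and the linear predictor enter the probabilities, so the two sets of estimates carry the same information.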

SAS formats ordered logit models in a similar manner.

Data Set                      WORK.OLOG       
Response Variable             ses             
Number of Response Levels     3               
Number of Observations        200             
Link Function                 Logit           
Optimization Technique        Fisher's scoring


          Response Profile
 
 Ordered                      Total
   Value          ses     Frequency

       1            1            47
       2            2            95
       3            3            58


                    Model Convergence Status

         Convergence criterion (GCONV=1E-8) satisfied.          


Score Test for the Proportional Odds Assumption
 
Chi-Square       DF     Pr > ChiSq

    2.0046        1         0.1568


         Model Fit Statistics
 
                              Intercept
               Intercept         and   
Criterion        Only        Covariates

AIC              425.165        415.330
SC               431.762        425.225
-2 Log L         421.165        409.330


        Testing Global Null Hypothesis: BETA=0
 
Test                 Chi-Square       DF     Pr > ChiSq

Likelihood Ratio        11.8350        1         0.0006
Score                   11.6374        1         0.0006
Wald                    11.4526        1         0.0007

              Analysis of Maximum Likelihood Estimates
 
                                Standard
Parameter     DF    Estimate       Error    Chi-Square    Pr > ChiSq

Intercept      1     -0.7643      0.2072       13.6032        0.0002
Intercept2     1      1.4146      0.2282       38.4156        <.0001
academic       1     -0.9299      0.2748       11.4526        0.0007


            Odds Ratio Estimates
                      
               Point          95% Wald
Effect      Estimate      Confidence Limits

academic       0.395       0.230       0.676


Association of Predicted Probabilities and Observed Responses

Percent Concordant     35.8    Somers' D    0.203
Percent Discordant     15.6    Gamma        0.394
Percent Tied           48.6    Tau-a        0.129
Pairs                 12701    c            0.601
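
Several numbers in the SAS output can be recomputed from others in the same listing, which helps in reading it. The sketch below uses only values printed above; the parameter counts (2 for the intercept-only model, 3 with academic), the 1.96 normal quantile for the Wald limits, and the one-degree-of-freedom chi-square identity p = erfc(sqrt(x/2)) are assumptions added here for illustration.

```python
import math

n_obs = 200
neg2ll_null = 421.165   # -2 Log L, intercepts only
neg2ll_full = 409.330   # -2 Log L, intercepts and academic
k_null, k_full = 2, 3   # assumed parameter counts: two intercepts (+ academic)

# Likelihood ratio chi-square is the drop in -2 Log L.
lr_chi2 = neg2ll_null - neg2ll_full

# AIC = -2 Log L + 2k ; SC (Schwarz criterion) = -2 Log L + k * ln(n)
aic_null = neg2ll_null + 2 * k_null
aic_full = neg2ll_full + 2 * k_full
sc_null = neg2ll_null + k_null * math.log(n_obs)
sc_full = neg2ll_full + k_full * math.log(n_obs)

def chi2_1df_pvalue(x):
    """Upper-tail p-value of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

# Score test for the proportional odds assumption.
score_p = chi2_1df_pvalue(2.0046)

# Odds ratio and 95% Wald limits for academic.
b, se = -0.9299, 0.2748
odds_ratio = math.exp(b)
ci_low = math.exp(b - 1.96 * se)
ci_high = math.exp(b + 1.96 * se)

print(f"LR chi-square {lr_chi2:.3f}")
print(f"AIC {aic_null:.3f} {aic_full:.3f}  SC {sc_null:.3f} {sc_full:.3f}")
print(f"score test p {score_p:.4f}")
print(f"odds ratio {odds_ratio:.3f} ({ci_low:.3f}, {ci_high:.3f})")
```

The odds ratio is below one because SAS, by default, models the cumulative probability of the lower ordered values, so academic lowers the odds of being in a lower ses category.
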

With ordered logistic regression there are other possible estimation procedures that do not involve the proportional odds assumption. Use the brant command (findit brant -- one of the Long & Freese utilities) to test the proportional odds assumption.

These results suggest that the proportional odds approach is reasonable. If the test of proportionality had been significant, we could have tried the gologit program by Vincent Kang Fu from UCLA [now at the University of Utah] (findit gologit). gologit, which stands for generalized ordered logit, does not assume proportional odds. Let's try it just for "fun."

These results clearly show the multiple equation nature of ordered logistic regression with different constants and coefficients.

The gologit command provides us with an alternative method for testing the proportionality assumption. If the assumption of proportional odds is tenable then there should not be a significant difference between the coefficients for academic in the two equations. The test command computes a Wald test across the two equations.

The results of the Wald test of proportionality are very similar to those of the earlier test of the proportional odds assumption.

Let's rerun the ologit command followed by the listcoef and fitstat commands.

From the listcoef output, we see that the odds ratio for academic is approximately 2.5, which means that the odds of being in high ses versus medium and low ses are 2.5 times greater for students in the academic program. Because of the proportional odds assumption, the same odds ratio also applies to the comparison of medium and high ses versus low ses.
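
The claim that one odds ratio serves both comparisons can be checked numerically. The sketch below assumes that the ologit estimates are the SAS estimates shown earlier with the sign of the academic coefficient reversed (cut points -0.7643 and 1.4146, coefficient 0.9299); that mapping is an assumption made here, and the function names are hypothetical.

```python
import math

def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic."""
    return 1.0 / (1.0 + math.exp(-z))

# Assumed ologit-style estimates (SAS intercepts as cut points,
# academic coefficient with its sign reversed).
cut1, cut2 = -0.7643, 1.4146
b_academic = 0.9299

def ses_probs(academic):
    """Predicted probabilities of low, middle and high ses."""
    xb = b_academic * academic
    p_le_low = logistic_cdf(cut1 - xb)   # P(ses <= low)
    p_le_mid = logistic_cdf(cut2 - xb)   # P(ses <= middle)
    return {"low": p_le_low,
            "middle": p_le_mid - p_le_low,
            "high": 1.0 - p_le_mid}

def odds(p):
    return p / (1.0 - p)

p_general = ses_probs(0)
p_academic = ses_probs(1)

# High versus (middle, low)
or_high = odds(p_academic["high"]) / odds(p_general["high"])
# (High, middle) versus low
or_mid_high = (odds(p_academic["middle"] + p_academic["high"])
               / odds(p_general["middle"] + p_general["high"]))

print(p_general)
print(p_academic)
print(or_high, or_mid_high)
```

Both ratios equal exp(0.9299), which is what the proportional odds model imposes: the predictor shifts every cumulative logit by the same amount.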

Example 3

This example makes use of the dataset apcomp.dta. The variable apcomp contains the advanced placement composition score. Although AP scores can run from one to five, our sample has no observations lower than two. Many colleges require a minimum score of three in order to count the AP course, while some colleges require a minimum of four. The other variables in the file are female (1 if female), honors (1 if enrolled in any honors courses), and standardized test scores for reading, writing and logic (normed with mean=50 and sd=10).

In ordered logistic regression, Stata sets the constant to zero and estimates the cut points for separating the various levels of the response variable. Other programs parameterize the model differently by estimating the constant and setting the first cut point to zero.

Remember that ordered logistic regression is a multiequation model. In this example, there are three equations, each with the same coefficients. This is a result of using the proportional odds model. Other logistic regression models, which do not assume proportional odds, will have k-1 equations, each with its own constant and coefficients.

Let's compare the results of the ordered logit with an ordered probit analysis.

The ordered probit is quite similar to the ordered logit with the ordered logit coefficients being scaled about 1.7 times larger. Notice that the z-tests and p-values are quite similar.
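
The factor of about 1.7 reflects how closely a rescaled logistic distribution tracks the standard normal. The following sketch is a rough numerical check of that rule of thumb, not part of the course output; the grid of evaluation points is an arbitrary choice, and the ratio seen in any particular dataset will differ somewhat from 1.7.

```python
import math

def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

scale = 1.7  # rule-of-thumb ratio of logit to probit coefficients

# Compare the two curves on a grid from -5 to 5 in steps of 0.01.
grid = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(logistic_cdf(scale * z) - normal_cdf(z)) for z in grid)

print(f"largest difference between the two curves: {max_gap:.4f}")
```

With this scaling the two curves differ by roughly one percentage point at most, which is why the z-tests and p-values from the two models tend to agree.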

In fact, the results and interpretation of ordered logit and probit are so similar that we will focus on the ordered logit, both because it is a bit more common and because the exponentiated coefficients in ordered logistic regression have a useful interpretation.

Now back to the ordered logit example.

Since the _hatsq variable is not statistically significant this model passes the link test.

Next, we will check the proportional odds assumption using the brant command (findit brant).

The chi-square test of proportional odds is not significant, suggesting that the proportional odds assumption holds for this model.

If we had found that the proportional odds assumption was not being met we could use the gologit command (findit gologit). We will go ahead and demonstrate gologit again even though it isn't needed.

At this point it might be interesting to run the model using multinomial logistic regression to see how the coefficients differ when the information concerning the ordering of the categories is ignored. mlogit models the four levels of apcomp but does not consider the order to be relevant.

Example 4

This example from Richard Williams (Notre Dame) will allow us to investigate the use of the gologit2 command (findit gologit2). gologit2 allows for several different options for relaxing the proportional odds assumption for all or a selected subset of the predictors.

gologit2 was written by Richard Williams (2005).

Okay, we know that the proportional odds assumption does not hold for this model. And we know further that the variables yr89 and male are the major offenders along with possibly age. So, we will run three different gologit2 models, saving information on each one to compare them.

Because Model 3 is significantly different from Model 1 and not significantly different from Model 2, we will go with Model 3, in which the proportionality assumption holds for all variables except for yr89 and male. There is no need to relax the proportionality assumption for age.

Finally, we will rerun the last model using the gamma parameterization.

gologit2, gamma

(output omitted)

Alternative parameterization: Gammas are deviations from proportionality
------------------------------------------------------------------------------
        warm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Beta         |
        yr89 |     .98368   .1530091     6.43   0.000     .6837876    1.283572
        male |  -.3328209   .1275129    -2.61   0.009    -.5827417   -.0829002
       white |  -.3832583   .1184635    -3.24   0.001    -.6154424   -.1510742
         age |  -.0216325   .0024751    -8.74   0.000    -.0264835   -.0167814
          ed |   .0670703   .0161311     4.16   0.000     .0354539    .0986866
        prst |   .0059146   .0033158     1.78   0.074    -.0005843    .0124135
-------------+----------------------------------------------------------------
Gamma_2      |
        yr89 |   -.449311   .1465627    -3.07   0.002    -.7365686   -.1620533
        male |  -.3604562   .1233732    -2.92   0.003    -.6022633   -.1186492
-------------+----------------------------------------------------------------
Gamma_3      |
        yr89 |  -.6578702   .1768034    -3.72   0.000    -1.004399   -.3113418
        male |  -.7647937   .1631536    -4.69   0.000    -1.084569   -.4450186
-------------+----------------------------------------------------------------
Alpha        |
     _cons_1 |    2.12173   .2467146     8.60   0.000     1.638178    2.605282
     _cons_2 |   .6021625   .2358361     2.55   0.011     .1399323    1.064393
     _cons_3 |  -1.048137   .2393568    -4.38   0.000    -1.517268   -.5790061
------------------------------------------------------------------------------
The alternative gamma parameterization presents an equivalent parameterization of the gologit model, called the unconstrained partial proportional odds model. The model has one ordered logistic coefficient, beta, for each predictor, M-2 gamma coefficients representing deviations from proportionality (where M equals the number of categories in the response variable), and M-1 alpha coefficients reflecting the cut points.

The gamma_2 value for yr89 (-.449311) is added to beta (.98368), yielding the value for the coefficient in equation D above (.534369 = .98368 - .449311). The same process, using the gamma_3 value, is used to get the coefficient for yr89 in equation A above (.3258098 = .98368 - .6578702).
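
The same arithmetic can be applied to every variable whose proportionality constraint was relaxed. The sketch below uses the Beta and Gamma values from the table above; the dictionary layout, the helper name, and the numbering of the equations as 1 to 3 are choices made here for illustration.

```python
# Beta and Gamma estimates copied from the gamma-parameterized output.
beta = {"yr89": 0.98368, "male": -0.3328209}
gamma = {
    2: {"yr89": -0.449311, "male": -0.3604562},
    3: {"yr89": -0.6578702, "male": -0.7647937},
}

def equation_coef(variable, equation):
    """Coefficient of a variable in a given equation: beta plus the
    gamma deviation for that equation (zero for the first equation)."""
    return beta[variable] + gamma.get(equation, {}).get(variable, 0.0)

coefs = {
    variable: [equation_coef(variable, eq) for eq in (1, 2, 3)]
    for variable in beta
}

for variable, values in coefs.items():
    print(variable, [round(v, 7) for v in values])
```

Variables without gamma terms (white, age, ed and prst here) keep the single beta coefficient in all three equations.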

This gamma parameterization combines the best of the traditional ologit output while allowing for nonproportionality in some or all of the variables in the model.


Categorical Data Analysis Course

Phil Ender -- 7mar06, 12may05