Education 231C

Applied Categorical & Nonnormal Data Analysis

Binary Panel Data


Introduction

The binary response variable was created from data from the 1996 study by Gregoire, Kumar, Everitt, Henderson & Studd on the efficacy of estrogen patches in treating postnatal depression. Women were randomly assigned to either a placebo control group (group=0, n=27) or an estrogen patch group (group=1, n=34). Prior to the first treatment all patients took the Edinburgh Postnatal Depression Scale (EPDS). Once treatment began, EPDS data were collected monthly for six months. Depression scores greater than or equal to 11 were coded as 1 (depressed); scores below 11 were coded as 0.

use http://www.ats.ucla.edu/stat/stata/stat130/depres01
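
The coding rule for the binary response can be sketched as follows. This is only an illustration: epds is a hypothetical name for the raw EPDS score (the downloaded file already contains depressd), and depressd2 is a made-up name chosen so that nothing in the data is overwritten.

```stata
* Sketch only: "epds" is a hypothetical raw-score variable.
* Stata treats missing values as larger than any number, so the
* if qualifier keeps missing scores from being coded as 1.
generate byte depressd2 = (epds >= 11) if !missing(epds)

* Check the coding against visit
tabulate depressd2 visit
```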

Let the analyses begin

table visit group, cont(mean depressd)

------------------------------
          |       group       
    visit |        0         1
----------+-------------------
        1 | .8518519  .6176471
        2 | .8181818   .516129
        3 | .7058824  .2758621
        4 | .6470588  .2857143
        5 | .5882353  .2142857
        6 | .4705882  .1071429
------------------------------
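
The cont() option belongs to the older table syntax used in this unit. As a hedged alternative, the same cell proportions can be requested with tabulate and its summarize() option. The statistic() form in the comment is what we would expect under the reworked table command in Stata 17 or later; treat it as an assumption and check help table for your release.

```stata
* Proportion depressed (mean of the 0/1 variable) in each visit-by-group cell
tabulate visit group, summarize(depressd) nostandard nofreq

* Assumed equivalent under the newer table command (Stata 17 or later):
* table visit group, statistic(mean depressd)
```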

Let's start off with a random-effects longitudinal logit analysis. For the moment we will treat visit as continuous.

xtlogit depressd group visit, i(subj) re

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(2)       =     42.40
Log likelihood  = -134.88312                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -2.692808   .7325381    -3.68   0.000    -4.128557    -1.25706
       visit |  -.7267356   .1232912    -5.89   0.000    -.9683819   -.4850893
       _cons |   4.149321   .7270796     5.71   0.000     2.724271    5.574371
-------------+----------------------------------------------------------------
    /lnsig2u |   1.593743   .2930085                      1.019457    2.168029
-------------+----------------------------------------------------------------
     sigma_u |    2.21859   .3250328                      1.664839    2.956525
         rho |   .5993832   .0703581                       .457257    .7265487
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    69.18 Prob >= chibar2 = 0.000
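
The i(subj) option names the panel identifier each time the model is fit. A minimal sketch of the alternative, assuming a Stata release that supports xtset: declare the panel variable once and then omit i().

```stata
* Declare subj as the panel (grouping) variable once per session
xtset subj

* Same random-effects logit model, now without the i() option
xtlogit depressd group visit, re
```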

Both group and visit are statistically significant. We can also obtain the results in terms of odds ratios by replaying the model with the or option.

xtlogit, or

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(2)       =     42.40
Log likelihood  = -134.88312                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |         OR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |   .0676906   .0495859    -3.68   0.000     .0161061    .2844892
       visit |   .4834847   .0596094    -5.89   0.000     .3796969    .6156422
-------------+----------------------------------------------------------------
    /lnsig2u |   1.593743   .2930085                      1.019457    2.168029
-------------+----------------------------------------------------------------
     sigma_u |    2.21859   .3250328                      1.664839    2.956525
         rho |   .5993832   .0703581                       .457257    .7265487
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    69.18 Prob >= chibar2 = 0.000
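
The odds ratios are simply the exponentiated logit coefficients, and the confidence limits are the exponentiated limits from the coefficient table. A quick check with display, using the values printed above:

```stata
* group: exp(-2.692808) should match the reported OR of .0676906
display exp(-2.692808)

* time variable (visit): exp(-.7267356) should match the reported OR of .4834847
display exp(-.7267356)

* lower 95% limit for group: exp(-4.128557), reported as .0161061
display exp(-4.128557)
```

Note that the z statistics and p-values are unchanged, because they are still computed on the coefficient scale.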

We can compute the intraclass correlation using the loneway command.

loneway depressd subj

                  One Way Analysis of Variance for depressd: 

                                              Number of obs =       295
                                                  R-squared =    0.5749

    Source              SS         df      MS            F     Prob > F
-----------------------------------------------------------------------
Between subj         42.375141     60    .70625235      5.27     0.0000
Within subj          31.333333    234    .13390313
-----------------------------------------------------------------------
Total                73.708475    294     .2507091

         Intraclass       Asy.        
         correlation      S.E.       [95% Conf. Interval]
         ------------------------------------------------
            0.46987     0.06561       0.34127     0.59846

         Estimated SD of subj effect             .3445006
         Estimated SD within subj                .3659278
         Est. reliability of a subj mean         .8104033
              (evaluated at n=4.82)
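
The intraclass correlation from loneway (0.47) differs from the rho reported by xtlogit (0.60) because the two are computed on different scales. loneway runs a one-way ANOVA on the observed 0/1 variable, while xtlogit expresses rho on the latent logistic scale, where the residual variance is fixed at pi^2/3. A short sketch that reproduces both numbers from the output above:

```stata
* loneway: between-subject variance over total variance, observed 0/1 scale
display .3445006^2 / (.3445006^2 + .3659278^2)

* xtlogit: sigma_u^2 over (sigma_u^2 + pi^2/3), latent logistic scale
display 2.21859^2 / (2.21859^2 + _pi^2/3)
```

The first line should return about 0.470 and the second about 0.599.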

Next, we'll recode visit so that it starts with zero. This will set the constant to be the log-odds for the placebo group at the first visit.

replace visit = visit - 1

xtlogit depressd group visit, i(subj) re  

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(2)       =     42.40
Log likelihood  = -134.88312                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -2.692808   .7325381    -3.68   0.000    -4.128557    -1.25706
       visit |  -.7267356   .1232912    -5.89   0.000    -.9683819   -.4850893
       _cons |   3.422586   .6561934     5.22   0.000      2.13647    4.708701
-------------+----------------------------------------------------------------
    /lnsig2u |   1.593743   .2930085                      1.019457    2.168029
-------------+----------------------------------------------------------------
     sigma_u |    2.21859   .3250328                      1.664839    2.956525
         rho |   .5993832   .0703581                       .457257    .7265487
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    69.18 Prob >= chibar2 = 0.000
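
Because only the origin of visit changed, the slopes, log likelihood, and variance components are identical to the first model, and the new constant is the old constant plus one visit slope. A quick arithmetic check, with invlogit() used to put the constant on the probability scale for a subject whose random effect is zero:

```stata
* old constant + 1*(visit slope) = new constant
display 4.149321 + 1*(-.7267356)

* conditional probability of depression: placebo group, first visit, u_i = 0
display invlogit(3.422586)
```

The first line should give about 3.4226, matching the new _cons. The second gives about 0.97. It is a subject-specific probability evaluated at u_i = 0, so it need not equal the observed placebo proportion of .85 at the first visit.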

Let's add in the pretest measure of depression, pre, and then check to see if the covariate interacts with the treatment.

xtlogit depressd pre group visit, i(subj) re

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(3)       =     43.72
Log likelihood  = -132.59465                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .2049549   .0930959     2.20   0.028     .0224903    .3874195
       group |  -2.820389   .7282979    -3.87   0.000    -4.247827   -1.392952
       visit |  -.7370056   .1251984    -5.89   0.000      -.98239   -.4916212
       _cons |  -.7833667   1.953497    -0.40   0.688    -4.612151    3.045418
-------------+----------------------------------------------------------------
    /lnsig2u |   1.494296   .3069864                      .8926136    2.095978
-------------+----------------------------------------------------------------
     sigma_u |   2.110971   .3240197                      1.562531     2.85191
         rho |   .5752853   .0750066                      .4259893    .7120027
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    57.60 Prob >= chibar2 = 0.000

generate preXgroup = pre*group

xtlogit depressd pre group visit preXgroup, i(subj) re

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(4)       =     43.41
Log likelihood  = -132.59891                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .2031032   .1385449     1.47   0.143    -.0684399    .4746463
       group |  -2.890287   3.917997    -0.74   0.461    -10.56942    4.788847
       visit |  -.7371763   .1253533    -5.88   0.000    -.9828643   -.4914883
   preXgroup |   .0035231   .1853638     0.02   0.985    -.3597832    .3668295
       _cons |  -.7486509   2.859295    -0.26   0.793    -6.352767    4.855465
-------------+----------------------------------------------------------------
    /lnsig2u |   1.496379    .309774                      .8892332    2.103525
-------------+----------------------------------------------------------------
     sigma_u |   2.113171   .3273027                      1.559892    2.862692
         rho |   .5757942   .0756639                      .4251629    .7135478
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    57.54 Prob >= chibar2 = 0.000
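
The product term can also be built by Stata itself. The sketch below assumes Stata 11 or later, where factor-variable notation is available: c. marks pre as continuous, i. marks group as categorical, and ## requests both main effects and their interaction.

```stata
* Assumes Stata 11 or later; declare the panel variable first
xtset subj

* Main effects of pre and group plus their interaction, no generate step needed
xtlogit depressd c.pre##i.group visit, re
```

The interaction row will carry a factor-variable label rather than preXgroup, but the estimate should agree with the hand-built product term.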
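
The pre by group interaction is not significant (z = 0.02, p = 0.985), so we leave it out of the model.

Instead of assuming that the effect of visit is linear, we will code visit as categorical and then check whether the categorical version fits significantly better than the continuous version.

xi: xtlogit depressd pre group i.visit, i(subj) re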
i.visit           _Ivisit_0-5         (naturally coded; _Ivisit_0 omitted)  

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(7)       =     44.39
Log likelihood  = -131.53981                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |    .208677   .0940654     2.22   0.027     .0243122    .3930417
       group |  -2.831145   .7381423    -3.84   0.000    -4.277877   -1.384412
   _Ivisit_1 |   -.496379   .5611606    -0.88   0.376    -1.596234    .6034755
   _Ivisit_2 |  -1.944512   .6100314    -3.19   0.001    -3.140152   -.7488727
   _Ivisit_3 |   -2.06131   .6218449    -3.31   0.001    -3.280104   -.8425167
   _Ivisit_4 |  -2.685658   .6541072    -4.11   0.000    -3.967684   -1.403631
   _Ivisit_5 |  -3.871798   .7371506    -5.25   0.000    -5.316586   -2.427009
       _cons |  -.8598885   1.986971    -0.43   0.665    -4.754279    3.034502
-------------+----------------------------------------------------------------
    /lnsig2u |    1.51701   .3098693                      .9096776    2.124343
-------------+----------------------------------------------------------------
     sigma_u |   2.135082   .3307982                      1.575919    2.892645
         rho |   .5808254    .075443                       .430167    .7177839
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    58.15 Prob >= chibar2 = 0.000

test _Ivisit_1 _Ivisit_2 _Ivisit_3 _Ivisit_4 _Ivisit_5

 ( 1)  [depressd]_Ivisit_1 = 0
 ( 2)  [depressd]_Ivisit_2 = 0
 ( 3)  [depressd]_Ivisit_3 = 0
 ( 4)  [depressd]_Ivisit_4 = 0
 ( 5)  [depressd]_Ivisit_5 = 0

           chi2(  5) =   35.82
         Prob > chi2 =    0.0000

xtlogit depressd pre group visit _Ivisit_2 _Ivisit_3 _Ivisit_4 _Ivisit_5, i(subj) re

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(7)       =     44.39
Log likelihood  = -131.53981                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |    .208677   .0940654     2.22   0.027     .0243122    .3930417
       group |  -2.831145   .7381423    -3.84   0.000    -4.277877   -1.384412
       visit |   -.496379   .5611606    -0.88   0.376    -1.596234    .6034755
   _Ivisit_2 |  -.9517542   .9966142    -0.95   0.340    -2.905082    1.001574
   _Ivisit_3 |  -.5721733   1.504136    -0.38   0.704    -3.520227     2.37588
   _Ivisit_4 |  -.7001415   2.044485    -0.34   0.732    -4.707259    3.306976
   _Ivisit_5 |  -1.389903   2.606625    -0.53   0.594    -6.498794    3.718989
       _cons |  -.8598885   1.986971    -0.43   0.665    -4.754279    3.034502
-------------+----------------------------------------------------------------
    /lnsig2u |    1.51701   .3098693                      .9096776    2.124343
-------------+----------------------------------------------------------------
     sigma_u |   2.135082   .3307982                      1.575919    2.892645
         rho |   .5808254    .075443                       .430167    .7177839
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    58.15 Prob >= chibar2 = 0.000

test _Ivisit_2 _Ivisit_3 _Ivisit_4 _Ivisit_5

 ( 1)  [depressd]_Ivisit_2 = 0
 ( 2)  [depressd]_Ivisit_3 = 0
 ( 3)  [depressd]_Ivisit_4 = 0
 ( 4)  [depressd]_Ivisit_5 = 0

           chi2(  4) =    2.05
         Prob > chi2 =    0.7273

With the continuous version of visit already in the model, only k - 2 = 4 of the visit dummies can be added. Testing these four dummies (chi2(4) = 2.05, p = 0.7273) shows that dummy coding does not provide significantly more information than the continuous variable.
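
The two visit codings describe the same fitted model (both have log likelihood -131.53981), so the dummy coefficients in the second model are just departures from the straight line through visit 0 and visit 1. A hedged sketch of that arithmetic, followed by a likelihood-ratio version of the same comparison computed from the log likelihoods reported above (the linear-visit model with pre and group has log likelihood -132.59465):

```stata
* _Ivisit_2 in the mixed coding = categorical effect - 2*(visit 1 effect)
display -1.944512 - 2*(-.496379)

* _Ivisit_5 in the mixed coding = categorical effect - 5*(visit 1 effect)
display -3.871798 - 5*(-.496379)

* likelihood-ratio chi-square, categorical vs. linear visit, 4 df
display 2*(-131.53981 - (-132.59465))

* upper-tail p-value for that chi-square
display chi2tail(4, 2*(-131.53981 - (-132.59465)))
```

The first two lines should return about -.9518 and -1.3899, the coefficients shown for _Ivisit_2 and _Ivisit_5. The likelihood-ratio statistic is about 2.11 with a p-value near 0.72, in line with the Wald test.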

Finally, let's see if there is a group by visit interaction, that is, is the visit effect different for the placebo group than for the estrogen group?

generate groupXvisit = group*visit

xtlogit depressd pre group visit groupXvisit, i(subj) re

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(4)       =     43.47
Log likelihood  =  -132.3596                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .2038696   .0927866     2.20   0.028     .0220112    .3857281
       group |  -2.465363   .8971828    -2.75   0.006    -4.223809   -.7069171
       visit |   -.640613   .1889034    -3.39   0.001    -1.010857   -.2703692
 groupXvisit |  -.1616086   .2408169    -0.67   0.502     -.633601    .3103838
       _cons |  -1.004168   1.979193    -0.51   0.612    -4.883315    2.874979
-------------+----------------------------------------------------------------
    /lnsig2u |   1.499876   .3141927                      .8840696    2.115682
-------------+----------------------------------------------------------------
     sigma_u |   2.116869   .3325524                       1.55587    2.880147
         rho |   .5766481   .0767023                      .4239014    .7160262
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    57.45 Prob >= chibar2 = 0.000
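
With the product term in the model, the visit coefficient is the slope for the placebo group (group=0), and the slope for the estrogen group is the sum of the visit and groupXvisit coefficients. A minimal sketch, assuming groupXvisit is the product of group and visit and that the interaction model is the most recent estimation:

```stata
* slope on the log-odds scale for the estrogen group (group = 1)
display -.640613 + (-.1616086)

* same quantity with a standard error and confidence interval
lincom visit + groupXvisit
```

The display line should give about -.80, compared with -.64 for the placebo group.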

The answer to our last question is no: the group by visit interaction is not significant (z = -0.67, p = 0.502). So, here is our final model:

Random-effects logistic regression              Number of obs      =       295
Group variable (i): subj                        Number of groups   =        61

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       4.8
                                                               max =         6

                                                Wald chi2(3)       =     43.72
Log likelihood  = -132.59465                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .2049549   .0930959     2.20   0.028     .0224903    .3874195
       group |  -2.820389   .7282979    -3.87   0.000    -4.247827   -1.392952
       visit |  -.7370056   .1251984    -5.89   0.000      -.98239   -.4916212
       _cons |  -.7833667   1.953497    -0.40   0.688    -4.612151    3.045418
-------------+----------------------------------------------------------------
    /lnsig2u |   1.494296   .3069864                      .8926136    2.095978
-------------+----------------------------------------------------------------
     sigma_u |   2.110971   .3240197                      1.562531     2.85191
         rho |   .5752853   .0750066                      .4259893    .7120027
------------------------------------------------------------------------------

Women who scored higher on the depression pretest are more likely to be classified as depressed during the follow-up visits. Women in the estrogen group are significantly less likely to be classified as depressed. And the log-odds of being classified as depressed go down over time regardless of which group the women were placed in.
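
To put these statements on the odds-ratio scale, the final model can be refit and replayed with the or option, as was done for the first model. The display lines below simply exponentiate the coefficients printed above.

```stata
* refit the final model, then replay it as odds ratios
xtlogit depressd pre group visit, i(subj) re
xtlogit, or

* odds ratios implied by the final-model coefficients
display exp(.2049549)
display exp(-2.820389)
display exp(-.7370056)
```

These should come to roughly 1.23 for pre, 0.06 for group, and 0.48 for visit. Each additional pretest point multiplies the subject-specific odds of depression by about 1.23, and each additional month roughly halves them.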

We will look at population-averaged models using xtgee in a later unit.


Categorical Data Analysis Course

Phil Ender