Ed231C: Zero-truncated Count Models

Applied Categorical & Nonnormal Data Analysis

Zero-truncated Count Models

In this unit we will encounter the opposite situation from the zero-inflated models: data that contain no zeros at all, which call for the so-called zero-truncated models. If one fits a standard Poisson or negative binomial model to such data, the procedure assigns some probability to zero counts even though none can occur. One should be able to produce more accurate models by using a probability model that excludes zero values.
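To make "a probability model that excludes zero values" concrete, here is the zero-truncated Poisson as a sketch. The symbols \lambda (the mean of the untruncated Poisson) and y (the observed count) are notation introduced for this illustration and do not appear in the original notes. The ordinary Poisson probabilities are rescaled by the probability of a nonzero count, so they sum to one over y = 1, 2, 3, ...

```latex
P(Y = y \mid Y > 0) = \frac{e^{-\lambda}\,\lambda^{y}}{y!\,\bigl(1 - e^{-\lambda}\bigr)}, \qquad y = 1, 2, 3, \dots
\qquad\text{with}\qquad
E(Y \mid Y > 0) = \frac{\lambda}{1 - e^{-\lambda}} .
```

The zero-truncated negative binomial is built the same way, dividing the negative binomial probabilities by one minus the negative binomial probability of zero.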

We will illustrate zero-truncated count models by examining length of hospital stay (los) from the 1997 MedPar dataset. Length of stay does not and cannot have any zero values. Length of stay begins with a value of one and grows from there.

Stata 9 introduced two new commands: ztp for zero-truncated Poisson and ztnb for zero-truncated negative binomial. We will use both of these commands in this unit.

Note: The trpois0 and trnbin0 ado-files and the medpar dataset were taken from a Stata Technical Bulletin article (STB-47, January 1999) by Joseph Hilbe of Arizona State University. These two commands can be used with Stata 8 and below.

Looking at the Data

The response variable in this example is length of hospital stay. With length of hospital stay, regardless of how little time is spent in the hospital, patients are credited as having at least one day.

use http://www.gseis.ucla.edu/courses/data/medpar, clear

describe

Contains data from medpar.dta
  obs:         1,495                          
 vars:            10                          30 Jun 1998 13:10
 size:        43,355 (98.6% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
provnum         str6   %9s                    Provider number
died            float  %9.0g                  
white           float  %9.0g                  
hmo             byte   %9.0g                  HMO/readmit'
los             int    %9.0g                  Length of Stay
age80           float  %9.0g                  
age             byte   %9.0g                  Age Group
type1           byte   %8.0g                  type==     1.0000
type2           byte   %8.0g                  type==     2.0000
type3           byte   %8.0g                  type==     3.0000
-------------------------------------------------------------------------------

summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     provnum |         0
        died |      1495    .3431438    .4749179          0          1
       white |      1495    .9150502    .2789003          0          1
         hmo |      1495    .1598662    .3666046          0          1
         los |      1495    9.854181    8.832906          1        116
-------------+--------------------------------------------------------
       age80 |      1495    .2207358    .4148815          0          1
         age |      1495    5.235452    1.668898          1          9
       type1 |      1495    .7585284    .4281187          0          1
       type2 |      1495    .1772575    .3820143          0          1
       type3 |      1495     .064214    .2452159          0          1
 
tabstat los, stat(n mean sd var)

    variable |         N      mean        sd  variance
-------------+----------------------------------------
         los |      1495  9.854181  8.832906  78.02022
------------------------------------------------------

/* note: mean and variance are very different */
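One way to put a number on "very different" is the variance-to-mean ratio, which would be close to one if a Poisson distribution were appropriate. The lines below are a minimal sketch that uses the stored results of summarize. With the values from the tabstat output above (78.02022 and 9.854181), the ratio comes out at roughly 7.9.

```stata
/* variance-to-mean ratio of los; near 1 would be consistent with a Poisson */
quietly summarize los
display r(Var)/r(mean)

/* the same ratio typed in by hand from the tabstat output */
display 78.02022/9.854181
```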

tabulate los

  Length of |
       Stay |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        126        8.43        8.43
          2 |         71        4.75       13.18
          3 |         75        5.02       18.19
          4 |        104        6.96       25.15
          5 |        123        8.23       33.38
          6 |         97        6.49       39.87
          7 |        116        7.76       47.63
          8 |         92        6.15       53.78
          9 |         74        4.95       58.73
         10 |         89        5.95       64.68
         11 |         70        4.68       69.36
         12 |         70        4.68       74.05
         13 |         43        2.88       76.92
         14 |         49        3.28       80.20
         15 |         41        2.74       82.94
         16 |         43        2.88       85.82
         17 |         29        1.94       87.76
         18 |         23        1.54       89.30
         19 |         24        1.61       90.90
         20 |         19        1.27       92.17
         21 |         18        1.20       93.38
         22 |         15        1.00       94.38
         23 |         10        0.67       95.05
         24 |         11        0.74       95.79
         25 |          4        0.27       96.05
         26 |          7        0.47       96.52
         27 |          7        0.47       96.99
         28 |          5        0.33       97.32
         29 |          3        0.20       97.53
         30 |          1        0.07       97.59
         31 |          2        0.13       97.73
         32 |          6        0.40       98.13
         33 |          2        0.13       98.26
         34 |          5        0.33       98.60
         36 |          1        0.07       98.66
         42 |          1        0.07       98.73
         43 |          1        0.07       98.80
         44 |          2        0.13       98.93
         46 |          3        0.20       99.13
         48 |          1        0.07       99.20
         49 |          1        0.07       99.26
         50 |          1        0.07       99.33
         52 |          1        0.07       99.40
         57 |          1        0.07       99.46
         59 |          1        0.07       99.53
         60 |          1        0.07       99.60
         63 |          1        0.07       99.67
         65 |          1        0.07       99.73
         70 |          1        0.07       99.80
         74 |          1        0.07       99.87
         91 |          1        0.07       99.93
        116 |          1        0.07      100.00
------------+-----------------------------------
      Total |      1,495      100.00

nbvargr los, n(15)

Obtaining Parameter Estimates

(36 observations deleted)
here

  Negative Binomial Probabilities
  with mean = 9.854181 & overdispersion = .4902339

     +------------------------------+
     |  k       nbprob        nbcum |
     |------------------------------|
  1. |  0   0.02741744   0.02741744 |
  2. |  1   0.04633566   0.07375310 |
  3. |  2   0.05834830   0.13210140 |
  4. |  3   0.06509732   0.19719872 |
  5. |  4   0.06795350   0.26515222 |
     |------------------------------|
  6. |  5   0.06800788   0.33316010 |
  7. |  6   0.06610931   0.39926943 |
  8. |  7   0.06290771   0.46217713 |
  9. |  8   0.05889338   0.52107054 |
 10. |  9   0.05443054   0.57550102 |
     |------------------------------|
 11. | 10   0.04978486   0.62528592 |
 12. | 11   0.04514578   0.67043167 |
 13. | 12   0.04064433   0.71107602 |
 14. | 13   0.03636726   0.74744326 |
 15. | 14   0.03236813   0.77981138 |
     |------------------------------|
 16. | 15   0.02867597   0.80848736 |
     +------------------------------+
k was int now float

 Poisson Probabilities for lambda = 9.854181

     +------------------------------+
     |  k        pprob         pcum |
     |------------------------------|
  1. |  0   0.00005253   0.00005253 |
  2. |  1   0.00051761   0.00057014 |
  3. |  2   0.00255032   0.00312046 |
  4. |  3   0.00837710   0.01149756 |
  5. |  4   0.02063738   0.03213494 |
     |------------------------------|
  6. |  5   0.04067289   0.07280783 |
  7. |  6   0.06679966   0.13960749 |
  8. |  7   0.09403657   0.23364405 |
  9. |  8   0.11583167   0.34947574 |
 10. |  9   0.12682514   0.47630087 |
     |------------------------------|
 11. | 10   0.12497579   0.60127664 |
 12. | 11   0.11195764   0.71323431 |
 13. | 12   0.09193757   0.80517185 |
 14. | 13   0.06968996   0.87486184 |
 15. | 14   0.04905268   0.92391449 |
     |------------------------------|
 16. | 15   0.03222493   0.95613945 |
     +------------------------------+
(1 observation deleted)

Tricking Stata

It's clear from the nbvargr output that neither the Poisson nor the negative binomial distribution fits the observed data very well. Also, the negative binomial distribution expects that there will be some (approximately 40) zero values.
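The figure of approximately 40 expected zeros can be checked by hand, as a sketch: multiply the number of observations by the k = 0 probability from the nbvargr tables above. The negative binomial gives about 41 expected zeros, while the Poisson with the same mean gives fewer than one.

```stata
/* expected zeros under the fitted negative binomial: n * P(k = 0) */
display 1495*0.02741744

/* expected zeros under the Poisson with lambda = 9.854181: n * exp(-lambda) */
display 1495*exp(-9.854181)
```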

We will run standard poisson and negative binomial regressions and then we will trick Stata by subtracting one from the value of length of stay and rerunning these models.

poisson los died hmo type2 type3, nolog cluster(provnum)

Poisson regression                                Number of obs   =       1495
                                                  Wald chi2(4)    =      30.71
Log pseudolikelihood = -6846.9485                 Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
         los |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |  -.2483158   .0633446    -3.92   0.000    -.3724689   -.1241627
         hmo |  -.0753708   .0502589    -1.50   0.134    -.1738764    .0231348
       type2 |   .2498558   .0646699     3.86   0.000     .1231051    .3766066
       type3 |   .7501452   .2184939     3.43   0.001     .3219049    1.178385
       _cons |   2.264575   .0335312    67.54   0.000     2.198855    2.330295
------------------------------------------------------------------------------

/* compute aic */
display (-2*-6846.948518+2*4)/1495

9.1651485
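The quantity displayed here is a per-observation AIC, (-2*LL + 2*k)/n, with k = 4 for the four predictors. That choice of k is the one these notes use; counting the constant as well would be a different convention. As a sketch, the same number can be computed from the results Stata stores after estimation, which avoids retyping the log likelihood by hand.

```stata
/* run immediately after the poisson command above                   */
/* e(ll) is the log pseudolikelihood, e(N) the number of observations */
display (-2*e(ll) + 2*4)/e(N)
```

The same line works unchanged after nbreg, ztp and ztnb, because each of them stores e(ll) and e(N).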

nbreg los died hmo type2 type3, nolog cluster(provnum)

Negative binomial regression                      Number of obs   =       1495
Dispersion           = mean                       Wald chi2(4)    =      36.13
Log pseudolikelihood = -4782.5989                 Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
         los |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |   -.236979   .0573431    -4.13   0.000    -.3493694   -.1245886
         hmo |  -.0705928    .049645    -1.42   0.155    -.1678953    .0267097
       type2 |   .2532097   .0634972     3.99   0.000     .1287575    .3776619
       type3 |   .7365274   .2115372     3.48   0.000     .3219221    1.151133
       _cons |   2.260834   .0327839    68.96   0.000     2.196578    2.325089
-------------+----------------------------------------------------------------
    /lnalpha |  -.8318959   .0634521                     -.9562597    -.707532
-------------+----------------------------------------------------------------
       alpha |   .4352234   .0276158                      .3843277    .4928591
------------------------------------------------------------------------------

/* compute aic */
display (-2*-4782.5989+2*4)/1495

6.4034768

/* create new variable with zero */

generate newlos = los - 1

histogram newlos, discrete



nbvargr newlos, n(15)

Obtaining Parameter Estimates

(36 observations deleted)
here

  Negative Binomial Probabilities
  with mean = 8.854181 & overdispersion = .7120889

     +------------------------------+
     |  k       nbprob        nbcum |
     |------------------------------|
  1. |  0   0.06126391   0.06126391 |
  2. |  1   0.07425659   0.13552050 |
  3. |  2   0.07704805   0.21256854 |
  4. |  3   0.07546319   0.28803173 |
  5. |  4   0.07171640   0.35974813 |
     |------------------------------|
  6. |  5   0.06690429   0.42665243 |
  7. |  6   0.06163682   0.48828924 |
  8. |  7   0.05627193   0.54456115 |
  9. |  8   0.05102334   0.59558451 |
 10. |  9   0.04601700   0.64160150 |
     |------------------------------|
 11. | 10   0.04132344   0.68292499 |
 12. | 11   0.03697751   0.71990246 |
 13. | 12   0.03299088   0.75289333 |
 14. | 13   0.02936026   0.78225362 |
 15. | 14   0.02607288   0.80832648 |
     |------------------------------|
 16. | 15   0.02311026   0.83143675 |
     +------------------------------+

 Poisson Probabilities for lambda = 8.854181

     +------------------------------+
     |  k        pprob         pcum |
     |------------------------------|
  1. |  0   0.00014278   0.00014278 |
  2. |  1   0.00126423   0.00140701 |
  3. |  2   0.00559686   0.00700388 |
  4. |  3   0.01651855   0.02352243 |
  5. |  4   0.03656456   0.06008700 |
     |------------------------------|
  6. |  5   0.06474985   0.12483685 |
  7. |  6   0.09555116   0.22038800 |
  8. |  7   0.12086103   0.34124902 |
  9. |  8   0.13376568   0.47501472 |
 10. |  9   0.13159840   0.60661310 |
     |------------------------------|
 11. | 10   0.11651960   0.72313273 |
 12. | 11   0.09378960   0.81692231 |
 13. | 12   0.06920251   0.88612479 |
 14. | 13   0.04713320   0.93325800 |
 15. | 14   0.02980899   0.96306700 |
     |------------------------------|
 16. | 15   0.01759561   0.98066264 |
     +------------------------------+
(0 observations deleted)



poisson newlos died hmo type2 type3, nolog cluster(provnum)

Poisson regression                                Number of obs   =       1495
                                                  Wald chi2(4)    =      31.42
Log pseudolikelihood = -7229.6375                 Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
      newlos |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |   -.277442   .0703259    -3.95   0.000    -.4152782   -.1396057
         hmo |  -.0849026    .056734    -1.50   0.135    -.1960993    .0262941
       type2 |   .2778412    .071253     3.90   0.000     .1381879    .4174945
       type3 |   .8166476   .2318683     3.52   0.000     .3621941    1.271101
       _cons |   2.153754   .0372609    57.80   0.000     2.080724    2.226784
------------------------------------------------------------------------------

/* compute aic */
display (-2*-7229.6375+2*4)/1495

9.677107

nbreg newlos died hmo type2 type3, nolog cluster(provnum)

Negative binomial regression                      Number of obs   =       1495
Dispersion           = mean                       Wald chi2(4)    =      37.00
Log pseudolikelihood = -4742.6087                 Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
      newlos |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |  -.2650532   .0644419    -4.11   0.000    -.3913571   -.1387494
         hmo |  -.0793184      .0563    -1.41   0.159    -.1896643    .0310275
       type2 |   .2826808    .069884     4.05   0.000     .1457107    .4196509
       type3 |   .8011306    .224282     3.57   0.000      .361546    1.240715
       _cons |   2.149526   .0365384    58.83   0.000     2.077912     2.22114
-------------+----------------------------------------------------------------
    /lnalpha |   -.448078   .0559217                     -.5576824   -.3384736
-------------+----------------------------------------------------------------
       alpha |   .6388549   .0357258                      .5725344    .7128576
------------------------------------------------------------------------------

/* compute aic */
display (-2*-4742.6087+2*4)/1495

6.3499782

/* Summary Table  
      variable  model     log likelihood     aic
      los       poisson     -6846.9485       9.1651485
      los       nbreg       -4782.5989       6.4034768
      newlos    poisson     -7229.6375       9.677107
      newlos    nbreg       -4742.6087       6.3499782 */

The negative binomial regression with the trick is only slightly better and the poisson regression with the trick is actually worse.

Zero-truncated Poisson

We will begin the zero-truncated models with a zero-truncated poisson regression even though it is unlikely that a poisson distribution will be appropriate for these data since the mean and variance of los are nowhere near equal.

ztp los died hmo type2 type3, nolog cluster(provnum) 
 
Zero-truncated Poisson regression                 Number of obs   =       1495
                                                  Wald chi2(4)    =      30.68
Log pseudolikelihood = -6846.6528                 Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
         los |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |   -.248681   .0634856    -3.92   0.000    -.3731105   -.1242514
         hmo |  -.0755112   .0503728    -1.50   0.134    -.1742401    .0232177
       type2 |   .2500681   .0647042     3.86   0.000     .1232501     .376886
       type3 |   .7503999   .2185408     3.43   0.001     .3220678    1.178732
       _cons |   2.264474   .0335532    67.49   0.000     2.198711    2.330237
------------------------------------------------------------------------------

/* compute aic */
display (-2*-6846.6528+2*4)/1495

9.1647529

ztp, irr

Zero-truncated Poisson regression                 Number of obs   =       1495
                                                  Wald chi2(4)    =      30.68
Log pseudolikelihood = -6846.6528                 Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
         los |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |   .7798287   .0495079    -3.92   0.000     .6885891    .8831577
         hmo |   .9272693   .0467091    -1.50   0.134     .8400952    1.023489
       type2 |   1.284113   .0830875     3.86   0.000     1.131167    1.457738
       type3 |   2.117847    .462836     3.43   0.001     1.379978    3.250251
------------------------------------------------------------------------------
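The irr option reports exponentiated coefficients. As a quick check, exponentiating the died coefficient from the first ztp table reproduces the IRR of .7798287 shown above.

```stata
/* IRR for died = exp(coefficient), typed in from the ztp output */
display exp(-.248681)

/* or from the stored coefficient, run right after ztp */
display exp(_b[died])
```

Read this way, patients who died have an expected length of stay about 22 percent lower than those who did not, holding the other predictors constant.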

Zero-truncated Negative Binomial

ztnb los died hmo type2 type3, nolog cluster(provnum) 

Zero-truncated negative binomial regression       Number of obs   =       1495
Dispersion     = mean                             Wald chi2(4)    =      36.01
Log likelihood = -4737.535                        Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
         los |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |  -.2521884    .061533    -4.10   0.000    -.3727908   -.1315859
         hmo |  -.0754173   .0533132    -1.41   0.157    -.1799091    .0290746
       type2 |   .2685095   .0666474     4.03   0.000      .137883    .3991359
       type3 |   .7668101   .2183505     3.51   0.000      .338851    1.194769
       _cons |   2.224028    .034727    64.04   0.000     2.155964    2.292091
-------------+----------------------------------------------------------------
    /lnalpha |   -.630108   .0764019                      -.779853    -.480363
-------------+----------------------------------------------------------------
       alpha |   .5325343   .0406866                      .4584734    .6185588
------------------------------------------------------------------------------

/* compute aic */
display (-2*-4737.535+2*4)/1495

6.3431906

ztnb, irr

Zero-truncated negative binomial regression       Number of obs   =       1495
Dispersion     = mean                             Wald chi2(4)    =      36.01
Log likelihood = -4737.535                        Prob > chi2     =     0.0000

                               (Std. Err. adjusted for 54 clusters in provnum)
------------------------------------------------------------------------------
             |               Robust
         los |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        died |   .7770984   .0478172    -4.10   0.000     .6888093    .8767039
         hmo |   .9273564   .0494403    -1.41   0.157     .8353461    1.029501
       type2 |   1.308013   .0871756     4.03   0.000     1.147841    1.490536
       type3 |   2.152888   .4700841     3.51   0.000     1.403334    3.302795
-------------+----------------------------------------------------------------
    /lnalpha |   -.630108   .0764019                      -.779853    -.480363
-------------+----------------------------------------------------------------
       alpha |   .5325343   .0406866                      .4584734    .6185588
------------------------------------------------------------------------------
                    
predict plos

tablist los plos, sort(v) clean

   los       plos   Freq  
      1   6.662014     18  
      1   7.183877     70  
      1   8.572936      2  
      1   8.714004      2  
      1   9.244488     13  
      1   9.396606      8  
      1   12.09191      7  
      1   15.46608      4  
      1   19.90235      2  
      2   6.662014      7  
      2   7.183877     22  
      2   8.572936      3  
      2   9.244488     22  
      2   9.396606      5  
      2   12.09191      6  
      2   15.46608      5  
      2   19.90235      1  
      3   6.662014      3  
      3   7.183877     17  
      3   8.572936      9  
      3   8.714004      2  
      3   9.244488     33  
      3   9.396606      5  
      3   11.21351      1  
      3   12.09191      2  
      3   15.46608      3  
      4   6.662014      5  
      4   7.183877     15  
      4   8.572936     11  
      4   8.714004      1  
      4   9.244488     50  
      4   9.396606      9  
      4   11.21351      1  
      4   12.09191      8  
      4   15.46608      2  
      4   19.90235      2  
      5   6.662014      2  
      5   7.183877     19  
      5   8.572936     16  
      5   9.244488     61  
      5   9.396606      5  
      5   11.21351      3  
      5   12.09191      9  
      5   14.34257      1  
      5   15.46608      5  
      5   18.45657      1  
      5   19.90235      1  
      6   6.662014      3  
      6   7.183877     10  
      6   8.572936     11  
      6   9.244488     50  
      6   9.396606      6  
      6   11.21351      2  
      6   12.09191     11  
      6   15.46608      1  
      6   19.90235      3  
      7   6.662014      3  
      7   7.183877     20  
      7   8.572936     16  
      7   8.714004      1  
      7   9.244488     54  
      7   9.396606     10  
      7   11.21351      2  
      7   12.09191      8  
      7   15.46608      1  
      7   19.90235      1  
      8   6.662014      3  
      8   7.183877     18  
      8   8.572936      8  
      8   8.714004      1  
      8   9.244488     49  
      8   9.396606      4  
      8   12.09191      7  
      8   15.46608      1  
      8   19.90235      1  
      9   6.662014      3  
      9   7.183877      5  
      9   8.572936     15  
      9   9.244488     34  
      9   9.396606      7  
      9   12.09191      6  
      9   15.46608      1  
      9   19.90235      3  
     10   6.662014      3  
     10   7.183877     18  
     10   8.572936      2  
     10   9.244488     53  
     10   9.396606      2  
     10   12.09191      7  
     10   15.46608      1  
     10   19.90235      3  
     11   6.662014      3  
     11   7.183877     10  
     11   8.572936      9  
     11   9.244488     32  
     11   9.396606      1  
     11   11.21351      1  
     11   12.09191      8  
     11   15.46608      2  
     11   19.90235      4  
     12   6.662014      2  
     12   7.183877     10  
     12   8.572936      6  
     12   9.244488     35  
     12   9.396606      3  
     12   11.21351      2  
     12   12.09191     10  
     12   19.90235      2  
     13   6.662014      3  
     13   7.183877      6  
     13   8.572936      5  
     13   9.244488     19  
     13   9.396606      2  
     13   11.21351      1  
     13   12.09191      6  
     13   15.46608      1  
     14   6.662014      6  
     14   7.183877      9  
     14   8.572936      3  
     14   9.244488     19  
     14   9.396606      3  
     14   11.21351      2  
     14   12.09191      3  
     14   15.46608      1  
     14   19.90235      3  
     15   6.662014      1  
     15   7.183877      6  
     15   8.572936      2  
     15   9.244488     18  
     15   9.396606      3  
     15   12.09191      8  
     15   19.90235      3  
     16   7.183877      8  
     16   8.572936      2  
     16   9.244488     15  
     16   9.396606      1  
     16   11.21351      2  
     16   12.09191     12  
     16   15.46608      2  
     16   19.90235      1  
     17   6.662014      1  
     17   7.183877      6  
     17   8.572936      3  
     17   9.244488     11  
     17   9.396606      4  
     17   15.46608      2  
     17   18.45657      1  
     17   19.90235      1  
     18   6.662014      1  
     18   7.183877      3  
     18   8.572936      3  
     18   8.714004      1  
     18   9.244488     13  
     18   12.09191      1  
     18   19.90235      1  
     19   6.662014      1  
     19   7.183877      2  
     19   8.572936      3  
     19   8.714004      2  
     19   9.244488      8  
     19   12.09191      4  
     19   15.46608      4  
     20   6.662014      1  
     20   7.183877      4  
     20   9.244488      9  
     20   9.396606      2  
     20   12.09191      3  
     21   7.183877      3  
     21   8.572936      1  
     21   9.244488      8  
     21   9.396606      2  
     21   11.21351      1  
     21   12.09191      3  
     22   7.183877      2  
     22   8.572936      1  
     22   8.714004      1  
     22   9.244488      4  
     22   9.396606      1  
     22   12.09191      2  
     22   15.46608      3  
     22   19.90235      1  
     23   7.183877      1  
     23   8.572936      2  
     23   9.244488      6  
     23   9.396606      1  
     24   7.183877      3  
     24   9.244488      5  
     24   9.396606      2  
     24   19.90235      1  
     25   7.183877      2  
     25   9.244488      2  
     26   7.183877      2  
     26   9.244488      2  
     26   12.09191      2  
     26   19.90235      1  
     27   7.183877      1  
     27   9.244488      1  
     27   9.396606      1  
     27   11.21351      1  
     27   12.09191      1  
     27   19.90235      2  
     28   9.244488      1  
     28   9.396606      1  
     28   12.09191      1  
     28   15.46608      2  
     29   8.572936      1  
     29   9.244488      1  
     29   19.90235      1  
     30   9.396606      1  
     31   8.572936      1  
     31   9.244488      1  
     32   7.183877      1  
     32   9.244488      2  
     32   9.396606      1  
     32   12.09191      1  
     32   19.90235      1  
     33   9.244488      1  
     33   9.396606      1  
     34   9.244488      1  
     34   9.396606      2  
     34   11.21351      1  
     34   12.09191      1  
     36   6.662014      1  
     42   19.90235      1  
     43   12.09191      1  
     44   12.09191      2  
     46   9.244488      1  
     46   19.90235      2  
     48   19.90235      1  
     49   15.46608      1  
     50   7.183877      1  
     52   19.90235      1  
     57   19.90235      1  
     59   19.90235      1  
     60   9.244488      1  
     63   12.09191      1  
     65   19.90235      1  
     70   15.46608      1  
     74   19.90235      1  
     91   15.46608      1  
    116   19.90235      1  
     
tab plos

  predicted |
  number of |
     events |      Freq.     Percent        Cum.
------------+-----------------------------------
   6.662014 |         70        4.68        4.68
   7.183877 |        294       19.67       24.35
   8.572936 |        135        9.03       33.38
   8.714004 |         11        0.74       34.11
   9.244488 |        635       42.47       76.59
   9.396606 |         93        6.22       82.81
   11.21351 |         20        1.34       84.15
   12.09191 |        141        9.43       93.58
   14.34257 |          1        0.07       93.65
   15.46608 |         44        2.94       96.59
   18.45657 |          2        0.13       96.72
   19.90235 |         49        3.28      100.00
------------+-----------------------------------
      Total |      1,495      100.00
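The values of plos can be traced back to the ztnb coefficients. The default predict statistic, labelled "predicted number of events" in the table above, appears to be exp(xb). The sketch below reproduces two of the tabulated values by hand from the coefficient table.

```stata
/* died = 0, hmo = 0, type1: exp(_cons), about 9.244488 */
display exp(2.224028)

/* died = 1, hmo = 0, type1: exp(_cons + b[died]), about 7.183877 */
display exp(2.224028 - .2521884)
```

If that reading is right, plos is the mean of the untruncated negative binomial rather than the mean conditional on a positive count.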
      
univar los plos

                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
     los    1495     9.85     8.83     1.00     4.00     8.00    13.00   116.00
    plos    1495     9.51     2.63     6.66     8.57     9.24     9.24    19.90
-------------------------------------------------------------------------------
      
corr los plos
(obs=1495)

             |      los     plos
-------------+------------------
         los |   1.0000
        plos |   0.3060   1.0000
        

/* Summary Table  
      variable  model     log likelihood     aic
      los       poisson     -6846.9485       9.1651485
      los       nbreg       -4782.5989       6.4034768
      newlos    poisson     -7229.6375       9.677107
      newlos    nbreg       -4742.6087       6.3499782 
      los       ztp         -6846.6528       9.1647529
      los       ztnb        -4737.535        6.3431906 */

The zero-truncated negative binomial provided only a slight improvement over the negative binomial with the subtraction trick, and the zero-truncated Poisson was only slightly better than the standard Poisson regression.

In the final analysis, the predicted counts seem to match the observed counts only moderately well (the correlation between los and plos is 0.3060). This may be due, in part, to the fact that there are only twelve different covariate patterns among the four predictors, one of which (hmo) was not significant.

Categorical Data Analysis Course

Phil Ender 6dec05