Linear Statistical Models: Regression

"Four Regressions" in Concert


Students who are new to regression analysis are often confused by the different output that different statistics programs produce. In fact, the printouts are more alike than different: the information is organized a little differently from one program to the next, and programs sometimes use different names for the same quantity. In this unit, we present the same regression model run in Stata, SAS, SPSS, and R, using Agresti and Finlay's crime dataset from Chapter 9. The Stata, SAS, and SPSS runs use 50 observations (Washington, D.C. removed). The R run reports 45 residual degrees of freedom, which implies 51 observations, so it appears to include Washington, D.C.; that would account for its different estimates.

Stata Output

  Source |       SS       df       MS                  Number of obs =      50
---------+------------------------------               F(  5,    44) =   23.58
   Model |  3123844.66     5  624768.933               Prob > F      =  0.0000
Residual |  1165780.56    44  26495.0127               R-squared     =  0.7282
---------+------------------------------               Adj R-squared =  0.6973
   Total |  4289625.22    49  87543.3718               Root MSE      =  162.77

------------------------------------------------------------------------------
   crime |      Coef.   Std. Err.       t     P>|t|                       Beta
---------+--------------------------------------------------------------------
pctmetro |   7.541909   1.170283      6.445   0.000                   .5525032
pctwhite |  -2.489705   2.579502     -0.965   0.340                   -.093095
   pcths |   3.300208   7.251138      0.455   0.651                   .0628064
 poverty |   21.28312   10.12292      2.102   0.041                   .3083509
  single |   80.88412   20.28576      3.987   0.000                   .4032581
   _cons |   -1173.33   632.2908     -1.856   0.070                          .
------------------------------------------------------------------------------
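The summary statistics in the upper-right corner of the Stata output can all be recovered from the analysis-of-variance table on the left. The following Python sketch recomputes them from the printed sums of squares. The variable names are illustrative only and do not belong to any of the packages shown.

```python
import math

# Values copied from the Stata analysis-of-variance table above.
ss_model, df_model = 3123844.66, 5
ss_resid, df_resid = 1165780.56, 44
ss_total, df_total = ss_model + ss_resid, df_model + df_resid

ms_model = ss_model / df_model            # "MS" for Model
ms_resid = ss_resid / df_resid            # "MS" for Residual
ms_total = ss_total / df_total            # "MS" for Total (sample variance of crime)

f_stat = ms_model / ms_resid              # F(5, 44)
r_squared = ss_model / ss_total           # R-squared
adj_r_squared = 1 - ms_resid / ms_total   # Adj R-squared
root_mse = math.sqrt(ms_resid)            # Root MSE

print(f"F             = {f_stat:.2f}")
print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"Root MSE      = {root_mse:.2f}")
```

The printed values should agree with the Stata header up to rounding. SAS and SPSS report the same four quantities under slightly different labels.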

SAS Output

Model: MODEL1
Dependent Variable: CRIME

Analysis of Variance

                         Sum of         Mean
Source          DF      Squares       Square      F Value       Prob>F

Model            5 3123844.6346 624768.92693       23.581       0.0001
Error           44 1165780.5854 26495.013304
C Total         49   4289625.22

    Root MSE     162.77289     R-square       0.7282
    Dep Mean     566.66000     Adj R-sq       0.6973
    C.V.          28.72497

Parameter Estimates

                 Parameter      Standard    T for H0:
Variable  DF      Estimate         Error   Parameter=0    Prob > |T|

INTERCEP   1  -1173.329937  632.29078909        -1.856        0.0702
PCTMETRO   1      7.541909    1.17028250         6.445        0.0001
PCTWHITE   1     -2.489704    2.57950225        -0.965        0.3397
PCTHS      1      3.300204    7.25113773         0.455        0.6513
POVERTY    1     21.283118   10.12291688         2.102        0.0413
SINGLE     1     80.884132   20.28576012         3.987        0.0002

              Standardized
Variable  DF      Estimate

INTERCEP   1    0.00000000
PCTMETRO   1    0.55250322
PCTWHITE   1   -0.09309493
PCTHS      1    0.06280634
POVERTY    1    0.30835078
SINGLE     1    0.40325817
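Two columns in the SAS output are simple ratios of other printed numbers: "T for H0: Parameter=0" is the parameter estimate divided by its standard error, and "C.V." is the root MSE expressed as a percentage of the dependent mean. A minimal Python sketch of both, using values copied from the output above (the dictionary and variable names are illustrative):

```python
# (estimate, standard error) pairs copied from the SAS "Parameter Estimates" table.
estimates = {
    "INTERCEP": (-1173.329937, 632.29078909),
    "PCTMETRO": (7.541909, 1.17028250),
    "PCTWHITE": (-2.489704, 2.57950225),
    "PCTHS": (3.300204, 7.25113773),
    "POVERTY": (21.283118, 10.12291688),
    "SINGLE": (80.884132, 20.28576012),
}

# t statistic for H0: parameter = 0 is estimate / standard error.
t_values = {name: b / se for name, (b, se) in estimates.items()}

# Coefficient of variation: root MSE as a percentage of the mean of CRIME.
root_mse, dep_mean = 162.77289, 566.66
cv = 100 * root_mse / dep_mean

for name, t in t_values.items():
    print(f"{name:<9} t = {t:7.3f}")
print(f"C.V. = {cv:.5f}")
```

Stata and SPSS print the same t ratios in their "t" and "T" columns; neither prints the coefficient of variation in the output shown here.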

SPSS Output

                   * * * *   M U L T I P L E   R E G R E S S I O N   * * * *


Listwise Deletion of Missing Data

Equation Number 1    Dependent Variable..   CRIME

Block Number  1.  Method:  Enter


Variable(s) Entered on Step Number  1..    SINGLE
                                    2..    PCTMETRO
                                    3..    PCTHS
                                    4..    PCTWHITE
                                    5..    POVERTY


Multiple R          .85337   Analysis of Variance
R Square            .72823               DF   Sum of Squares     Mean Square
Adjusted R Square   .69735   Regression   5    3123844.63463    624768.92693
Standard Error   162.77289   Residual    44    1165780.58537     26495.01330

                             F =      23.58062       Signif F =  .0000


------------------ Variables in the Equation ------------------

Variable              B        SE B       Beta         T  Sig T
PCTMETRO       7.541909    1.170282    .552503     6.445  .0000
PCTWHITE      -2.489704    2.579502   -.093095     -.965  .3397
PCTHS          3.300204    7.251138    .062806      .455  .6513
POVERTY       21.283118   10.122917    .308351     2.102  .0413
SINGLE        80.884132   20.285760    .403258     3.987  .0002
(Constant) -1173.329937  632.290789               -1.856  .0702


End Block Number   1   All requested variables entered.
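Unlike Stata and SAS, the SPSS analysis-of-variance table shown here has no Total row, and SPSS adds a "Multiple R" line the other two omit. Both can be derived from what is printed: the total sum of squares is the regression plus the residual sum of squares, and Multiple R is the square root of R Square. A short Python sketch under those definitions (variable names are illustrative):

```python
import math

# Values copied from the SPSS analysis-of-variance table above.
ss_regression = 3123844.63463
ss_residual = 1165780.58537
df_residual = 44

# SPSS does not print this row; Stata labels it "Total" and SAS "C Total".
ss_total = ss_regression + ss_residual

r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)                      # "Multiple R"
standard_error = math.sqrt(ss_residual / df_residual)  # "Standard Error" (root MSE)

print(f"Total SS       = {ss_total:.2f}")
print(f"R Square       = {r_square:.5f}")
print(f"Multiple R     = {multiple_r:.5f}")
print(f"Standard Error = {standard_error:.5f}")
```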

R Output

Call:
lm(formula = crime ~ pctmetro + pctwhite + pcths + poverty + single)

Coefficients:
(Intercept)     pctmetro     pctwhite        pcths      poverty       single  
  -1796.322        7.609       -4.486        8.658       26.250      109.452

summary(m1)
  
Call:
lm(formula = crime ~ pctmetro + pctwhite + pcths + poverty + single)

Residuals:
     Min       1Q   Median       3Q      Max 
-533.262  -87.243   -9.825  110.443  397.775 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1796.322    668.614  -2.687   0.0101 *  
pctmetro        7.609      1.295   5.875 4.79e-07 ***
pctwhite       -4.486      2.777  -1.615   0.1132    
pcths           8.658      7.826   1.106   0.2745    
poverty        26.250     11.082   2.369   0.0222 *  
single        109.452     20.354   5.377 2.59e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 180.2 on 45 degrees of freedom
Multiple R-squared: 0.8499,	Adjusted R-squared: 0.8332 
F-statistic: 50.94 on 5 and 45 DF,  p-value: < 2.2e-16 
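The R output reports 45 residual degrees of freedom, whereas the other three programs report 44. With five predictors and an intercept, that corresponds to 51 observations rather than 50, which suggests the R model was fit with Washington, D.C. included and would explain why its coefficients differ. The Python sketch below checks that arithmetic and confirms that R's adjusted R-squared is consistent with 51 observations; the helper name `n_obs` is hypothetical.

```python
def n_obs(df_resid: int, n_predictors: int) -> int:
    """Sample size implied by the residual df of a model with an intercept."""
    return df_resid + n_predictors + 1

n_predictors = 5  # pctmetro, pctwhite, pcths, poverty, single

n_stata_sas_spss = n_obs(44, n_predictors)  # residual df printed by Stata, SAS, SPSS
n_r = n_obs(45, n_predictors)               # residual df printed by R

# Adjusted R-squared recomputed from R's printed Multiple R-squared.
r_squared_r = 0.8499
adj_r_squared_r = 1 - (1 - r_squared_r) * (n_r - 1) / (n_r - n_predictors - 1)

print(f"n (Stata, SAS, SPSS) = {n_stata_sas_spss}")
print(f"n (R)                = {n_r}")
print(f"Adj R-squared (R)    = {adj_r_squared_r:.4f}")
```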


Linear Statistical Models Course

Phil Ender, 14dec99