Introduction to Research Design and Statistics

Regression


Regression cannot really be considered separately from correlation. Regression encompasses the linear model, the techniques for obtaining the coefficients of the linear model, and the use of the model in prediction and explanation.

Simple Linear Regression

In simple linear regression there is only one predictor variable.

Linear Model

Y = b0 + b1*X + e

where Y is the score on the dependent (response, criterion, outcome) variable; b0 is the regression constant; b1 is the regression coefficient; X is the score on the predictor (independent, explanatory) variable; and e is the error in predicting Y, also called the residual.

Prediction Equation

Yhat = b0 + b1*X

where Yhat is the predicted score.

The goal is to find a constant and a regression coefficient that define a straight line such that the sum of squared residuals is a minimum.
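As a rough illustration of the least-squares criterion, the sketch below compares the sum of squared residuals from the fitted line with that from a line whose slope has been shifted by an arbitrary amount. It assumes the hsb2 dataset used in this section's Stata example is still available at the same URL; the names e_ols, e_alt, ss_ols and ss_alt and the 0.05 shift are hypothetical choices made for this example.

```stata
* Load the hsb2 data (same URL as the Stata example in this section)
use http://www.philender.com/courses/data/hsb2, clear
quietly regress write read

* Squared residuals from the least-squares line
generate double e_ols = (write - (_b[_cons] + _b[read]*read))^2
* Squared residuals from a line with an arbitrarily altered slope
generate double e_alt = (write - (_b[_cons] + (_b[read] + .05)*read))^2

* summarize leaves the column total in r(sum)
quietly summarize e_ols
scalar ss_ols = r(sum)
quietly summarize e_alt
scalar ss_alt = r(sum)

* ss_ols should match the Residual SS from regress and be smaller than ss_alt
display "SS residual, fitted line:  " ss_ols
display "SS residual, altered line: " ss_alt
```

Any other choice of constant or slope should likewise give a larger sum of squared residuals than the fitted line.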

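Formulae for Regression Coefficients

b1 = sum((X - Xbar)*(Y - Ybar)) / sum((X - Xbar)^2)

b0 = Ybar - b1*Xbar

where Xbar and Ybar are the means of X and Y, and the sums run over all observations.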
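One common way to write the least-squares solution is that the regression coefficient equals the covariance of X and Y divided by the variance of X, and the constant equals the mean of Y minus the coefficient times the mean of X. The sketch below computes both by hand and compares them with the regress estimates. It assumes the hsb2 data from this section's Stata example; the scalar names b0, b1, xbar and ybar are hypothetical.

```stata
use http://www.philender.com/courses/data/hsb2, clear

* Covariance of write and read, and variance of read (the second variable listed)
quietly correlate write read, covariance
scalar b1 = r(cov_12) / r(Var_2)

* Means of the criterion and the predictor
quietly summarize write
scalar ybar = r(mean)
quietly summarize read
scalar xbar = r(mean)
scalar b0 = ybar - b1*xbar

display "by hand: b1 = " b1 "   b0 = " b0

* Compare with the coefficients estimated by regress
quietly regress write read
display "regress: b1 = " _b[read] "   b0 = " _b[_cons]
```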

Interpreting the Regression Coefficient

The regression coefficient indicates how much the dependent (criterion) variable is predicted to change for a one-unit change in the independent (predictor) variable.
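To make this interpretation concrete, the following sketch displays the predicted change in write for a 1-point and for a 10-point difference in read. It assumes the hsb2 data from this section's Stata example; the 10-point difference is an arbitrary value chosen for illustration.

```stata
use http://www.philender.com/courses/data/hsb2, clear
quietly regress write read

* Predicted difference in write for a 1-point difference in read
display _b[read]

* Predicted difference in write for a 10-point difference in read,
* reported with its standard error and confidence interval
lincom 10*read
```

Because the model is linear, the 10-point result is simply ten times the regression coefficient.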

Stata Example

This example uses write as the criterion variable and read as the predictor variable.

use http://www.philender.com/courses/data/hsb2, clear

regress write read, beta

  Source |       SS       df       MS             Number of obs =     200
---------+------------------------------          F(  1,   198) =  109.52
   Model |  6367.42127     1  6367.42127          Prob > F      =  0.0000
Residual |  11511.4537   198  58.1386552          R-squared     =  0.3561
---------+------------------------------          Adj R-squared =  0.3529
   Total |   17878.875   199   89.843593          Root MSE      =  7.6249

-------------------------------------------------------------------------
   write |      Coef.   Std. Err.       t     P>|t|                  Beta
---------+---------------------------------------------------------------
    read |   .5517051   .0527178     10.465   0.000              .5967765
   _cons |   23.95944   2.805744      8.539   0.000                     .
-------------------------------------------------------------------------

corr write read

         |    write     read
---------+------------------
   write |   1.0000
    read |   0.5968   1.0000
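The link between regression and correlation can be checked directly: with a single predictor, the Beta column of the regress output equals the correlation between write and read, and R-squared equals that correlation squared. A minimal sketch, assuming the same hsb2 data; the scalar name rxy is hypothetical.

```stata
use http://www.philender.com/courses/data/hsb2, clear

* Pearson correlation between write and read
quietly correlate write read
scalar rxy = r(rho)

quietly regress write read

* In simple regression the squared correlation equals R-squared
display "r         = " rxy
display "r squared = " rxy^2
display "R-squared = " e(r2)
```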

Based on the regress output above, with the coefficients rounded to two decimal places, the prediction equation is:

generate yhat = 23.96 + .55*read

Someone with a read score of 60 would have a predicted score of 23.96 + 0.55*60 = 56.96, whereas someone with a read score of 40 would have a predicted score of 23.96 + 0.55*40 = 45.96.
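The hand calculation uses rounded coefficients. As a sketch of an alternative, predict can generate predicted scores and residuals from the full-precision estimates, so the results differ slightly from the rounded values (roughly 57.06 and 46.03 for read scores of 60 and 40). The variable names yhat_full and resid are hypothetical, and the hsb2 data are assumed to be available at the same URL.

```stata
use http://www.philender.com/courses/data/hsb2, clear
quietly regress write read

* Predicted scores from the full-precision coefficients (xb is the default)
predict yhat_full
* Residuals: observed write minus predicted write
predict resid, residuals

* Predicted write for read scores of 60 and 40
display _b[_cons] + _b[read]*60
display _b[_cons] + _b[read]*40

* Inspect the first few observations
list read write yhat_full resid in 1/5
```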

Multiple Linear Regression

Multiple linear regression, often called simply multiple regression, uses more than one predictor variable and takes into account the correlations among the predictor variables.
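One way to see the effect of correlated predictors is to inspect the correlation matrix and then compare the coefficient for read with and without the other predictors in the model. The sketch below assumes the hsb2 data and the predictors used in this section's multiple regression example; the scalar names b_simple and b_multi are hypothetical.

```stata
use http://www.philender.com/courses/data/hsb2, clear

* Correlations among the criterion and the three predictors
correlate write read math female

* Coefficient for read when it is the only predictor
quietly regress write read
scalar b_simple = _b[read]

* Coefficient for read after math and female are added
quietly regress write read math female
scalar b_multi = _b[read]

display "read alone:            " b_simple
display "read with math, female: " b_multi
```

The coefficient for read shrinks once math enters the model because read and math are correlated and share some of their predictive information about write.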

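Linear Model

Y = b0 + b1*X1 + b2*X2 + ... + bk*Xk + e

Prediction Equation

Yhat = b0 + b1*X1 + b2*X2 + ... + bk*Xk

where k is the number of predictor variables.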

Interpreting the Regression Coefficients

Each regression coefficient indicates how much the dependent (criterion) variable is predicted to change for a one-unit change in that independent (predictor) variable, with all of the other variables in the regression model held constant.
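The phrase "held constant" can be illustrated by predicting write for two hypothetical students who differ by one point on read but have identical math and female values. This is a sketch assuming the hsb2 data and the model from this section's Stata example; the scores of 50 and 51 and the scalar names p1 and p2 are arbitrary choices for illustration.

```stata
use http://www.philender.com/courses/data/hsb2, clear
quietly regress write read math female

* Two hypothetical female students with math = 50 who differ only on read
scalar p1 = _b[_cons] + _b[read]*50 + _b[math]*50 + _b[female]*1
scalar p2 = _b[_cons] + _b[read]*51 + _b[math]*50 + _b[female]*1

* The difference in predicted write equals the coefficient for read
display "difference in predictions: " p2 - p1
display "coefficient for read:      " _b[read]
```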

Stata Example

regress write read math female, beta

  Source |       SS       df       MS             Number of obs =     200
---------+------------------------------          F(  3,   196) =   72.52
   Model |  9405.34864     3  3135.11621          Prob > F      =  0.0000
Residual |  8473.52636   196  43.2322773          R-squared     =  0.5261
---------+------------------------------          Adj R-squared =  0.5188
   Total |   17878.875   199   89.843593          Root MSE      =  6.5751

-------------------------------------------------------------------------
   write |      Coef.   Std. Err.       t     P>|t|                  Beta
---------+---------------------------------------------------------------
    read |   .3252389   .0607348      5.355   0.000              .3518093
    math |   .3974826   .0664037      5.986   0.000               .392864
  female |    5.44337   .9349987      5.822   0.000              .2866927
   _cons |   11.89566   2.862845      4.155   0.000                     .
-------------------------------------------------------------------------
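The Beta column reports standardized coefficients. As a rough check, a standardized coefficient can be recovered by multiplying the raw coefficient by the ratio of the predictor's standard deviation to the criterion's standard deviation. The sketch below does this for math and should give a value close to the .392864 shown in the output. It assumes the hsb2 data have no missing values on these variables, so the standard deviations are computed on the same 200 observations; the scalar names are hypothetical.

```stata
use http://www.philender.com/courses/data/hsb2, clear

* Standard deviations of the predictor and the criterion
quietly summarize math
scalar sd_math = r(sd)
quietly summarize write
scalar sd_write = r(sd)

quietly regress write read math female

* Standardized coefficient = raw coefficient * sd(X) / sd(Y)
scalar beta_math = _b[math]*sd_math/sd_write
display "Beta for math = " beta_math
```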



Phil Ender, 30Jun98