Linear Statistical Models: Regression
Regression Assumptions
Assumptions in Regression Analysis
- Independence
  - The residuals are serially independent (no autocorrelation).
  - The residuals are not correlated with any of the independent (predictor) variables.
- Linearity
  - The relationship between the dependent variable and each of the independent variables is linear.
- Mean of Residuals
  - The mean of the residuals is zero.
- Homogeneity of Variance
  - The variance of the residuals at all levels of the independent variables is constant.
- Errors in Variables
  - The independent (predictor) variables are measured without error.
- Model Specification
  - All relevant variables are included in the model.
  - No irrelevant variables are included in the model.
- Normality
  - The residuals are normally distributed.
    This assumption is needed for valid tests of significance but not for estimation of the regression coefficients.
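Several of the assumptions listed above can be examined numerically once a model has been fit. The following is a minimal sketch, not a complete diagnostic procedure. It assumes NumPy is available and uses simulated data; the variable names, the seed, and the split-half variance comparison are illustrative choices that do not come from the course materials. Note that with an intercept in the model, the least squares residuals have mean zero and are uncorrelated with the predictors by construction, so those two quantities confirm the fit rather than test the assumption.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Simulated (hypothetical) data: y = 1 + 2*x + error, with well-behaved errors.
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Fit y = a + b*x by ordinary least squares.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# Mean of residuals: zero by construction when the model has an intercept.
mean_resid = resid.mean()

# Correlation of residuals with the predictor: also zero by construction,
# so it cannot reveal a violation on its own.
corr_resid_x = np.corrcoef(resid, x)[0, 1]

# Serial independence: Durbin-Watson statistic, which is close to 2 when
# there is no first-order autocorrelation in the residuals.
durbin_watson = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Homogeneity of variance: ratio of residual variances in the upper and
# lower halves of x; a ratio near 1 is consistent with constant variance.
order = np.argsort(x)
lower, upper = order[: n // 2], order[n // 2:]
var_ratio = resid[upper].var(ddof=1) / resid[lower].var(ddof=1)

print(f"mean residual       = {mean_resid:.2e}")
print(f"corr(resid, x)      = {corr_resid_x:.2e}")
print(f"Durbin-Watson       = {durbin_watson:.3f}")
print(f"variance ratio      = {var_ratio:.3f}")
```

In practice these numbers are usually read alongside residual plots (residuals against fitted values, and a normal probability plot for the normality assumption).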
Violations of Assumptions
Regression analysis is generally robust to violations of assumptions, except for:
- Measurement Errors
- Specification Errors
Measurement Error
In the dependent variable:
- Does not bias estimates of the regression coefficient, b.
- Increases the standard error of estimate, thus weakening tests of significance.
In the independent variable:
- Leads to underestimation of the regression coefficient, b.
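The two effects of measurement error can be illustrated with a small simulation. This is a sketch under stated assumptions: the true slope of 2, the unit variances, the seed, and the helper names `slope` and `std_error_of_estimate` are all hypothetical choices made for illustration. With the error variance in x set equal to the variance of x, the slope would be expected to shrink to about half its true value.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only

n = 100_000
true_b = 2.0
x = rng.normal(0.0, 1.0, size=n)                # true predictor, variance 1
y = true_b * x + rng.normal(0.0, 1.0, size=n)   # true response


def slope(pred, resp):
    """OLS slope b for the simple regression of resp on pred."""
    return np.cov(pred, resp)[0, 1] / np.var(pred, ddof=1)


def std_error_of_estimate(pred, resp):
    """Square root of the residual mean square (n - 2 degrees of freedom)."""
    b = slope(pred, resp)
    a = resp.mean() - b * pred.mean()
    resid = resp - (a + b * pred)
    return np.sqrt(np.sum(resid ** 2) / (len(resp) - 2))


# Add independent measurement error to y only, then to x only.
y_noisy = y + rng.normal(0.0, 1.0, size=n)
x_noisy = x + rng.normal(0.0, 1.0, size=n)

b_clean = slope(x, y)
b_y_noisy = slope(x, y_noisy)   # error in the dependent variable
b_x_noisy = slope(x_noisy, y)   # error in the independent variable

see_clean = std_error_of_estimate(x, y)
see_y_noisy = std_error_of_estimate(x, y_noisy)

print(f"b, no measurement error : {b_clean:.3f}")
print(f"b, error in y           : {b_y_noisy:.3f}  (SEE {see_y_noisy:.3f} vs {see_clean:.3f})")
print(f"b, error in x           : {b_x_noisy:.3f}")
```

Error in y leaves the slope essentially unchanged while inflating the standard error of estimate; error in x pulls the slope toward zero.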
Specification Errors
Omission of relevant variables.
Inclusion of irrelevant variables.
Using linear regression when the relationship is not linear.
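Each of the three specification errors can be seen in a short simulation. The sketch below is illustrative only: the coefficients, the correlation between x1 and x2, the seed, and the helper `ols` are assumptions introduced here, not part of the course materials. In this setup, omitting a relevant variable that is correlated with an included one shifts the included coefficient, an irrelevant variable receives a coefficient near zero, and a straight line fit to a curved relationship misses the relationship almost entirely.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed, for reproducibility only
n = 50_000


def ols(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)   # relevant, and correlated with x1
x3 = rng.normal(size=n)              # irrelevant: plays no part in y
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

b_full = ols(y, x1, x2)        # correctly specified model
b_omit = ols(y, x1)            # omits the relevant variable x2
b_extra = ols(y, x1, x2, x3)   # includes the irrelevant variable x3

# A curved relationship fit with a straight line, then with a squared term.
x = rng.uniform(-3.0, 3.0, size=n)
y_curved = x ** 2 + rng.normal(size=n)
b_line = ols(y_curved, x)
b_quad = ols(y_curved, x, x ** 2)

print(f"b1, correct model        : {b_full[1]:.3f}")
print(f"b1, x2 omitted           : {b_omit[1]:.3f}")
print(f"b3, irrelevant variable  : {b_extra[3]:.3f}")
print(f"slope, line on curve     : {b_line[1]:.3f}")
print(f"x^2 coefficient          : {b_quad[2]:.3f}")
```

Here the omitted-variable slope absorbs the effect of x2 through its correlation with x1. The cost of the irrelevant variable is not bias in this setup but a lost degree of freedom and, when it is correlated with other predictors, larger standard errors.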
Linear Statistical Models Course
Phil Ender, 29dec99