Let's consider two variables from the **hsb2** dataset, **write** and **read**. In this
example we will consider **write** to be our response variable and **read** to be our
predictor variable, i.e., we want to be able to predict **write** from knowledge of **read**.
Here is a scatter plot of the two variables.
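The plot can be drawn in Stata; a minimal sketch, assuming the hsb2 dataset is still hosted at the UCLA URL shown:

```
* load the hsb2 dataset (assumed location of the UCLA-hosted copy)
use https://stats.idre.ucla.edu/stat/data/hsb2, clear

* scatter plot of write (response) against read (predictor)
scatter write read
```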

Consider the conditional distribution of **write** for the 27 students whose **read** score is 47. Here are summary statistics and a stem-and-leaf plot of their **write** scores.

```
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       write |        27    50.62963    9.245157         31         65

  3* | 1
  3. | 79
  4* | 01124
  4. | 6667
  5* | 22244
  5. | 77999
  6* | 1222
  6. | 5
```
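As a sketch, these conditional summaries could be obtained in Stata with an `if` qualifier (assuming the dataset loaded above):

```
* summary statistics of write for the 27 students with read == 47
summarize write if read == 47

* stem-and-leaf plot of the same conditional distribution
stem write if read == 47
```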

We can do this for each of the values of **read**.

```
    read |    N    mean write
---------+--------------------
      28 |    1      46.00
      31 |    1      36.00
      34 |    6      40.67
      35 |    1      35.00
      36 |    3      50.00
      37 |    2      40.50
      39 |    8      43.63
      41 |    2      53.00
      42 |   13      46.00
      43 |    2      55.50
      44 |   13      44.92
      45 |    2      56.00
      46 |    1      52.00
      47 |   27      50.63
      48 |    1      49.00
      50 |   18      49.17
      52 |   14      56.00
      53 |    1      61.00
      54 |    1      63.00
      55 |   13      54.77
      57 |   14      56.86
      60 |    9      56.44
      61 |    1      59.00
      63 |   16      57.00
      65 |    9      62.56
      66 |    1      67.00
      68 |   11      60.27
      71 |    2      65.00
      73 |    5      63.40
      76 |    2      57.50
---------+--------------------
   Total |  200      52.78
```

As you can see, some values of **read** have only one or two observations, while others have many more.
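A table of conditional means like this one might be produced with `tabstat`; a minimal sketch:

```
* count and mean of write for each observed value of read
tabstat write, by(read) statistics(n mean)
```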

In linear regression we try to explain the change in the conditional mean of the response variable as a linear function of the predictor plus random error, i.e., **Y = f(X) + e**. We can plot this linear function as a straight line over the scatter plot.
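One way to overlay the fitted line in Stata is with the `lfit` plot type; a sketch:

```
* scatter plot of write against read with the least squares line overlaid
twoway (scatter write read) (lfit write read)
```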

As was stated earlier, the function **f(X)** is a linear function that defines a straight line. Any straight line is completely determined by two parameters, the slope (**b1**) and the intercept (**b0**). For our example, **b0** = 23.96 and **b1** = 0.55, i.e., **predicted write = f(x) = 23.96 + 0.55*read**. For example, a student with **read** = 47 has a predicted **write** of 23.95944 + 0.5517051*47 = 49.89 (using the unrounded coefficients), close to the conditional mean of 50.63 we computed above. In other words, the regression line is the line
of all possible predicted values. Here is a table of all possible predicted scores, each of which
falls on the least squares regression line.

```
    read | predicted write
---------+-----------------
      28 |     39.4072
      31 |     41.0623
      34 |     42.7174
      35 |     43.2691
      36 |     43.8208
      37 |     44.3725
      39 |     45.4759
      41 |     46.5794
      42 |     47.1311
      43 |     47.6828
      44 |     48.2345
      45 |     48.7862
      46 |     49.3379
      47 |     49.8896
      48 |     50.4413
      50 |     51.5447
      52 |     52.6481
      53 |     53.1998
      54 |     53.7515
      55 |     54.3032
      57 |     55.4066
      60 |     57.0617
      61 |     57.6135
      63 |     58.7169
      65 |     59.8203
      66 |     60.3720
      68 |     61.4754
      71 |     63.1305
      73 |     64.2339
      76 |     65.8890
```

Here are the results of running a regression of **write** on **read**.

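This output comes from the `regress` command; the fitted values tabulated above can then be generated with `predict` (the variable name **pwrite** is hypothetical):

```
* regress write on read
regress write read

* store the predicted (fitted) values; each one falls on the regression line
predict pwrite, xb
```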
```
------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |   .5517051   .0527178    10.47   0.000     .4477445    .6556656
       _cons |   23.95944   2.805744     8.54   0.000     18.42647    29.49242
------------------------------------------------------------------------------
```

And here are the results of running a regression of the conditional means of **write** on **read**.

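A sketch of one way to fit this second model: collapse the data to one observation per value of **read** and regress the means on **read**, weighting by cell size (the variable names **n** and **mwrite** are hypothetical, and the exact standard errors depend on how the means are weighted):

```
preserve

* collapse to one row per value of read, keeping the cell count and mean write
collapse (count) n=write (mean) mwrite=write, by(read)

* regress the conditional means on read, weighted by the number of observations
regress mwrite read [aweight=n]

restore
```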
```
------------------------------------------------------------------------------
  mean write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |   .5517051   .0196244    28.11   0.000     .5130053    .5904048
       _cons |   23.95944    1.04445    22.94   0.000     21.89977    26.01912
------------------------------------------------------------------------------
```

Note that the regression slopes and intercepts are the same in both models but that the standard errors are different. The standard errors in the first model are the correct ones: the first model uses all 200 individual observations, while the second discards the variability of **write** around each conditional mean and so understates the uncertainty.

Phil Ender, 4Jan05