7.5 - Confidence Intervals for Regression Parameters

Before we can derive confidence intervals for \(\alpha\) and \(\beta\), we first need to derive the probability distributions of \(a\), \(b\), and \(\hat{\sigma}^2\). In the process of doing so, let's adopt the more traditional estimator notation, and the one our textbook follows, of putting a hat on Greek letters. That is, here we'll use:

\(a=\hat{\alpha}\) and \(b=\hat{\beta}\)

Theorem

Under the assumptions of the simple linear regression model:

\(\hat{\alpha}\sim N\left(\alpha,\dfrac{\sigma^2}{n}\right)\)

Proof

Recall that the ML (and least squares!) estimator of \(\alpha\) is:

\(a=\hat{\alpha}=\bar{Y}\)

where the responses \(Y_i\) are independent and normally distributed. More specifically:

\(Y_i \sim N(\alpha+\beta(x_i-\bar{x}),\sigma^2)\)

The expected value of \(\hat{\alpha}\) is \(\alpha\), as shown here:

\(E(\hat{\alpha})=E(\bar{Y})=\frac{1}{n}\sum E(Y_i)=\frac{1}{n}\sum \left(\alpha+\beta(x_i-\bar{x})\right)=\frac{1}{n}\left[n\alpha+\beta \sum (x_i-\bar{x})\right]=\frac{1}{n}(n\alpha)=\alpha\)

because \(\sum (x_i-\bar{x})=0\).

The variance of \(\hat{\alpha}\) follows directly from what we know about the variance of a sample mean, namely:

\(Var(\hat{\alpha})=Var(\bar{Y})=\dfrac{\sigma^2}{n}\)

Therefore, since a linear combination of normal random variables is also normally distributed, we have:

\(\hat{\alpha} \sim N\left(\alpha,\dfrac{\sigma^2}{n}\right)\)

as was to be proved!
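
Although it's not part of the proof, a quick simulation can make this result concrete. Here's a minimal Python sketch (the "true" parameter values, the predictor grid, and the use of NumPy are our own arbitrary choices, not from the text) that draws many samples from the model and checks that \(\hat{\alpha}=\bar{Y}\) has mean \(\alpha\) and variance \(\sigma^2/n\):

# A simulation sketch: generate Y_i ~ N(alpha + beta*(x_i - xbar), sigma^2)
# many times and compare the empirical mean and variance of alpha-hat = Ybar
# with the theoretical values alpha and sigma^2/n.
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, sigma = 5.0, 2.0, 3.0         # arbitrary "true" parameters
x = np.linspace(0, 10, 25)                 # fixed predictor values, n = 25
n = len(x)
mean_y = alpha + beta * (x - x.mean())     # E(Y_i) under the model

alpha_hats = np.array([rng.normal(mean_y, sigma).mean()   # alpha-hat = Ybar
                       for _ in range(100_000)])

print(alpha_hats.mean())   # close to alpha = 5.0
print(alpha_hats.var())    # close to sigma^2/n = 9/25 = 0.36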

Theorem

Under the assumptions of the simple linear regression model:

\(\hat{\beta}\sim N\left(\beta,\dfrac{\sigma^2}{\sum_{i=1}^n (x_i-\bar{x})^2}\right)\)

Proof

Recalling one of the shortcut formulas for the ML (and least squares!) estimator of \(\beta \colon\)

\(b=\hat{\beta}=\dfrac{\sum_{i=1}^n (x_i-\bar{x})Y_i}{\sum_{i=1}^n (x_i-\bar{x})^2}\)

we see that the ML estimator is a linear combination of independent normal random variables \(Y_i\) with:

\(Y_i \sim N(\alpha+\beta(x_i-\bar{x}),\sigma^2)\)

The expected value of \(\hat{\beta}\) is \(\beta\), as shown here:

\(E(\hat{\beta})=\frac{1}{\sum (x_i-\bar{x})^2}\sum E\left[(x_i-\bar{x})Y_i\right]=\frac{1}{\sum (x_i-\bar{x})^2}\sum (x_i-\bar{x})\left(\alpha +\beta(x_i-\bar{x})\right)=\frac{1}{\sum (x_i-\bar{x})^2}\left[ \alpha\sum (x_i-\bar{x}) +\beta \sum (x_i-\bar{x})^2 \right]=\beta\)

because \(\sum (x_i-\bar{x})=0\).

And, the variance of \(\hat{\beta}\) is:

\(\text{Var}(\hat{\beta})=\left[\frac{1}{\sum (x_i-\bar{x})^2}\right]^2\sum (x_i-\bar{x})^2(\text{Var}(Y_i))=\frac{\sigma^2}{\sum (x_i-\bar{x})^2}\)

Therefore, since a linear combination of normal random variables is also normally distributed, we have:

\(\hat{\beta}\sim N\left(\beta,\dfrac{\sigma^2}{\sum_{i=1}^n (x_i-\bar{x})^2}\right)\)

as was to be proved!
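
The same kind of simulation sketch (again with arbitrary parameter choices of our own, not from the text) confirms the sampling distribution of \(\hat{\beta}\):

# A simulation sketch: compute beta-hat for many simulated samples and
# compare its empirical mean and variance with beta and
# sigma^2 / sum((x_i - xbar)^2).
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, sigma = 5.0, 2.0, 3.0
x = np.linspace(0, 10, 25)
cx = x - x.mean()                          # centered predictors
sxx = np.sum(cx ** 2)                      # sum of (x_i - xbar)^2

beta_hats = np.array([np.sum(cx * rng.normal(alpha + beta * cx, sigma)) / sxx
                      for _ in range(100_000)])

print(beta_hats.mean())    # close to beta = 2.0
print(beta_hats.var())     # close to sigma^2/sxx
print(sigma**2 / sxx)      # theoretical variance, for comparison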

Theorem

Under the assumptions of the simple linear regression model:

\(\dfrac{n\hat{\sigma}^2}{\sigma^2}\sim \chi^2_{(n-2)}\)

and \(a=\hat{\alpha}\), \(b=\hat{\beta}\), and \(\hat{\sigma}^2\) are mutually independent.

Argument

First, note that the heading here says Argument, not Proof. That's because we are going to be doing some hand-waving and pointing to another reference, as the proof is beyond the scope of this course. That said, let's start our hand-waving. For homework, you are asked to show that:

\(\sum\limits_{i=1}^n (Y_i-\alpha-\beta(x_i-\bar{x}))^2=n(\hat{\alpha}-\alpha)^2+(\hat{\beta}-\beta)^2\sum\limits_{i=1}^n (x_i-\bar{x})^2+\sum\limits_{i=1}^n (Y_i-\hat{Y}_i)^2\)

Now, if we divide through both sides of the equation by the population variance \(\sigma^2\), we get:

\(\dfrac{\sum_{i=1}^n (Y_i-\alpha-\beta(x_i-\bar{x}))^2 }{\sigma^2}=\dfrac{n(\hat{\alpha}-\alpha)^2}{\sigma^2}+\dfrac{(\hat{\beta}-\beta)^2\sum_{i=1}^n (x_i-\bar{x})^2}{\sigma^2}+\dfrac{\sum_{i=1}^n (Y_i-\hat{Y}_i)^2}{\sigma^2}\)

Rewriting a few of those terms just a bit, we get:

\(\dfrac{\sum_{i=1}^n (Y_i-\alpha-\beta(x_i-\bar{x}))^2 }{\sigma^2}=\dfrac{(\hat{\alpha}-\alpha)^2}{\sigma^2/n}+\dfrac{(\hat{\beta}-\beta)^2}{\sigma^2/\sum\limits_{i=1}^n (x_i-\bar{x})^2}+\dfrac{n\hat{\sigma}^2}{\sigma^2}\)

Now the terms are written so that we can readily identify the distribution of each. The distributions are:

\(\displaystyle\underbrace{\frac{\sum\left(Y_{i}-\alpha-\beta\left(x_{i}-\bar{x}\right)\right)^{2}}{\sigma^2}}_{{\color{blue}\chi^2_{(n)}}}=\underbrace{\frac{(\hat{\alpha}-\alpha)^{2}}{\sigma^{2}/n}}_{{\color{blue}\chi^2_{(1)}}}+\underbrace{\frac{(\hat{\beta}-\beta)^{2}}{\sigma^{2}/\sum\left(x_{i}-\bar{x}\right)^{2}}}_{{\color{blue}\chi^2_{(1)}}}+\underbrace{\frac{n\hat{\sigma}^{2}}{\sigma^{2}}}_{{\color{red}\text{?}}}\)

Now, it might seem reasonable that the last term is a chi-square random variable with \(n-2\) degrees of freedom. That is .... hand-waving! ... indeed the case. That is:

\(\dfrac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(n-2)}\)

and furthermore (more hand-waving!), \(a=\hat{\alpha}\), \(b=\hat{\beta}\), and \(\hat{\sigma}^2\) are mutually independent. (For a proof, you can refer to any number of mathematical statistics textbooks, but for a proof presented by one of the authors of our textbook, see Hogg, McKean, and Craig, Introduction to Mathematical Statistics, 6th ed.)
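
While the proof is out of reach here, a simulation sketch (our own construction, with arbitrary parameter choices) at least lets us sanity-check the claim: a \(\chi^2_{(n-2)}\) random variable has mean \(n-2\) and variance \(2(n-2)\), and the simulated values of \(n\hat{\sigma}^2/\sigma^2\) should match.

# A simulation sketch: for each simulated sample, compute
# n*sigma-hat^2/sigma^2 = (sum of squared residuals)/sigma^2 and compare
# its empirical mean and variance with n-2 and 2(n-2).
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, sigma = 5.0, 2.0, 3.0
x = np.linspace(0, 10, 25)
n = len(x)
cx = x - x.mean()
sxx = np.sum(cx ** 2)

values = []
for _ in range(100_000):
    y = rng.normal(alpha + beta * cx, sigma)
    b = np.sum(cx * y) / sxx               # beta-hat
    resid = y - (y.mean() + b * cx)        # Y_i - Yhat_i
    values.append(np.sum(resid**2) / sigma**2)   # n*sigma-hat^2 / sigma^2

values = np.array(values)
print(values.mean())       # close to n - 2 = 23
print(values.var())        # close to 2(n - 2) = 46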

With the distributional results behind us, we can now derive \((1-\alpha)100\%\) confidence intervals for \(\alpha\) and \(\beta\)!

Theorem

Under the assumptions of the simple linear regression model, a \((1-\alpha)100\%\) confidence interval for the slope parameter \(\beta\) is:

\(b \pm t_{\alpha/2,n-2}\times \left(\dfrac{\sqrt{n}\hat{\sigma}}{\sqrt{n-2} \sqrt{\sum (x_i-\bar{x})^2}}\right)\)

or equivalently, since \(MSE=\dfrac{n\hat{\sigma}^2}{n-2}\):

\(\hat{\beta} \pm t_{\alpha/2,n-2}\times \sqrt{\dfrac{MSE}{\sum (x_i-\bar{x})^2}}\)

Proof 

Recall the definition of a \(T\) random variable. That is, if:

  1. \(Z\) is a standard normal ( \(N(0,1)\)) random variable
  2. \(U\) is a chi-square random variable with \(r\) degrees of freedom
  3. \(Z\) and \(U\) are independent, then:

\(T=\dfrac{Z}{\sqrt{U/r}}\)

follows a \(T\) distribution with \(r\) degrees of freedom. Now, our work above tells us that:

\(\dfrac{\hat{\beta}-\beta}{\sigma/\sqrt{\sum (x_i-\bar{x})^2}} \sim N(0,1) \) and \(\dfrac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(n-2)}\) are independent

Therefore, we have that:

\(T=\dfrac{\dfrac{\hat{\beta}-\beta}{\sigma/\sqrt{\sum (x_i-\bar{x})^2}}}{\sqrt{\dfrac{n\hat{\sigma}^2}{\sigma^2}/(n-2)}}=\dfrac{\hat{\beta}-\beta}{\sqrt{\dfrac{n\hat{\sigma}^2}{n-2}/\sum (x_i-\bar{x})^2}}=\dfrac{\hat{\beta}-\beta}{\sqrt{MSE/\sum (x_i-\bar{x})^2}} \sim t_{n-2}\)

follows a \(T\) distribution with \(n-2\) degrees of freedom. Now, deriving a confidence interval for \(\beta\) reduces to the usual manipulation of the inside of a probability statement:

\(P\left(-t_{\alpha/2,n-2} \leq \dfrac{\hat{\beta}-\beta}{\sqrt{MSE/\sum (x_i-\bar{x})^2}} \leq t_{\alpha/2,n-2}\right)=1-\alpha\)

leaving us with:

\(\hat{\beta} \pm t_{\alpha/2,n-2}\times \sqrt{\dfrac{MSE}{\sum (x_i-\bar{x})^2}}\)

as was to be proved!
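
Translating the interval into code is straightforward. Here's a minimal Python sketch (the function name beta_confidence_interval and the use of NumPy/SciPy are our own choices, not from the text):

# A sketch implementing beta-hat ± t(alpha/2, n-2) * sqrt(MSE / Sxx).
import numpy as np
from scipy import stats

def beta_confidence_interval(x, y, level=0.95):
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    cx = x - x.mean()
    sxx = np.sum(cx ** 2)                       # sum of (x_i - xbar)^2
    b = np.sum(cx * y) / sxx                    # beta-hat
    resid = y - (y.mean() + b * cx)             # Y_i - Yhat_i
    mse = np.sum(resid ** 2) / (n - 2)          # MSE = n*sigma-hat^2/(n-2)
    half = stats.t.ppf(1 - (1 - level) / 2, n - 2) * np.sqrt(mse / sxx)
    return b - half, b + half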

Now, for the confidence interval for the intercept parameter \(\alpha\).

Theorem

Under the assumptions of the simple linear regression model, a \((1-\alpha)100\%\) confidence interval for the intercept parameter \(\alpha\) is:

\(a \pm t_{\alpha/2,n-2}\times \left(\sqrt{\dfrac{\hat{\sigma}^2}{n-2}}\right)\)

or equivalently, since \(MSE=\dfrac{n\hat{\sigma}^2}{n-2}\) implies \(\dfrac{\hat{\sigma}^2}{n-2}=\dfrac{MSE}{n}\):

\(a \pm t_{\alpha/2,n-2}\times \left(\sqrt{\dfrac{MSE}{n}}\right)\)

Proof

The proof, which again may or may not appear on a future assessment, is left for you as homework.
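
In the meantime, the interval itself is easy to compute. Here's a matching Python sketch (again, the function name is our own) in the same style as the one for the slope:

# A sketch implementing a ± t(alpha/2, n-2) * sqrt(MSE / n).
import numpy as np
from scipy import stats

def alpha_confidence_interval(x, y, level=0.95):
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    cx = x - x.mean()
    b = np.sum(cx * y) / np.sum(cx ** 2)        # beta-hat
    a = y.mean()                                # alpha-hat (mean-centered model)
    mse = np.sum((y - (a + b * cx)) ** 2) / (n - 2)
    half = stats.t.ppf(1 - (1 - level) / 2, n - 2) * np.sqrt(mse / n)
    return a - half, a + half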

Example 7-3


The following table shows \(x\), the catches of Peruvian anchovies (in millions of metric tons) and \(y\), the prices of fish meal (in current dollars per ton) for 14 consecutive years. (Data from Bardach, JE and Santerre, RM, Climate and the Fish in the Sea, Bioscience 31(3), 1981).

Row  Price  Catch
 1     190   7.23
 2     160   8.53
 3     134   9.82
 4     129  10.26
 5     172   8.96
 6     197  12.27
 7     167  10.28
 8     239   4.45
 9     542   1.87
10     372   4.00
11     245   3.30
12     376   4.30
13     454   0.80
14     410   0.50

Find a 95% confidence interval for the slope parameter \(\beta\).

Answer

The following portion of output was obtained using Minitab's regression analysis package, with the parts useful to us here marked with arrows:

The regression equation is
Price = 452 - 29.4 Catch

Predictor      Coef  SE Coef      T      P
Constant     452.12    36.82  12.28  0.000
Catch       -29.402    5.091  -5.78  0.000

\(\color{blue}\hat{\beta}\uparrow\) (the Coef entry for Catch)

S = 71.6866   R-Sq = 73.5%   R-Sq(adj) = 71.3%

Analysis of Variance

Source           DF      SS      MS      F      P
Regression        1  171414  171414  33.36  0.000
Residual Error   12   61668    5139
Total            13  233081

\(\color{blue}MSE\uparrow\) (the MS entry for Residual Error)

Minitab's basic descriptive analysis can also calculate the standard deviation of the \(x\)-values, 3.91, for us. Therefore, the formula for the sample variance tells us that:

\(\sum\limits_{i=1}^n (x_i-\bar{x})^2=(n-1)s^2=(13)(3.91)^2=198.7453\)

Putting the parts together, along with the fact that \(t_{0.025,12}=2.179\), we get:

\(-29.402 \pm 2.179 \sqrt{\dfrac{5139}{198.7453}}\)

which simplifies to:

\(-29.402 \pm 11.08\)

That is, we can be 95% confident that the slope parameter falls between −40.482 and −18.322. In other words, we can be 95% confident that the average price of fish meal decreases by between 18.322 and 40.482 dollars per ton for each one-unit (one million metric ton) increase in the Peruvian anchovy catch.

Find a 95% confidence interval for the intercept parameter \(\alpha\).

Answer

We can use Minitab (or our calculator) to determine that the mean of the 14 responses is:

\(\dfrac{190+160+\cdots +410}{14}=270.5\)

Using that, as well as the MSE = 5139 obtained from the output above, along with the fact that \(t_{0.025,12} = 2.179\), we get:

\(270.5 \pm 2.179 \sqrt{\dfrac{5139}{14}}\)

which simplifies to:

\(270.5 \pm 41.75\)

That is, we can be 95% confident that the intercept parameter falls between 228.75 and 312.25 dollars per ton.
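
As a check on the hand computations, here's a Python sketch that reproduces both intervals directly from the raw data above. (The printed data are rounded, so the results differ slightly from the Minitab figures.)

# Reproduce Example 7-3's two confidence intervals from the raw data.
import numpy as np
from scipy import stats

catch = np.array([7.23, 8.53, 9.82, 10.26, 8.96, 12.27, 10.28,
                  4.45, 1.87, 4.00, 3.30, 4.30, 0.80, 0.50])
price = np.array([190., 160., 134., 129., 172., 197., 167.,
                  239., 542., 372., 245., 376., 454., 410.])

n = len(catch)
cx = catch - catch.mean()
sxx = np.sum(cx ** 2)                          # sum of (x_i - xbar)^2
b = np.sum(cx * price) / sxx                   # beta-hat, about -29.4
a = price.mean()                               # alpha-hat = 270.5
mse = np.sum((price - (a + b * cx)) ** 2) / (n - 2)   # MSE, near Minitab's 5139
t = stats.t.ppf(0.975, n - 2)                  # t(0.025, 12), about 2.179

print(b - t * np.sqrt(mse / sxx), b + t * np.sqrt(mse / sxx))  # slope CI
print(a - t * np.sqrt(mse / n),   a + t * np.sqrt(mse / n))    # intercept CI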