7.1 - Confidence Interval for the Mean Response

In this section, we are concerned with the confidence interval, called a "t-interval," for the mean response \(\mu_{Y}\) when the predictor values are \(\textbf{X}_{h}=(1, X_{h,1}, X_{h,2}, \dots, X_{h,p-1})^\textrm{T}\). As always, the general formula in words is:

Sample estimate ± (t-multiplier × standard error)

First, we define the standard error of the fit at \(\textbf{X}_{h}\):

\(\textrm{se}(\hat{y}_{h})=\sqrt{\textrm{MSE} \times \textbf{X}_{h}^{\textrm{T}}(\textbf{X}^{\textrm{T}}\textbf{X})^{-1}\textbf{X}_{h}}\)

Then the confidence interval is:

\(\hat{y}_h \pm t_{(1-\alpha/2, n-p)} \times \textrm{se}(\hat{y}_{h})\)

where:

\(\hat{y}_h\) is the "fitted value" or "predicted value" of the response when the predictor values are \(\textbf{X}_{h}\).
\(t_{(1-\alpha/2, n-p)}\) is the "t-multiplier." Note that the t-multiplier has \(n-p\) degrees of freedom because the confidence interval uses the mean square error (MSE), whose denominator is \(n-p\).
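
To make the two formulas concrete, here is a minimal Python sketch that evaluates them directly. It is an illustration only, not how Minitab computes its output: the function names `solve` and `mean_response_ci` and the small data set are made up for this example, and the t-multiplier is passed in by the caller (for instance from a t-table) rather than computed.

```python
import math

def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

def mean_response_ci(X, y, x_h, t_mult):
    """Return (fit, se_fit, lower, upper) for the mean response at x_h.

    X is the n-by-p design matrix (first column all ones), y the responses,
    x_h the vector (1, X_h1, ..., X_h,p-1), and t_mult the t-multiplier
    t_(1-alpha/2, n-p), which the caller must supply.
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    b = solve(xtx, xty)                                  # least-squares coefficients
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - p)
    fit = sum(bj * xj for bj, xj in zip(b, x_h))         # y-hat at x_h
    z = solve(xtx, x_h)                                  # (X^T X)^{-1} X_h
    leverage = sum(xj * zj for xj, zj in zip(x_h, z))    # X_h^T (X^T X)^{-1} X_h
    se_fit = math.sqrt(mse * leverage)
    return fit, se_fit, fit - t_mult * se_fit, fit + t_mult * se_fit

# Made-up data: n = 6 observations and two predictors, so p = 3.
X = [[1, 1, 2], [1, 2, 1], [1, 3, 4], [1, 4, 3], [1, 5, 6], [1, 6, 5]]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]
# 3.182 is the tabulated t-multiplier for 95% confidence with n - p = 3 df.
fit, se_fit, lower, upper = mean_response_ci(X, y, [1, 3.5, 3.5], t_mult=3.182)
print(fit, se_fit, lower, upper)
```

Here \(\textbf{X}_{h}\) was chosen to sit at the sample means of both made-up predictors, so the fitted value equals the sample mean of the responses and the interval is symmetric about it.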

Fortunately, we won't have to use the formula to calculate the confidence interval, since statistical software such as Minitab will do the dirty work for us. Here is some Minitab output for the example on IQ and physical characteristics from Lesson 5 (IQ Size data), where we've fit a model with PIQ as the response and Brain and Height as the predictors:

Settings

New Obs	Brain	Height
1	90.0	70.0

Prediction

New Obs	Fit	SE Fit	95% CI	95% PI
1	105.64	3.65	(98.24, 113.04)	(65.35, 145.93)

Here's what the output tells us:

In the section labeled "Settings," Minitab reports the values for \(\textbf{X}_{h}\) (brain size = 90 and height = 70) for which we requested the confidence interval for \(\mu_{Y}\).
In the section labeled "Prediction," Minitab reports a 95% confidence interval. We can be 95% confident that the average performance IQ score of all college students with brain size = 90 and height = 70 is between 98.24 and 113.04.
In the section labeled "Prediction," Minitab also reports the predicted value \(\hat{y}_h\) ("Fit" = 105.64), the standard error of the fit ("SE Fit" = 3.65), and the 95% prediction interval for a new response (which we discuss in the next section).
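
As a quick consistency check, the reported interval can be compared against the general formula using only the numbers printed in the output. The short Python sketch below (variable names are illustrative) backs out the half-width and the t-multiplier that the output implies, rather than assuming a sample size.

```python
fit, se_fit = 105.64, 3.65            # "Fit" and "SE Fit" from the output
ci_lower, ci_upper = 98.24, 113.04    # reported 95% CI

half_width = (ci_upper - ci_lower) / 2    # equals t-multiplier x SE Fit
implied_t = half_width / se_fit           # t-multiplier implied by the output
midpoint = (ci_lower + ci_upper) / 2      # should reproduce the fit

print(round(half_width, 2), round(implied_t, 2), round(midpoint, 2))
```

The midpoint reproduces the fitted value, and the implied multiplier comes out near 2.03, slightly above the normal-based 1.96, as one would expect from a t-distribution with a moderate number of degrees of freedom.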

Factors affecting the width of the t-interval for the mean response \(\left(\mu_{Y}\right)\)

As always, the formula is useful for investigating what factors affect the width of the confidence interval for \(\mu_{Y}\).

As the mean square error (MSE) decreases, the width of the interval decreases. Since MSE is an estimate of how much the data vary naturally around the unknown population regression hyperplane, we have little control over MSE other than making sure that we make our measurements as carefully as possible.
As we decrease the confidence level, the t-multiplier decreases, and hence the width of the interval decreases. In practice, we wouldn't want to set the confidence level below 90%.
As we increase the sample size n, the width of the interval decreases. We have complete control over the size of our sample — the only limitation being our time and financial constraints.
The closer \(\textbf{X}_{h}\) is to the average of the sample's predictor values, the narrower the interval.
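
The last two factors are easiest to see in the special case of a single predictor. There, the quantity \(\textbf{X}_{h}^{\textrm{T}}(\textbf{X}^{\textrm{T}}\textbf{X})^{-1}\textbf{X}_{h}\) reduces to \(1/n + (x_h-\bar{x})^2/\sum(x_i-\bar{x})^2\), a standard identity that this section does not derive. The Python sketch below evaluates that expression on made-up predictor values; the function name `leverage_term` is hypothetical.

```python
def leverage_term(x, x_h):
    """X_h^T (X^T X)^{-1} X_h for a model with an intercept and one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return 1 / n + (x_h - x_bar) ** 2 / sxx

x_small = [1, 2, 3, 4, 5]      # made-up predictor values, mean 3
x_large = x_small * 4          # same values repeated: four times the sample size

at_mean = leverage_term(x_small, 3)          # x_h at the sample mean
at_mean_large = leverage_term(x_large, 3)    # same x_h, larger n
far = leverage_term(x_small, 5)              # x_h far from the sample mean

print(at_mean, at_mean_large, far)
```

For a fixed MSE and t-multiplier, the interval width is proportional to the square root of this term. The term shrinks when the sample grows and is smallest when \(x_h\) equals the sample mean, growing as \(x_h\) moves away from it.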

Let's see this last claim in action for our IQ example:

Settings

New Obs	Brain	Height
1	79.0	62.0

Prediction

New Obs	Fit	SE Fit	95% CI	95% PI
1	104.81	6.60	(91.42, 118.20)	(63.00, 146.62)

The first confidence interval we calculated (width 113.04 - 98.24 = 14.80) is narrower than this new interval (width 118.20 - 91.42 = 26.78) because the predictor values 90 and 70 are much closer to the sample means (90.7 and 68.4) than 79 and 62 are.
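
Because both intervals come from the same fitted model and use the same confidence level, they share the same t-multiplier, so the difference in width must come entirely from the standard error of the fit. A small Python check, using only the numbers reported in the two outputs (variable names are illustrative):

```python
# Widths of the two reported 95% confidence intervals.
width_near = 113.04 - 98.24     # Brain = 90, Height = 70
width_far = 118.20 - 91.42      # Brain = 79, Height = 62

# "SE Fit" values from the two outputs.
se_near, se_far = 3.65, 6.60

width_ratio = width_far / width_near
se_ratio = se_far / se_near

print(round(width_ratio, 2), round(se_ratio, 2))
```

Both ratios are about 1.81, which agrees, up to rounding in the output, with the width being the t-multiplier times twice the standard error of the fit.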

When is it okay to use the formula for the confidence interval for \(\mu_{Y}\)?

When \(\textbf{X}_{h}\) is within the "scope of the model." Note, however, that \(\textbf{X}_{h}\) does not have to be an actual observation in the data set.
When the "LINE" conditions — linearity, independent errors, normal errors, equal error variances — are met. The formula works okay even if the error terms are only approximately normal. And, if you have a large sample, the error terms can even deviate substantially from normality.