12.2 - Two Variances

Let's now recall the theory necessary for developing a hypothesis test for the equality of two population variances. Suppose \(X_1 , X_2 , \dots, X_n\) is a random sample of size \(n\) from a normal population with mean \(\mu_X\) and variance \(\sigma^2_X\). And suppose, independent of the first sample, \(Y_1 , Y_2 , \dots, Y_m\) is another random sample of size \(m\) from a normal population with mean \(\mu_Y\) and variance \(\sigma^2_Y\). Recall then, in this situation, that:

\(\dfrac{(n-1)S^2_X}{\sigma^2_X} \text{ and } \dfrac{(m-1)S^2_Y}{\sigma^2_Y}\)

have independent chi-square distributions with n−1 and m−1 degrees of freedom, respectively. Therefore:

\(F=\dfrac{\left[\dfrac{(n-1)S^2_X}{\sigma^2_X}\Big/(n-1)\right]}{\left[\dfrac{(m-1)S^2_Y}{\sigma^2_Y}\Big/(m-1)\right]}=\dfrac{S^2_X}{S^2_Y}\cdot \dfrac{\sigma^2_Y}{\sigma^2_X}\)

follows an F distribution with n−1 numerator degrees of freedom and m−1 denominator degrees of freedom. Therefore, if we're interested in testing the null hypothesis:

\(H_0 \colon \sigma^2_X=\sigma^2_Y\) (or equivalently \(H_0 \colon\dfrac{\sigma^2_Y}{\sigma^2_X}=1\))

against any of the alternative hypotheses:

\(H_A \colon \sigma^2_X \neq \sigma^2_Y,\quad H_A \colon \sigma^2_X >\sigma^2_Y,\text{ or }H_A \colon \sigma^2_X <\sigma^2_Y\)

we can use the test statistic:

\(F=\dfrac{S^2_X}{S^2_Y}\)

and follow the standard hypothesis testing procedures. When doing so, we might also want to recall this important fact about the F-distribution:

\(F_{1-(\alpha/2)}(n-1,m-1)=\dfrac{1}{F_{\alpha/2}(m-1,n-1)}\)

so that when we use the critical value approach for a two-sided alternative:

\(H_A \colon\sigma^2_X \neq \sigma^2_Y\)

we reject if the test statistic F is too large:

\(F \geq F_{\alpha/2}(n-1,m-1)\)

or if the test statistic F is too small:

\(F \leq F_{1-(\alpha/2)}(n-1,m-1)=\dfrac{1}{F_{\alpha/2}(m-1,n-1)}\)
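
To make the two-sided critical value approach concrete, here is a minimal sketch in Python using SciPy. The function name two_variance_f_test and its arguments are illustrative, not part of the lesson. Note that SciPy's f.ppf takes a left-tail probability, so the upper-tail notation \(F_{\alpha/2}(n-1,m-1)\) used above corresponds to f.ppf(1 - alpha/2, n - 1, m - 1).

from scipy import stats

def two_variance_f_test(s2_x, s2_y, n, m, alpha=0.05):
    """Two-sided F-test of H0: sigma_X^2 = sigma_Y^2 (critical value approach)."""
    f_stat = s2_x / s2_y                              # F = S_X^2 / S_Y^2
    lower = stats.f.ppf(alpha / 2, n - 1, m - 1)      # F_{1-(alpha/2)}(n-1, m-1)
    upper = stats.f.ppf(1 - alpha / 2, n - 1, m - 1)  # F_{alpha/2}(n-1, m-1)
    reject = f_stat <= lower or f_stat >= upper
    return f_stat, lower, upper, reject

# The reciprocal relationship quoted above can be checked numerically, e.g. with
# n = 34 and m = 29: stats.f.ppf(0.025, 33, 28) equals 1 / stats.f.ppf(0.975, 28, 33).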

Okay, let's take a look at an example. In the last lesson, we performed a two-sample t-test (as well as Welch's test) to test whether the mean fastest speed driven by the population of male college students differs from the mean fastest speed driven by the population of female college students. When we performed the two-sample t-test, we just assumed the population variances were equal. Let's revisit that example again to see if our assumption of equal variances is valid.

Example 12-2

A psychologist was interested in exploring whether or not male and female college students have different driving behaviors. The particular statistical question she framed was as follows:

Is the mean fastest speed driven by male college students different than the mean fastest speed driven by female college students?

The psychologist conducted a survey of a random sample of \(n = 34\) male college students and an independent random sample of \(m = 29\) female college students. Here is a descriptive summary of the results of her survey:

Males (X): \(n = 34\), \(\bar{x} = 105.5\), \(s_x = 20.1\)

Females (Y): \(m = 29\), \(\bar{y} = 90.9\), \(s_y = 12.2\)

Is there sufficient evidence at the \(\alpha = 0.05\) level to conclude that the variance of the fastest speed driven by male college students differs from the variance of the fastest speed driven by female college students?

Answer

We're interested in testing the null hypothesis:

\(H_0 \colon \sigma^2_X=\sigma^2_Y\)

against the alternative hypothesis:

\(H_A \colon\sigma^2_X \neq \sigma^2_Y\)

The value of the test statistic is:

\(F=\dfrac{12.2^2}{20.1^2}=0.368\)

(Note that I intentionally put the variance of what we're calling the Y sample in the numerator and the variance of what we're calling the X sample in the denominator. I did this only so that my results match the Minitab output we'll obtain on the next page. In doing so, we just need to keep track of the correct degrees of freedom: \(m-1 = 28\) for the numerator and \(n-1 = 33\) for the denominator.) Using the critical value approach, we divide the significance level \(\alpha = 0.05\) in half, putting 0.025 in the left tail and 0.025 in the right tail. Doing so, we find that the lower cutoff value is 0.478 and the upper cutoff value is 2.0441.

Because the test statistic falls in the rejection region, that is, because \(F = 0.368 \leq 0.478\), we reject the null hypothesis in favor of the alternative hypothesis. There is sufficient evidence at the \(\alpha = 0.05\) level to conclude that the population variances are not equal. Therefore, the assumption of equal variances that we made when performing the two-sample t-test on these data in the previous lesson does not appear to be valid. It would behoove us to use Welch's t-test instead.
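
As a check on the arithmetic above, here is a minimal Python/SciPy sketch that reproduces the test statistic and the two cutoff values for this example (the variable names are illustrative):

from scipy import stats

alpha = 0.05
n, m = 34, 29               # males (X) and females (Y)
s_x, s_y = 20.1, 12.2

# Match the text and Minitab: Y sample variance in the numerator, X in the
# denominator, so the degrees of freedom are m-1 = 28 and n-1 = 33.
f_stat = s_y**2 / s_x**2                          # 0.368
lower = stats.f.ppf(alpha / 2, m - 1, n - 1)      # approximately 0.478
upper = stats.f.ppf(1 - alpha / 2, m - 1, n - 1)  # approximately 2.0441
print(f_stat <= lower)                            # True, so reject H0

# Two-sided p-value (not used in the text, but an equivalent way to decide):
p_value = 2 * min(stats.f.cdf(f_stat, m - 1, n - 1),
                  stats.f.sf(f_stat, m - 1, n - 1))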

