7.1.7 - Question 1: The Univariate Case

Example 7-7: Spouse Data (Question 1)

Question 1: Do the husbands respond to the questions in the same way as their wives?

Before considering the multivariate case, let's review the univariate approach to answering this question. In this case, we will compare the responses to a single question.

Univariate Paired t-test Case: Consider comparing the responses to a single question.

The notation is as follows:

  • \(X _ { 1 i }\) = response of husband i - the first member of pair i
  • \(X _ { 2 i }\) = response of wife i - the second member of pair i
  • \(\mu _ { 1 }\) = population mean for the husbands - the first population
  • \(\mu _ { 2 }\) = population mean for the wives - the second population
Note! It is completely arbitrary which population is considered the first population and which is considered the second population. It is just necessary to keep track of which way they were labeled so that we are consistent with our choice.

Our objective here is to test the null hypothesis that the population means are equal against the alternative hypothesis that the means are not equal:

\(H_0\colon \mu_1 =\mu_2 \) against \(H_a\colon \mu_1 \ne \mu_2\)

In the univariate course, you learned that the null hypothesis is tested as follows. First, we define \(Y _ { i }\) to be the difference in responses for the \(i^{th}\) pair of observations. In this case, this is the response of husband i minus the response of wife i. Likewise, we define \(\mu _ { Y }\) to be the population mean of these differences, which is the same as the difference between the population means \(\mu _ { 1 }\) and \(\mu _ { 2 }\):

\(Y_i = X_{1i}-X_{2i}\) and \(\mu_Y = \mu_1-\mu_2\)
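
The same identity holds for sample means, which gives a quick sanity check on the differencing step. The following is a minimal sketch in Python; the lists `husband` and `wife` are made-up values used only for illustration and are not the spouse data.

```python
from statistics import mean

# Hypothetical responses for five couples (illustration only, not the spouse data).
husband = [2, 5, 4, 4, 3]   # X_{1i}
wife = [3, 5, 4, 5, 5]      # X_{2i}

# Y_i = X_{1i} - X_{2i}: one difference per pair, husband minus wife.
diffs = [x1 - x2 for x1, x2 in zip(husband, wife)]

# The mean of the differences equals the difference of the means.
mean_of_diffs = mean(diffs)
diff_of_means = mean(husband) - mean(wife)

print(diffs)
print(mean_of_diffs, diff_of_means)
```

The two printed means agree up to floating-point rounding, so working with the single sample of differences loses nothing about the comparison of means.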

Testing the null hypothesis that the population means are equal is equivalent to testing the null hypothesis that \(\mu _ { Y }\) is equal to 0 against the general alternative that \(\mu _ { Y }\) is not equal to 0.

\(H_0\colon \mu_Y =0 \) against \(H_a\colon \mu_Y \ne 0\)

This hypothesis is tested using the paired t-test.

We will define \(\bar{y}\) to be the sample mean of the \(Y _ { i }\)'s:

\(\bar{y} = \dfrac{1}{n}\sum_{i=1}^{n}Y_i\)

We will also define \(s^2_Y\) to be the sample variance of the \(Y _ { i }\)'s:

\(s^2_Y = \dfrac{\sum_{i=1}^{n}Y^2_i - (\sum_{i=1}^{n}Y_i)^2/n}{n-1}\)
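
As a check on the computational formula above, the sketch below evaluates it directly and compares it with the definitional sample variance from Python's standard library. The function name `sample_variance_shortcut` and the values in `diffs` are assumptions made for illustration.

```python
from statistics import variance

def sample_variance_shortcut(y):
    """Sample variance via (sum of Y_i^2 - (sum of Y_i)^2 / n) / (n - 1)."""
    n = len(y)
    if n < 2:
        raise ValueError("at least two differences are required")
    sum_y = sum(y)
    sum_y_sq = sum(v * v for v in y)
    return (sum_y_sq - sum_y ** 2 / n) / (n - 1)

# Hypothetical differences Y_i (illustration only).
diffs = [-1, 0, 0, -1, -2]

y_bar = sum(diffs) / len(diffs)               # sample mean of the differences
s2_shortcut = sample_variance_shortcut(diffs)
s2_definition = variance(diffs)               # sum of (Y_i - y_bar)^2 over (n - 1)

print(y_bar, s2_shortcut, s2_definition)
```

Both routes give the same value up to rounding. The shortcut form is convenient for hand calculation, although with large values it can lose precision in floating-point arithmetic.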

We will make the usual four assumptions in doing this:

  1. The \(Y _ { i }\)'s have common mean \(\mu _ { Y }\).
  2. Homoskedasticity. The \(Y _ { i }\)'s have common variance \(\sigma^2_Y\).
  3. Independence. The \(Y _ { i }\)'s are independently sampled.
  4. Normality. The \(Y _ { i }\)'s are normally distributed.

The test statistic is a t-statistic which is, in this case, equal to the sample mean of the differences divided by its standard error:

\[t = \frac{\bar{y}}{\sqrt{s^2_Y/n}} \sim t_{n-1}\]

Under the null hypothesis \(H_0\), this test statistic is t-distributed with n-1 degrees of freedom. We reject \(H_0\) at level \(\alpha\) if the absolute value of the t-statistic exceeds the critical value from the t-distribution with n-1 degrees of freedom evaluated at \(\alpha/2\):

\(|t| > t_{n-1, \alpha/2}\)
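
The whole procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names `paired_t_statistic` and `reject_null` are hypothetical, the data are made up, and the critical value is supplied by hand from a standard t table (for 4 degrees of freedom and \(\alpha = 0.05\), \(t_{4, 0.025} = 2.776\)).

```python
import math

def paired_t_statistic(x1, x2):
    """Return (t, degrees of freedom) for the paired t-test of H0: mu_Y = 0."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    n = len(x1)
    if n < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - b for a, b in zip(x1, x2)]                 # Y_i = X_{1i} - X_{2i}
    y_bar = sum(diffs) / n                                  # sample mean of the Y_i
    s2_y = sum((d - y_bar) ** 2 for d in diffs) / (n - 1)   # sample variance of the Y_i
    if s2_y == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = y_bar / math.sqrt(s2_y / n)                         # mean over its standard error
    return t, n - 1

def reject_null(t, critical_value):
    """Two-sided decision rule: reject H0 when |t| exceeds t_{n-1, alpha/2}."""
    return abs(t) > critical_value

# Hypothetical responses for five couples (illustration only, not the spouse data).
husband = [2, 5, 4, 4, 3]
wife = [3, 5, 4, 5, 5]

t_stat, df = paired_t_statistic(husband, wife)

# Critical value t_{4, 0.025} = 2.776, taken from a t table (assumed alpha = 0.05).
decision = reject_null(t_stat, 2.776)

print(round(t_stat, 3), df, decision)
```

For these made-up values the absolute t-statistic falls below the critical value, so the null hypothesis would not be rejected at the 0.05 level. Passing the critical value in as an argument keeps the sketch free of third-party dependencies; a statistics library could supply the quantile instead.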