8.3 - Paired Means

8.3 - Paired Means

In Lesson 1 we learned about independent samples and paired samples. When we have two independent samples, the observations in the two groups are unrelated to one another and are not matched in any meaningful way. We'll learn how to compare the means of two independent groups in Lesson 9.

With paired samples, the observations in the two groups are matched in a meaningful way. These are also known as dependent samples. Most often this occurs when data are collected twice from the same participants, called repeated measures. For example, think of studying the effectiveness of a diet plan. You would weigh each participant prior to starting the diet and again following some time on the diet. Depending on how much weight they lost you would determine if the diet was effective. Paired data does not always need to involve two measurements on the same subject; it can also involve taking one measurement on each of two related subjects. For example, we may study husband-wife pairs, mother-son pairs, or pairs of twins.

In constructing a dependent samples confidence interval or conducting a dependent samples hypothesis test, the difference score is computed for each individual or pair. From there, the procedures are the same that you used for constructing confidence intervals and hypothesis tests for single sample means. As with one sample mean, if the sample size is at least 30, the sampling distribution for the difference in paired means can be approximated using a \(t\) distribution. 

In terms of symbols, the population parameter of interest is the mean difference in the population "\(\mu_d\)." This is estimated using the mean difference in the sample "\(\overline x_d\)."


8.3.1 - Confidence Intervals

8.3.1 - Confidence Intervals

Recall the general form of a confidence interval...

 \(sample\ statistic\pm\underbrace{(multiplier)\ (standard\ error)}_{\textbf{margin of error}}\).

The formula for constructing a confidence interval for the difference in paired means is almost identical to the formula for constructing a confidence interval for one mean. Note that the only change is the subscript d which stands for difference.

Confidence Interval for the Difference Between Two Paired Means

\(\underbrace{\overline{x}_d}_{\text{sample statistic}} \pm \overbrace{t^*}^{\text{multiplier}} \underbrace{\left(\dfrac{s_d}{\sqrt{n}}\right)}_{\text{standard error}}\)

\(t^*\) is the multiplier with \(df = n-1\)

 


8.3.1.1. - Example: Change in Knowledge

8.3.1.1. - Example: Change in Knowledge

An educational research study is designed so that participants complete a measure of demonstrated knowledge twice. The researcher wants to estimate the change in scores from the first to second administrations (i.e., pre- and post-test). Data are paired by participant. The researcher subtracted pre-test scores from the post test scores and found a mean increase of 6.560 with a standard deviation of 3.867 for \(n=100\). She wants to construct a 95% confidence interval for the mean difference.

First, we'll find the appropriate multiplier.

\(df=n-1=100-1=99\)

t Distribution showing the multipliers for a 95% confidence interval given 99 degrees of freedom

For a 95% confidence interval: \(t_{df=99}=1.984\)

\(6.560 \pm 1.984 \left(\frac{3.867}{\sqrt{100}}\right)=6.560 \pm 0.767=[5.793, 7.327]\)

We are 95% confident that the difference between post- and pre- test scores is between 5.793 and 7.327.

Data from Zimmerman, W. A. (2015). Impact of Instructional Materials Eliciting Low and High Cognitive Load on Self-Efficacy and Demonstrated Knowledge (Unpublished doctoral dissertation). The Pennsylvania State University, University Park, PA.


8.3.1.2 - Video Example: Difference in Exam Scores

8.3.1.2 - Video Example: Difference in Exam Scores

8.3.2 - Hypothesis Testing

8.3.2 - Hypothesis Testing

Below are the procedures for conducting a hypothesis test for two paired means. This is often referred to as a "paired means \(t\) test," "dependent means \(t\) test," or "matched pairs \(t\) test." 

1. Check any necessary assumptions and write null and alternative hypotheses.

Data must be paired. The difference between the two groups must be normally distributed in the population or the sample size must be at least 30.

The possible combinations of null and alternative hypotheses are:

Research Question Is the mean difference different from 0? Is the mean difference greater than 0? Is the mean difference less than 0?
Null Hypothesis, \(H_{0}\) \(\mu_d = 0 \) \(\mu_d = 0 \) \(\mu_d = 0 \)
Alternative Hypothesis, \(H_{a}\) \(\mu_d \neq 0 \) \(\mu_d > 0 \) \(\mu_d < 0 \)
Type of Hypothesis Test Two-tailed, non-directional Right-tailed, directional Left-tailed, directional

Where \( \mu_d \) is the hypothesized difference in the population.

2. Calculate an appropriate test statistic.

The calculation of the test statistic for dependent samples is similar to the calculation you performed earlier in this lesson for a single sample mean. In this formula, \(\overline{x}_d\) is used in place of \(\overline{x}\) and \(s_d\) is used in place of \(s\):

Test Statistic for Dependent Means

\(t=\frac{\bar{x}_d-\mu_0}{\dfrac{s_d}{\sqrt{n}}}\)

\(\overline{x}_d\) = observed sample mean difference
\(\mu_0\) = mean difference specified in the null hypothesis
\(s_d\) = standard deviation of the differences
\(n\) = sample size (i.e., number of unique individuals)

Observed Sample Mean Difference
\(\overline{x}_d=\dfrac{\Sigma{x}_d}{n}\)
\(x_d\) = observed difference
Standard Deviation of the Differences
\(s_d=\sqrt{\dfrac{\sum (x_d-\overline{x}_d)^{2}}{n-1}}\)
3. Determine the p value associated with the test statistic.

When testing hypotheses about a mean difference, a \(t\) distribution is used to find the \(p\) value. The degrees of freedom are equal to \(n-1\) where \(n\) is the number of pairs. 

4. Decide between the null and alternative hypotheses.

 If \(p \leq \alpha\) reject the null hypothesis. If \(p>\alpha\) fail to reject the null hypothesis.

5. State a "real world" conclusion.

Based on your decision in Step 4, write a conclusion in terms of the original research question.


8.3.2.1 - Example: Quiz Scores

8.3.2.1 - Example: Quiz Scores

Below is an example of conducting a paired means \(t\) test by hand using raw data. Next, you will learn how this can be conducted most efficiently in Minitab.

Research question: Are scores on two quizzes different?

Data were collected from 9 students and a paired means \(t\) test was performed using hand calculations:

Student ID Quiz 1 Quiz 2
001 98 94
002 100 98
003 95 98
004 90 88
005 90 89
006 92 91
007 80 84
008 78 80
009 88 88
Step 1: Check assumptions and write hypotheses

There are two assumptions: (1) data are paired and (2) distribution of differences is normally distribution in the population or the sample size is at least 30. The data are paired because for each student we have a quiz 1 and a quiz 2 score. We do not know if the differences are normally distributed in the population and the sample size is small, but in the video above we created a histogram of the differences and found that the sample was approximately normally distributed, so this assumption has been met and we can perform a paired means \(t\) test.

Given \(\mu_d = \mu_1 - \mu_2\), our hypotheses are:
\(H_0: \mu_d = 0\)
\(H_a: \mu_d \ne 0\)

Step 2: Calculate test statistic
Test Statistic for Dependent Means

\(t=\frac{\bar{x}_d-\mu_0}{\frac{s_d}{\sqrt{n}}}\)

\(\overline{x}_d\) = observed sample mean difference
\(\mu_0\) = mean difference specified in the null hypothesis
\(s_d\) = standard deviation of the differences
\(n\) = sample size (i.e., number of unique individuals)

Student ID Quiz 1 Quiz 2 Difference (\(X_d\)) \(X_d - \overline{X}_d\) \((X_d - \overline{X}_d)^2\)
001 98 94 4 3.889 15.123
002 100 98 2 1.889 3.568
003 95 98 -3 -3.111 9.679
004 90 88 2 1.889 3.568
005 90 89 1 0.889 0.790
006 92 91 1 0.889 0.790
007 80 84 -4 -4.111 16.901
008 78 80 -2 -2.111 4.457
009 88 88 0 -0.111 0.012

Mean of the differences: \(\overline{X}_d=\frac{\Sigma{X}_d}{n}=\frac{1}{9}\)

For a review of computing standard deviation, see Lesson 2.

Sum of squares: \(\Sigma (X_d - \overline{X}_d)^2 = 54.889\)

Standard deviation of the differences: \(s_d=\sqrt{\frac{\sum (X_d-\overline{X}_d)^{2}}{n-1}} = \sqrt{\frac{54.889}{9-1}}=2.619\)

Test statistic: \(t=\frac{\overline{X}_d- \mu_0}{\frac{s_d}{\sqrt{n}}}=\frac{\frac{1}{9}}{\frac{2.619}{\sqrt{9}}}=0.127\)

\(df=n-1=9-1=8\)

Step 3: Determine p-value

We can construct a \(t\) distribution with 8 degrees of freedom and determine what proportion of the curve falls beyond a \(t\) score of 0.127. This is a two-tailed test, so we need to take into account both the left and right sides of the curve. 

Distribution Plot of Density vs X - T, df=8

\(p=0.4510+0.4510=0.9020\)

Step 4: Make a decision

We will compare our \(p\)-value from step 3 to a standard alpha level of 0.05.

Because \(p>\alpha\), we fail to reject the null hypothesis.

Step 5: State conclusion

There is not sufficient evidence to state that scores on the two quizzes are different.

Note! The following video uses Minitab Express not Minitab to find the p-value

8.3.3 - Minitab: Paired Means Test

8.3.3 - Minitab: Paired Means Test

The steps for constructing a confidence interval or conducting a paired means \(t\) in Minitab are identical. The output that the procedure provides includes both the confidence interval and the \(p\)-value for determining statistical significance.

Minitab®  – Conducting a Paired Means Test

Let's compare students' SAT-Math scores to their SAT-Verbal scores.  

  1. Open the Minitab file: class_survey.mpx
  2. Select Stat > Basic Statistics > Paired t
  3. Select Each sample is in a column since we have the data in the worksheet
  4. Double click the variable SATM in the box on the left to insert the variable into the Sample 1 box
  5. Double click the variable SATV in the box on the left to insert the variable into the Sample 2 box
  6. Click OK

This should result in the following output:

Paired t: SATM, SATV

Descriptive Statistics
Sample N Mean StDev SE Mean
SATM 215 599.81 84.70 5.78
SATV 215 580.33 82.44 5.62
Estimation for Paired Difference
Mean StDev SE Mean 95% CI for \(\mu_d\)
19.49 89.81 6.12 (7.42, 31.56)

\(\mu\)_difference: population mean of (SATM - SATV)

Test
Null hypothesis H0: \(\mu\)_difference = 0
Alternative hypothesis H1: \(\mu\)_difference ≠ 0
T-Value P-Value
3.18 0.002

On the next page, the five-step hypothesis testing procedure is used to interpret this output. 


8.3.3.1 - Example: SAT Scores

8.3.3.1 - Example: SAT Scores

Example: SAT Scores

This example uses the dataset from Lesson 8.3.3 to walk through the five-step hypothesis testing procedure using the Minitab output.

Research question: Do students score differently on the SAT-Math and SAT-Verbal tests?

1. Check assumptions and write hypotheses

Because the sample size is large (\(n \ge 30\)), the t distribution may be used to approximate the sampling distribution.

\(H_{0}:\mu_d=0\)
\(H_{a}:\mu_d \ne 0\)

2. Calculate the test statistic
Test
Null hypothesis H0: \(\mu_d\) = 0
Alternative hypothesis H1: \(\mu_d\) ≠ 0
T-Value P-Value
3.18 0.002

The t test statistic is 3.18.

3. Determine the p value associated with the test statistic

From the output, the p value is 0.002

4. Make a decision

\(p\leq .05\), therefore our decision is to reject the null hypothesis

5. State a "real world" conclusion

There is evidence that in the population, on average, students' SAT-Math and their SAT-Verbal scores are different. 


8.3.3.2 - Example: Marriage Age (Summarized Data)

8.3.3.2 - Example: Marriage Age (Summarized Data)

In a sample of 105 married heterosexual couples, the average age difference (husband's age - wife's age) was 2.829 years with a standard deviation of 4.995 years. These summary statistics were taken from a data set from the Lock5 textbook. Is there evidence that, on average, in the population, husbands tend to be older than their wives?

First we need to check our assumptions. In this case the sample size is greater than 30 so we can use the t-distribution.

We know n = 105, \(\mu_{\text{husband's age}}-\mu_{\text{wife's age}}=2.829\), and \(s=4.995\). Since we want to know if the husbands are older than their wives then our difference in ages would be positive. So our alternative hypothesis is \(H_a\colon \gt 0\).

To complete this using Minitab...

  1. Select Stat > Basic Statistics > Paired t
  2. Select Summarized data (differences)
  3. For sample size enter 105, enter 2.829 for sample mean and 4.995 for standard deviation.
  4. Select Options
  5. The Hypothesized difference should be 0. (or 0.0)
  6. Select Difference > hypothesized difference for the Alternative hypothesis
  7. Click OK and OK
You should get the following output:
Estimation for Paired Difference
Mean StDev SE Mean 95% CI for \(\mu_d\)
105 2.829 4.995 (1.862, 3.796

\(\mu\)_difference: population mean of (Sample 1 - Sample 2)

Test
Null hypothesis H0: \(\mu\)_difference = 0
Alternative hypothesis H1: \(\mu\)_difference ≠ 0
T-Value P-Value
5.80 0.000

 

Interpret the results

1. Check assumptions and write hypotheses

Because the sample size is large (\(n \ge 30\)), the t distribution may be used to approximate the sampling distribution.

\(H_{0}:\mu_d=0\)
\(H_{a}:\mu_d \gt 0\)

2. Calculate the test statistic
Test
Null hypothesis H0: \(\mu_d\) = 0
Alternative hypothesis H1: \(\mu_d\) > 0
T-Value P-Value
5.80 0.000

The t test statistic is 5.80.

3. Determine the p value associated with the test statistic

From the output, the p value is 0.000

4. Make a decision

\(p\leq .05\), therefore our decision is to reject the null hypothesis

5. State a "real world" conclusion

There is evidence that in the population, on average, the husband's age in heterosexual couples is greater than the wife's age.


Legend
[1]Link
Has Tooltip/Popover
 Toggleable Visibility