8.3  Paired Means
8.3  Paired MeansIn Lesson 1 we learned about independent samples and paired samples. When we have two independent samples, the observations in the two groups are unrelated to one another and are not matched in any meaningful way. We'll learn how to compare the means of two independent groups in Lesson 9.
With paired samples, the observations in the two groups are matched in a meaningful way. These are also known as dependent samples. Most often this occurs when data are collected twice from the same participants, called repeated measures. For example, think of studying the effectiveness of a diet plan. You would weigh each participant prior to starting the diet and again following some time on the diet. Depending on how much weight they lost you would determine if the diet was effective. Paired data does not always need to involve two measurements on the same subject; it can also involve taking one measurement on each of two related subjects. For example, we may study husbandwife pairs, motherson pairs, or pairs of twins.
In constructing a dependent samples confidence interval or conducting a dependent samples hypothesis test, the difference score is computed for each individual or pair. From there, the procedures are the same that you used for constructing confidence intervals and hypothesis tests for single sample means. As with one sample mean, if the sample size is at least 30, the sampling distribution for the difference in paired means can be approximated using a \(t\) distribution.
In terms of symbols, the population parameter of interest is the mean difference in the population "\(\mu_d\)." This is estimated using the mean difference in the sample "\(\overline x_d\)."
8.3.1  Confidence Intervals
8.3.1  Confidence IntervalsRecall the general form of a confidence interval...
\(sample\ statistic\pm\underbrace{(multiplier)\ (standard\ error)}_{\textbf{margin of error}}\).
The formula for constructing a confidence interval for the difference in paired means is almost identical to the formula for constructing a confidence interval for one mean. Note that the only change is the subscript d which stands for difference.
 Confidence Interval for the Difference Between Two Paired Means
 \(\underbrace{\overline{x}_d}_{\text{sample statistic}} \pm \overbrace{t^*}^{\text{multiplier}} \underbrace{\left(\dfrac{s_d}{\sqrt{n}}\right)}_{\text{standard error}}\)
 \(t^*\) is the multiplier with \(df = n1\)
8.3.1.1.  Example: Change in Knowledge
8.3.1.1.  Example: Change in KnowledgeAn educational research study is designed so that participants complete a measure of demonstrated knowledge twice. The researcher wants to estimate the change in scores from the first to second administrations (i.e., pre and posttest). Data are paired by participant. The researcher subtracted pretest scores from the post test scores and found a mean increase of 6.560 with a standard deviation of 3.867 for \(n=100\). She wants to construct a 95% confidence interval for the mean difference.
First, we'll find the appropriate multiplier.
\(df=n1=1001=99\)
For a 95% confidence interval: \(t_{df=99}=1.984\)
\(6.560 \pm 1.984 \left(\frac{3.867}{\sqrt{100}}\right)=6.560 \pm 0.767=[5.793, 7.327]\)
We are 95% confident that the difference between post and pre test scores is between 5.793 and 7.327.
Data from Zimmerman, W. A. (2015). Impact of Instructional Materials Eliciting Low and High Cognitive Load on SelfEfficacy and Demonstrated Knowledge (Unpublished doctoral dissertation). The Pennsylvania State University, University Park, PA.
8.3.1.2  Video Example: Difference in Exam Scores
8.3.1.2  Video Example: Difference in Exam Scores8.3.2  Hypothesis Testing
8.3.2  Hypothesis TestingBelow are the procedures for conducting a hypothesis test for two paired means. This is often referred to as a "paired means \(t\) test," "dependent means \(t\) test," or "matched pairs \(t\) test."
Data must be paired. The difference between the two groups must be normally distributed in the population or the sample size must be at least 30.
The possible combinations of null and alternative hypotheses are:
Research Question  Is the mean difference different from 0?  Is the mean difference greater than 0?  Is the mean difference less than 0? 

Null Hypothesis, \(H_{0}\)  \(\mu_d = 0 \)  \(\mu_d = 0 \)  \(\mu_d = 0 \) 
Alternative Hypothesis, \(H_{a}\)  \(\mu_d \neq 0 \)  \(\mu_d > 0 \)  \(\mu_d < 0 \) 
Type of Hypothesis Test  Twotailed, nondirectional  Righttailed, directional  Lefttailed, directional 
Where \( \mu_d \) is the hypothesized difference in the population.
The calculation of the test statistic for dependent samples is similar to the calculation you performed earlier in this lesson for a single sample mean. In this formula, \(\overline{x}_d\) is used in place of \(\overline{x}\) and \(s_d\) is used in place of \(s\):
 Test Statistic for Dependent Means

\(t=\frac{\bar{x}_d\mu_0}{\frac{s_d}{\sqrt{n}}}\)

\(\overline{x}_d\) = observed sample mean difference
\(\mu_0\) = mean difference specified in the null hypothesis
\(s_d\) = standard deviation of the differences
\(n\) = sample size (i.e., number of unique individuals)
 Observed Sample Mean Difference
 \(\overline{x}_d=\frac{\Sigma{x}_d}{n}\)
 \(x_d\) = observed difference
 Standard Deviation of the Differences
 \(s_d=\sqrt{\frac{\sum (x_d\overline{x}_d)^{2}}{n1}}\)
When testing hypotheses about a mean difference, a \(t\) distribution is used to find the \(p\) value. The degrees of freedom are equal to \(n1\) where \(n\) is the number of pairs.
If \(p \leq \alpha\) reject the null hypothesis. If \(p>\alpha\) fail to reject the null hypothesis.
Based on your decision in Step 4, write a conclusion in terms of the original research question.
8.3.2.1  Example: Quiz Scores
8.3.2.1  Example: Quiz ScoresBelow is an example of conducting a paired means \(t\) test by hand using raw data. Next, you will learn how this can be conducted most efficiently in Minitab Express.
Research question: Are scores on two quizzes different?
Data were collected from 9 students and a paired means \(t\) test was performed using hand calculations:
Student ID  Quiz 1  Quiz 2 

001  98  94 
002  100  98 
003  95  98 
004  90  88 
005  90  89 
006  92  91 
007  80  84 
008  78  80 
009  88  88 
There are two assumptions: (1) data are paired and (2) distribution of differences is normally distribution in the population or the sample size is at least 30. The data are paired because for each student we have a quiz 1 and a quiz 2 score. We do not know if the differences are normally distributed in the population and the sample size is small, but in the video above we created a histogram of the differences and found that the sample was approximately normally distributed, so this assumption has been met and we can perform a paired means \(t\) test.
Given \(\mu_d = \mu_1  \mu_2\), our hypotheses are:
\(H_0: \mu_d = 0\)
\(H_a: \mu_d \ne 0\)
 Test Statistic for Dependent Means

\(t=\frac{\bar{x}_d\mu_0}{\frac{s_d}{\sqrt{n}}}\)
\(\overline{x}_d\) = observed sample mean difference
\(\mu_0\) = mean difference specified in the null hypothesis
\(s_d\) = standard deviation of the differences
\(n\) = sample size (i.e., number of unique individuals)
Student ID  Quiz 1  Quiz 2  Difference (\(X_d\))  \(X_d  \overline{X}_d\)  \((X_d  \overline{X}_d)^2\) 

001  98  94  4  3.889  15.123 
002  100  98  2  1.889  3.568 
003  95  98  3  3.111  9.679 
004  90  88  2  1.889  3.568 
005  90  89  1  0.889  0.790 
006  92  91  1  0.889  0.790 
007  80  84  4  4.111  16.901 
008  78  80  2  2.111  4.457 
009  88  88  0  0.111  0.012 
Mean of the differences: \(\overline{X}_d=\frac{\Sigma{X}_d}{n}=\frac{1}{9}\)
For a review of computing standard deviation, see Lesson 2.
Sum of squares: \(\Sigma (X_d  \overline{X}_d)^2 = 54.889\)
Standard deviation of the differences: \(s_d=\sqrt{\frac{\sum (X_d\overline{X}_d)^{2}}{n1}} = \sqrt{\frac{54.889}{91}}=2.619\)
Test statistic: \(t=\frac{\overline{X}_d \mu_0}{\frac{s_d}{\sqrt{n}}}=\frac{\frac{1}{9}}{\frac{2.619}{\sqrt{9}}}=0.127\)
\(df=n1=91=8\)
We can construct a \(t\) distribution with 8 degrees of freedom and determine what proportion of the curve falls beyond a \(t\) score of 0.127. This is a twotailed test, so we need to take into account both the left and right sides of the curve.
\(p=0.4510+0.4510=0.9020\)
We will compare our \(p\)value from step 3 to a standard alpha level of 0.05.
Because \(p>\alpha\), we fail to reject the null hypothesis.
There is not sufficient evidence to state that scores on the two quizzes are different.
8.3.3  Minitab Express: Paired Means Test
8.3.3  Minitab Express: Paired Means TestThe steps for constructing a confidence interval or conducting a paired means \(t\) in Minitab Express are identical. The output that procedure provides includes both the confidence interval and the \(p\)value for determining statistical significance.
MinitabExpress – Conducting a Paired Means Test
Let's compare students' SATMath scores to their SATVerbal scores.
 Open the Minitab Express file:
 On a PC: In the menu bar select STATISTICS > Two Samples > Paired t
 On a Mac: In the menu bar select Statistics > 2Sample Inference > Paired t
 Double click the variable SATM in the box on the left to insert the variable into the Sample 1 box
 Double click the variable SATV in the box on the left to insert the variable into the Sample 2 box
 Click OK
This should result in the following output:
Sample  N  Mean  StDev  SE Mean 

SATM  215  599.814  84.700  5.776 
SATV  215  580.326  82.436  5.622 
Mean  StDev  SE Mean  95% CI for \(\mu_d\) 

19.488  89.808  6.125  (7.416, 31.561) 
\(\mu_d\) : mean of (SATM  SATV)
Null hypothesis  H_{0}: \(\mu_d\) = 0 

Alternative hypothesis  H_{1}: \(\mu_d\) ≠ 0 
TValue  PValue 

3.18  0.0017 
Select your operating system below to see a stepbystep guide for this example.
On the next page, the fivestep hypothesis testing procedure is used to interpret this output.
8.3.3.1  Example: SAT Scores
8.3.3.1  Example: SAT ScoresExample: SAT Scores
This example uses the dataset from Lesson 8.3.3 to walk through the fivestep hypothesis testing procedure using the Minitab Express output.
Research question: Do students score differently on the SATMath and SATVerbal tests?
Because the sample size is large (\(n \ge 30\)), the t distribution may be used to approximate the sampling distribution.
\(H_{0}:\mu_d=0\)
\(H_{a}:\mu_d \ne 0\)
Null hypothesis  H_{0}: \(\mu_d\) = 0 

Alternative hypothesis  H_{1}: \(\mu_d\) ≠ 0 
TValue  PValue 

3.18  0.0017 
The t test statistic is 3.18.
From the output, the p value is 0.0017
\(p\leq .05\), therefore our decision is to reject the null hypothesis
There is evidence that in the population, on average, students' SATMath and their SATVerbal scores are different.
8.3.3.2  Video Example: Marriage Age (Summarized Data)
8.3.3.2  Video Example: Marriage Age (Summarized Data)In a sample of 105 married heterosexual couples, the average age difference (husband's age  wife's age) was 2.829 years with a standard deviation of 4.995 years. These summary statistics were taken from a data set from the Lock^{5} textbook. Is there evidence that, on average, in the population, husbands tend to be older than their wives?