9.2.2 - Hypothesis Testing

The formula for the test statistic follows the same general format as the others that we have seen this week:

Test Statistic: \(test\; statistic = \dfrac{sample \; statistic - null\;parameter}{standard \;error}\)

Minitab will compute the test statistic for you! You will just need to determine if equal variances should be assumed or not. There is one example below walking through these procedures by hand, but you are strongly encouraged to use Minitab whenever possible.

1. Check any necessary assumptions and write null and alternative hypotheses.

There are two assumptions: (1) the two samples are independent and (2) both populations are normally distributed or \(n_1 \geq 30\) and \(n_2 \geq 30\). If the second assumption is not met then you can conduct a randomization test.
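The randomization test mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomization_test`, the number of reshuffles, the seed, and the two small samples are all hypothetical and are not part of the course materials. The idea is to shuffle the group labels many times and count how often the reshuffled difference in means is at least as extreme as the observed one.

```python
import random

def randomization_test(group1, group2, reps=5000, seed=0):
    """Two-tailed randomization test for a difference in means.

    Returns the observed difference and the proportion of reshuffles
    whose difference is at least as extreme as the observed one.
    """
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    n1 = len(group1)
    observed = sum(group1) / n1 - sum(group2) / len(group2)
    pooled = list(group1) + list(group2)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)              # reassign observations to groups at random
        diff = sum(pooled[:n1]) / n1 - sum(pooled[n1:]) / (len(pooled) - n1)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / reps

# Hypothetical small samples, invented only to exercise the function.
a = [12.1, 9.8, 11.4, 10.9, 13.2, 10.1]
b = [8.7, 9.9, 9.1, 10.4, 8.2, 9.5]
observed, p = randomization_test(a, b)
print(f"observed difference = {observed:.2f}, p = {p:.3f}")
```

For a directional test, the comparison inside the loop would use the signed difference instead of its absolute value.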

Below are the possible null and alternative hypothesis pairs:

Research Question	Are the means of group 1 and group 2 different?	Is the mean of group 1 greater than the mean of group 2?	Is the mean of group 1 less than the mean of group 2?
Null Hypothesis, \(H_{0}\)	\(\mu_1 = \mu_2\)	\(\mu_1 = \mu_2\)	\(\mu_1 = \mu_2\)
Alternative Hypothesis, \(H_{a}\)	\(\mu_1 \neq \mu_2\)	\(\mu_1 > \mu_2\)	\(\mu_1 < \mu_2\)
Type of Hypothesis Test	Two-tailed, non-directional	Right-tailed, directional	Left-tailed, directional

2. Calculate an appropriate test statistic.

Standard Error

\(\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}\)

Test Statistic for Independent Means

\(t=\dfrac{\bar{x}_1-\bar{x}_2}{ \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\)

Estimated Degrees of Freedom

\(df=\min(n_1,\;n_2) - 1\)
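The three formulas in Step 2 can be checked with a short Python sketch. The function name `two_sample_t` is an assumption made for illustration; the summary statistics are those of the weight example later in this section (treatment: n = 45, mean 140, SD 20; control: n = 40, mean 150, SD 25).

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (standard error, t statistic, conservative df) for H0: mu1 = mu2."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # unpooled standard error
    t = (mean1 - mean2) / se                    # null parameter mu1 - mu2 is 0
    df = min(n1, n2) - 1                        # smaller sample size minus 1
    return se, t, df

se, t, df = two_sample_t(140, 20, 45, 150, 25, 40)
print(f"SE = {se:.3f}, t = {t:.2f}, df = {df}")
# prints: SE = 4.951, t = -2.02, df = 39
```

The t statistic agrees with the Minitab output for that example. The degrees of freedom do not (39 here, 74 in Minitab) because this by-hand rule is a conservative estimate.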

3. Determine the p-value associated with the test statistic.

The \(t\) test statistic found in Step 2 is used to determine the p-value.
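Statistical software finds the p-value as a tail area under the \(t\) distribution. The sketch below approximates that area using only the Python standard library, by integrating the \(t\) density numerically with Simpson's rule. The helper names `t_pdf`, `t_cdf`, and `p_value` and the `tail` labels are assumptions for illustration, and this is not a description of how Minitab computes its p-values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus a Simpson's-rule integral of the density from 0 to x."""
    h = x / steps                        # steps is even; h is negative when x < 0
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return min(1.0, max(0.0, 0.5 + total * h / 3))

def p_value(t, df, tail):
    """tail is 'left', 'right', or 'two', matching the alternative hypothesis."""
    if tail == "left":
        return t_cdf(t, df)
    if tail == "right":
        return 1.0 - t_cdf(t, df)
    return 2.0 * (1.0 - t_cdf(abs(t), df))

# t statistics and degrees of freedom taken from the Minitab outputs in this section.
p_sat = p_value(1.58, 95, "two")       # SAT-Math by Ever_Cheat, two-tailed
p_weight = p_value(-2.02, 74, "left")  # weight by treatment, left-tailed
p_height = p_value(-6.73, 202, "left") # height by sex, left-tailed
print(p_sat, p_weight, p_height)
```

The results should land close to the p-values Minitab reports for those three tests. Small differences in the last digit are expected because the inputs here are rounded t statistics.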

4. Decide between the null and alternative hypotheses.

If \(p \leq \alpha\) reject the null hypothesis. If \(p>\alpha\) fail to reject the null hypothesis.

5. State a "real world" conclusion.

Based on your decision in Step 4, write a conclusion in terms of the original research question.

9.2.2.1 - Minitab: Independent Means t Test

Here we will use Minitab to conduct an independent means t-test. Note that Minitab uses a more complicated formula for computing the degrees of freedom for this test.
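When equal variances are not assumed, the degrees of freedom are commonly computed with the Welch-Satterthwaite approximation. The sketch below assumes that this is the formula behind the outputs in this section and that the result is rounded down to a whole number. Both assumptions are consistent with the three DF values shown on this page, but neither is stated in the text. The function name `welch_df` is hypothetical.

```python
import math

def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (unequal variances)."""
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# (sd1, n1, sd2, n2) taken from the descriptive statistics tables in this section.
examples = {
    "SAT-Math by Ever_Cheat": (86.9, 163, 79.2, 53),
    "Weight by treatment": (20, 45, 25, 40),
    "Height by sex": (6.53, 126, 3.63, 99),
}
for name, stats in examples.items():
    df = welch_df(*stats)
    print(f"{name}: df = {df:.2f}, rounded down = {math.floor(df)}")
```

Rounded down, the three results are 95, 74, and 202, which match the DF column in the corresponding outputs. Each is larger than the conservative by-hand value of the smaller sample size minus 1.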

Within Minitab, the procedure for obtaining the test statistic and confidence interval for independent means is identical.

Minitab® – Conducting an Independent Means t Test

Let's compare the mean SAT-Math scores of students who have and have not ever cheated. Both sample sizes are at least 30 so the sampling distribution can be approximated using the \(t\) distribution.

Open the Minitab file: class_survey.mpx
Select Stat > Basic Statistics > 2 Sample t...
Enter the variable SATM into the Samples box
Enter variable Ever_Cheat into the Sample IDs box
Click OK

This should result in the following output:

2-Sample t: SATM by Ever Cheat

Method

\(\mu_1\): mean of SATM when Ever_Cheat = No
\(\mu_2\): mean of SATM when Ever_Cheat = Yes
Difference: \(\mu_1-\mu_2\)

Equal variances are not assumed for this analysis.

Descriptive Statistics: SATM

Ever_Cheat	N	Mean	StDev	SE Mean
No	163	604.0	86.9	6.8
Yes	53	583.7	79.2	11

Estimation of Difference

Difference	95% CI for Difference
20.3	(-5.2, 45.8)

Test

Null hypothesis	\(H_0\): \(\mu_1-\mu_2=0\)
Alternative hypothesis	\(H_1\): \(\mu_1-\mu_2\neq0\)

T-Value	DF	P-Value
1.58	95	0.117

The result of our two independent means t test is \(t(95) = 1.58, p = 0.117\). Our p-value is greater than the standard alpha level of 0.05 so we fail to reject the null hypothesis. There is not enough evidence to state that the mean SAT-Math scores of students who have and have not ever cheated are different.

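Note that we could also interpret the confidence interval in this output. We are 95% confident that the difference in population means, \(\mu_1-\mu_2\), is between -5.16 and 45.78. Because this interval contains 0, it agrees with our decision to fail to reject the null hypothesis.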
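The confidence interval can be reproduced approximately from the summary statistics in the output: difference ± \(t^*\) × standard error. The sketch below is an illustration only. It finds the multiplier \(t^*\) by bisection on a numerically integrated \(t\) distribution, uses the DF of 95 reported by Minitab, and works from the rounded means and standard deviations in the table. The helper names are assumptions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus a Simpson's-rule integral of the density from 0 to x."""
    h = x / steps                        # steps is even; h is negative when x < 0
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return min(1.0, max(0.0, 0.5 + total * h / 3))

def t_quantile(prob, df):
    """Value x with P(T <= x) = prob, found by bisection on t_cdf."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mean1, sd1, n1 = 604.0, 86.9, 163    # Ever_Cheat = No
mean2, sd2, n2 = 583.7, 79.2, 53     # Ever_Cheat = Yes
df = 95                              # as reported in the Minitab output

se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
t_star = t_quantile(0.975, df)       # 2.5% in each tail for a 95% interval
diff = mean1 - mean2
lower, upper = diff - t_star * se, diff + t_star * se
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
# prints: 95% CI: (-5.2, 45.8)
```

To one decimal place this matches the interval in the output. The second decimal differs slightly from the -5.16 quoted above because the inputs here are rounded.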

The example above uses a dataset. The following example shows how you can conduct this type of test using summarized data.

9.2.2.1.1 - Example: Summarized Data

Example: Weight by Treatment

Research question: Do patients who receive our treatment weigh less than participants who do not receive our treatment?

Participants were randomly assigned to the treatment condition or a control group. After our intervention, their weights were measured in pounds. Weight is a quantitative variable, so we are going to be comparing means in this example. If assumptions are met, we’ll be conducting a two independent means t test.

Our treatment group has a sample size of 45, mean of 140 pounds, and standard deviation of 20 pounds. Our control group has a sample size of 40, sample mean of 150 pounds, and standard deviation of 25 pounds.

Follow the 5 step hypothesis testing procedure to analyze this data in Minitab.

1. Check any necessary assumptions and write null and alternative hypotheses.

There are two assumptions: (1) the two samples are independent and (2) both populations are normally distributed or \(n_1 \geq 30\) and \(n_2 \geq 30\). The participants were randomly assigned to one of the two groups. They are in no way matched or paired so they are independent. Both groups have sample size of at least 30.

Our hypotheses are based on the research question "Do patients who receive our treatment weigh less than participants who do not receive our treatment?" This indicates a left-tailed test. (T = treatment group, C = control group)

\(H_0\): \(\mu_T = \mu_C\)

\(H_a\): \(\mu_T < \mu_C\)

2. Calculate an appropriate test statistic.

Use Minitab to perform the t-test.

2-Sample independent t-test using summarized data

Open Minitab
Select Stat > Basic Statistics > 2 Sample t...
Select Summarized data in the dropdown at the top
Enter the summary statistics in the table with the treatment group as Sample 1 and the control group as Sample 2.

	Sample 1	Sample 2
Sample size:	45	40
Sample means:	140	150
Standard deviation:	20	25

Select the Options button
For the Alternative hypothesis choose Difference < hypothesized difference
OK and OK

And we get the following output:

2-Sample t

Method

\(\mu_1\): population mean of Sample 1
\(\mu_2\): population mean of Sample 2
Difference: \(\mu_1-\mu_2\)

Equal variances are not assumed for this analysis.

Descriptive Statistics

Sample	N	Mean	StDev	SE Mean
Sample 1	45	140.0	20.0	3.0
Sample 2	40	150.0	25.0	4.0

Estimation of Difference

Difference	95% Upper Bound for Difference
-10.00	-1.75

Test

Null hypothesis	\(H_0\): \(\mu_1-\mu_2=0\)
Alternative hypothesis	\(H_1\): \(\mu_1-\mu_2\lt0\)

T-Value	DF	P-Value
-2.02	74	0.024

The t-value is -2.02.

3. Determine the p-value associated with the test statistic.

The p-value is 0.024.

4. Decide between the null and alternative hypotheses.

Because \(p = 0.024 \leq \alpha = 0.05\), we reject the null hypothesis.

5. State a "real world" conclusion.

There is convincing evidence that patients who receive our treatment weigh less than participants who do not receive our treatment in the population.

9.2.2.1.3 - Example: Height by Sex

Research Question: In the population of all college students, is the mean height of females less than the mean height of males?

Data concerning height (in inches) were collected from 126 females and 99 males.

This example uses the following Minitab file: class_survey.csv

1. Check assumptions and write hypotheses

We have two independent groups: females and males. Height in inches is a quantitative variable. This means that we will be comparing the means of two independent groups.

There are 126 females and 99 males in our sample. The sampling distribution will be approximately normally distributed because both sample sizes are at least 30.

This is a left-tailed test because we want to know if the mean for females is less than the mean for males.

(Note: Minitab will arrange the levels of the explanatory variable in alphabetical order. This is why "females" are listed before "males" in this example.)

\(H_{0}:\mu_f = \mu_m \)
\(H_{a}: \mu_f < \mu_m \)

2. Calculate the test statistic

Open the file and select Stat > Basic Statistics > 2 Sample t...
Enter variable Height into the Samples box
Enter the variable Biological Sex into the Sample IDs box
Choose Options and select 'Difference < Hypothesized difference' for the alternative hypothesis.
Click OK

This should result in the following output:

Method

\(\mu_1\): mean of Height when Biological Sex = Female
\(\mu_2\): mean of Height when Biological Sex = Male
Difference: \(\mu_1-\mu_2\)

Equal variances are not assumed for this analysis.

Descriptive Statistics: Height

Biological Sex	N	Mean	StDev	SE Mean
Female	126	65.62	6.53	0.58
Male	99	70.24	3.63	0.37

Estimation for Difference

Difference	95% Upper Bound for Difference
-4.623	-3.488

Test

Null hypothesis	\(H_0\): \(\mu_1-\mu_2=0\)
Alternative hypothesis	\(H_1\): \(\mu_1-\mu_2<0\)

T-Value	DF	P-Value
-6.73	202	0.000

The test statistic is t = -6.73

3. Determine the p-value

From the output given in Step 2, the p-value is 0.000

4. Make a decision

\(p\leq.05\), therefore we reject the null hypothesis.

5. State a "real world" conclusion

There is convincing evidence that the mean height of female students is less than the mean height of male students in the population.
