10.2 - Hypothesis Testing

A one-way between groups ANOVA is used to compare the means of more than two independent groups. A one-way between groups ANOVA comparing just two groups will give you the same results as the independent \(t\) test that you conducted in Lesson 8. We will use the five step hypothesis testing procedure again in this lesson.

1. Check assumptions and write hypotheses

The assumptions for a one-way between groups ANOVA are:

  1. Samples are independent
  2. The response variable is approximately normally distributed for each group or all group sample sizes are at least 30
  3. The population variances of the response variable are equal across the groups (as a rule of thumb, if the largest sample standard deviation divided by the smallest sample standard deviation is not greater than two, then assume that the population variances are equal)
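
The rule of thumb in the third assumption can be checked with a few lines of Python. This is a minimal sketch rather than a required procedure: the function name `equal_variance_rule_of_thumb` and the data are hypothetical, made up only for illustration.

```python
from statistics import stdev

def equal_variance_rule_of_thumb(groups):
    """Return (ratio, ok), where ratio is the largest sample standard
    deviation divided by the smallest, and ok is True when the ratio
    is not greater than two."""
    sds = [stdev(g) for g in groups]  # sample SD (n - 1 denominator) per group
    ratio = max(sds) / min(sds)
    return ratio, ratio <= 2

# Hypothetical data for three independent groups (not from the lesson).
groups = [
    [4, 5, 6, 5, 5],
    [6, 8, 7, 9, 8],
    [3, 2, 4, 3, 3],
]
ratio, ok = equal_variance_rule_of_thumb(groups)
print(f"largest SD / smallest SD = {ratio:.3f}; assume equal variances: {ok}")
```

For these made-up data the ratio is below two, so the equal variances assumption would be considered met.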

Given that you are comparing \(k\) independent groups, the null and alternative hypotheses are:

\(H_{0}: \mu_1 = \mu_2 = \cdots = \mu_k\)
\(H_{a}:\) Not all \(\mu_\cdot\) are equal

In other words, the null hypothesis is that all of the groups' population means are equal. The alternative is that they are not all equal; there are at least two population means that are not equal to one another.

2. Calculate the test statistic

ANOVA uses an F test statistic. Hand calculations for ANOVAs require many steps. In this class, you will be working primarily with Minitab outputs.

Conceptually, the F statistic is a ratio: \(F=\frac{Between\;groups\;variability}{Within\;groups\;variability}\). Numerically this translates to \(F=\frac{MS_{Between}}{MS_{Within}}\). In other words, the F statistic compares how much individuals in different groups vary from one another to how much individuals within the same group vary from one another.

Statistical software will compute the F ratio for you and produce what is known as an ANOVA source table. The ANOVA source table will give you information about the variability between groups and within groups. The table below gives you all of the formulas, but you will not be responsible for performing these calculations by hand. Minitab will do all of these calculations for you and provide you with the full ANOVA source table.

| Source | df | SS | MS | F | p |
|---|---|---|---|---|---|
| Between Groups (Factor) | \(k-1\) | \(\sum_{k}n_k(\overline{x}_k-\overline{x}_\cdot)^2\) | \(\dfrac{SS_{Between}}{df_{Between}}\) | \(\dfrac{MS_{Between}}{MS_{Within}}\) | Area to the right of \(F_{k-1,\,n-k}\) |
| Within Groups (Error) | \(n-k\) | \(\sum_k \sum_i(x_{ik}-\overline{x}_k)^2\) | \(\dfrac{SS_{Within}}{df_{Within}}\) | | |
| Total | \(n-1\) | \(\sum_k \sum_i(x_{ik}-\overline{x}_\cdot)^2\) | | | |

Legend
\(k\) Number of groups
\(n\) Total sample size (all groups combined)
\(n_k\) Sample size of group \(k\)
\(\overline{x}_k\) Sample mean of group \(k\)
\(\overline{x}_\cdot\) Grand mean (i.e., mean for all groups combined)
SS Sum of squares
MS Mean square
df Degrees of freedom
F F-ratio (the test statistic)
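
To see how the entries of the source table fit together, the formulas can be sketched in plain Python. This is an illustration under stated assumptions, not a replacement for Minitab output: the function name `anova_source_table`, the dictionary keys, and the data are all hypothetical.

```python
def anova_source_table(groups):
    """Compute df, SS, MS, and F for a one-way between groups ANOVA.

    `groups` is a list of lists, one inner list of scores per group.
    """
    k = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total sample size
    grand_mean = sum(sum(g) for g in groups) / n      # mean of all scores
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: each group mean versus the grand mean, weighted by n_k.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within groups: each score versus its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    # Total: each score versus the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)

    df_between, df_within, df_total = k - 1, n - k, n - 1
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between, "df_within": df_within, "df_total": df_total,
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical data for three groups of three scores each.
table = anova_source_table([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
for name, value in table.items():
    print(f"{name}: {value}")
```

With these made-up scores, \(SS_{Between}=42\), \(SS_{Within}=6\), and \(SS_{Total}=48\), so the between and within sums of squares add up to the total, and \(F = 21/1 = 21\) with 2 and 6 degrees of freedom.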

Some of the terms in the table above should look familiar, while others will be new to you. The sum of squares that appears in the ANOVA source table is similar to the sum of squares that you computed in Lesson 2 when computing variance and standard deviation. Recall, the sum of squares is the sum of the squared differences between each score and the mean. Here, there are three different sums of squares, each measuring a different type of variability.

The ANOVA source table also has three different degrees of freedom: \(df_{between}\), \(df_{within}\), and \(df_{total}\). If you were to look up an F value using statistical software you would need to know two of these degrees of freedom: \(df_1 = df_{between}\) and \(df_2=df_{within}\).

3. Determine the p-value

When performing an ANOVA using statistical software, you will be given the p-value in the ANOVA source table. If performing an ANOVA by hand, you would use the F distribution. Similar to the t distribution, the F distribution varies depending on degrees of freedom.
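
The p-value is the area to the right of the observed F statistic under the F distribution with \(df_1 = df_{between}\) and \(df_2 = df_{within}\). The sketch below approximates that area using only the Python standard library. It is one possible numerical approach, not how Minitab computes the value: the function name `f_right_tail` and the `steps` parameter are hypothetical, and the code assumes \(F > 0\) and \(df_2 \geq 2\). It relies on the identity that the right-tail area equals the regularized incomplete beta function \(I_x(df_2/2,\, df_1/2)\) with \(x = df_2/(df_2 + df_1 F)\), which is evaluated here with Simpson's rule.

```python
import math

def f_right_tail(f_stat, df1, df2, steps=2000):
    """Approximate the area to the right of f_stat under F(df1, df2).

    Assumes f_stat > 0 and df2 >= 2 so the integrand stays finite.
    `steps` is the (even) number of Simpson's rule subintervals.
    """
    if f_stat <= 0 or df2 < 2:
        raise ValueError("this sketch requires f_stat > 0 and df2 >= 2")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals

    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_stat)  # upper limit of integration, 0 < x < 1
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def beta_density(t):
        # Beta(a, b) density; handle t = 0 separately to avoid log(0).
        if t == 0.0:
            return math.exp(-log_beta) if a == 1 else 0.0
        return math.exp((a - 1) * math.log(t)
                        + (b - 1) * math.log(1 - t) - log_beta)

    h = x / steps
    total = beta_density(0.0) + beta_density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * beta_density(i * h)
    return total * h / 3

# Hypothetical values: F = 21 with df1 = 2 and df2 = 6.
p_value = f_right_tail(21.0, 2, 6)
print(f"p = {p_value:.6f}")
```

Because the result is a numerical approximation, treat it as a check on intuition; the p-value reported in the software's ANOVA source table is the one to use.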

4. Make a decision

If \(p \leq \alpha\), reject the null hypothesis. If \(p>\alpha\), fail to reject the null hypothesis.

5. State a "real world" conclusion

Based on your decision in Step 4, write a conclusion in terms of the original research question.