1: Overview of ANOVA

Overview

In previous statistics courses, analysis of variance (ANOVA) was applied in very simple settings, mainly with one group or factor as the explanatory variable. In this course, ANOVA models are extended to more complex situations involving several explanatory variables, and experimental design aspects are discussed as well. Although the ANOVA methodology developed in this course is for data obtained from designed experiments, the same methods may be used to analyze data from observational studies. Keep in mind, however, that the conclusions may not be as sound, because observational studies do not satisfy the rigorous conditions imposed on designed experiments.

Note: If you aren't familiar with the difference between observational and experimental studies, you should review the introductory statistical concepts that are essential to success in this course.

"Classic" analysis of variance is a method to compare average (mean) responses to experimental manipulations in controlled environments. For example, if people who want to lose weight are randomly selected to participate in a weight-loss study, each person might be randomly assigned to a dieting group, an exercise group, or a "control" group (for which there is no intervention). The mean weight loss for each group is then compared with that of every other group.

Recall that a fundamental tenet of the scientific method is that results should be reproducible. A designed experiment provides this through replication and generates data that requires the calculation of mean (average) responses.

Objectives

Upon completion of this lesson, you should be able to:

  1. Explain the standard ANOVA basics.
  2. Apply exploratory data analysis (EDA) basics to data appropriate for ANOVA.

1.1 - The Working Hypothesis

Using the scientific method, before any statistical analysis can be conducted, a researcher must generate a guess, or hypothesis, about what is going on. The process begins with a Working Hypothesis: a direct statement of the research idea. For example, a plant biologist may think that plant height is affected by applying different fertilizers, and so might say: "Plants given different fertilizers will grow to different heights."

According to the Popperian Principle of Falsification, we can't conclusively affirm a hypothesis, but we can conclusively negate a hypothesis. So we need to translate the working hypothesis into a framework wherein we state a null hypothesis that the average height (or mean height) for plants with the different fertilizers will all be the same. The alternative hypothesis (which the biologist hopes to show) is that they are not all equal, but rather some of the fertilizer treatments have produced plants with different mean heights. The strength of the data will determine whether the null hypothesis can be rejected with a specified level of confidence.

We can imagine testing four groups of plants: three receiving three different kinds of fertilizer and a fourth left untreated (a control group). Assuming the plant biologist kept all the plants under controlled conditions in the greenhouse, the effect of the fertilizer would be the only thing to differ among the groups of plants. Suppose at the end of the experiment the biologist measured the height of each plant. A simple boxplot can then be used to illustrate the differences in height between the four groups, as seen in the figure below. Plant height, the dependent or response variable, appears on the vertical (y) axis versus fertilizer, the independent or explanatory variable, on the horizontal (x) axis.

[Figure: Distribution of Plant Height by Fertilizer. Side-by-side boxplots of plant height (y-axis, roughly 20.0 to 32.5) for the Control, F1, F2, and F3 fertilizer groups (x-axis).]

This boxplot is a customary way to show treatment (or factor) level differences. In this case, there was only one treatment: fertilizer. The fertilizer treatment had four levels that included the control, which received no fertilizer, and the three different fertilizers. Understanding this language convention is essential as later in the course we will be using ANOVA to handle multi-factor studies (for example if the biologist manipulated the amount of water AND the type of fertilizer) and we will need to be able to refer to different treatments, each with their own set of levels.
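A side-by-side boxplot like the one described above can be sketched in a few lines of Python with matplotlib. The height values below are invented for illustration (six plants per level, with the level labels Control, F1, F2, and F3 matching the figure):

```python
# Sketch: side-by-side boxplots of plant height by fertilizer level.
# All height values are invented for illustration.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

heights = {
    "Control": [21.0, 19.5, 20.5, 21.5, 20.0, 21.0],
    "F1":      [32.0, 30.5, 25.0, 27.5, 28.0, 28.6],
    "F2":      [22.5, 26.0, 28.0, 27.0, 26.5, 25.2],
    "F3":      [28.0, 27.5, 31.0, 29.5, 30.0, 29.2],
}

fig, ax = plt.subplots()
ax.boxplot(list(heights.values()))
ax.set_xticks(range(1, len(heights) + 1), labels=list(heights.keys()))
ax.set_xlabel("Fertilizer")
ax.set_ylabel("Plant Height")
ax.set_title("Distribution of Plant Height by Fertilizer")
fig.savefig("plant_height_boxplot.png")
```

One boxplot per treatment level, sharing a common y-axis, makes level differences in both center and spread visible at a glance.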

Another alternative for viewing the differences in the heights is with a 'means plot' (a scatter or interval plot):

[Figure: LS-Means for Fertilizer with 95% Confidence Limits. Plot of Height least-squares means (y-axis, roughly 20.0 to 30.0) for the Control, F1, F2, and F3 fertilizer levels, with 95% confidence limits shown as error bars.]

This second plotting method for the differences in the treatment means provides essentially the same information. However, this plot illustrates the variability in the data with "error bars" that are the 95% confidence interval limits around the means. Between the statement of a Working Hypothesis and the creation of these 95% confidence intervals is a 7-step process of statistical hypothesis testing, presented in the following section.
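The quantities behind such a means plot are easy to compute directly: for each group, the sample mean and a t-based 95% confidence interval. A minimal sketch with SciPy, using the same invented plant-height data (all numbers hypothetical):

```python
# Sketch: group means with t-based 95% confidence intervals, the
# quantities plotted in a means plot. All data values are invented.
import numpy as np
from scipy import stats

def mean_ci95(y):
    """Return (mean, lower, upper) for a t-based 95% confidence interval."""
    y = np.asarray(y, dtype=float)
    m = y.mean()
    # t critical value with n - 1 degrees of freedom
    half = stats.t.ppf(0.975, len(y) - 1) * y.std(ddof=1) / np.sqrt(len(y))
    return m, m - half, m + half

heights = {
    "Control": [21.0, 19.5, 20.5, 21.5, 20.0, 21.0],
    "F1":      [32.0, 30.5, 25.0, 27.5, 28.0, 28.6],
    "F2":      [22.5, 26.0, 28.0, 27.0, 26.5, 25.2],
    "F3":      [28.0, 27.5, 31.0, 29.5, 30.0, 29.2],
}

for group, y in heights.items():
    m, lo, hi = mean_ci95(y)
    print(f"{group}: mean = {m:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Note that statistical software (as in the LS-means figure) typically bases the limits on the pooled error variance from the fitted ANOVA model; the separate per-group intervals here are a simpler approximation.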


1.2 - The 7 Step Process of Statistical Hypothesis Testing

We will cover the seven steps one by one.

  1. Step 1: State the Null Hypothesis

    The null hypothesis can be thought of as the opposite of the "guess" the researchers made. In the example presented in the previous section, the biologist "guesses" plant height will be different for the various fertilizers. So the null hypothesis would be that there will be no difference among the groups of plants. Specifically, in more statistical language the null for an ANOVA is that the means are the same. We state the null hypothesis as:

    \(H_0 \colon \mu_1 = \mu_2 = \cdots = \mu_T\)

    for T levels of an experimental treatment.

    Note: Why do we do this? Why not simply test the working hypothesis directly? The answer lies in the Popperian Principle of Falsification. Karl Popper (a philosopher) argued that we can't conclusively confirm a hypothesis, but we can conclusively negate one. So we set up a null hypothesis which is effectively the opposite of the working hypothesis. The hope is that, based on the strength of the data, we will be able to negate or reject the null hypothesis and accept an alternative hypothesis. In other words, we usually see the working hypothesis in \(H_A\).
  2. Step 2: State the Alternative Hypothesis

    \(H_A \colon \text{ treatment level means not all equal}\)

    The alternative hypothesis is stated in this way so that if the null is rejected, there are many alternative possibilities.

    For example, \(\mu_1 \ne \mu_2 = \cdots = \mu_T\) is one possibility, as is \(\mu_1 = \mu_2 \ne \mu_3 = \cdots = \mu_T\). Many people make the mistake of stating the alternative hypothesis as \(\mu_1 \ne \mu_2 \ne \cdots \ne \mu_T\), which says that every mean differs from every other mean. This is a possibility, but only one of many. A simpler way of thinking about it is that at least one mean differs from the others. To cover all alternative outcomes, we resort to the verbal statement "not all equal" and then follow up with mean comparisons to find out where the differences among means exist. In our example, a possible outcome would be that fertilizer 1 results in plants that are exceptionally tall, while fertilizers 2, 3, and the control group do not differ from one another.

  3. Step 3: Set \(\alpha\)

    If we look at what can happen in a hypothesis test, we can construct the following contingency table:

    Decision        | In Reality: \(H_0\) is TRUE                             | In Reality: \(H_0\) is FALSE
    ----------------|---------------------------------------------------------|----------------------------------------------------------
    Accept \(H_0\)  | Correct decision                                        | Type II Error (\(\beta\) = probability of Type II Error)
    Reject \(H_0\)  | Type I Error (\(\alpha\) = probability of Type I Error) | Correct decision
    You should be familiar with Type I and Type II errors from your introductory courses. It is important to set \(\alpha\) before the experiment (a priori) because the Type I error is the more grievous error to make. The typical value is \(\alpha = 0.05\), establishing a 95% confidence level. For this course, we will assume \(\alpha = 0.05\) unless stated otherwise.

  4. Step 4: Collect Data

    Remember the importance of recognizing whether data is collected through an experimental design or observational study.

  5. Step 5: Calculate a test statistic

    For categorical treatment level means, we use an F-statistic, named after R.A. Fisher. We will explore the mechanics of computing the F-statistic beginning in Lesson 2. The F-value we get from the data is labeled \(F_{\text{calculated}}\).

  6. Step 6: Construct Acceptance / Rejection regions

    As with all other test statistics, a threshold (critical) value of F is established. This F-value can be obtained from statistical tables or software and is referred to as \(F_{\text{critical}}\) or \(F_\alpha\). As a reminder, this critical value is the minimum value of the test statistic (in this case \(F_{\text{calculated}}\)) for us to reject the null.

    The F-distribution, \(F_\alpha\), and the location of acceptance/rejection regions are shown in the graph below:

    [Figure: The F distribution, with the acceptance region (Accept \(H_0\)) below \(F_\alpha\) and the rejection region (Reject \(H_0\)) in the upper tail of area \(\alpha\).]
  7. Step 7: Based on Steps 5 and 6, draw a conclusion about \(H_0\)

    If \(F_{\text{calculated}}\) is larger than \(F_\alpha\), then you are in the rejection region and you can reject the null hypothesis with \(\left(1-\alpha \right)\) level of confidence.

    Note that modern statistical software condenses Steps 6 and 7 by providing a p-value. The p-value here is the probability of getting an \(F_{\text{calculated}}\) even greater than what you observe assuming the null hypothesis is true. If by chance, the \(F_{\text{calculated}} = F_\alpha\), then the p-value would be exactly equal to \(\alpha\). With larger \(F_{\text{calculated}}\) values, we move further into the rejection region and the p-value becomes less than \(\alpha\). So, the decision rule is as follows:

    If the p-value obtained from the ANOVA is less than \(\alpha\), then reject \(H_0\) in favor of \(H_A\).

    Note: If you are not familiar with this material, we suggest you return to course materials from your basic statistics course.
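Steps 5 through 7 can be sketched in a few lines of Python: `scipy.stats.f_oneway` returns \(F_{\text{calculated}}\) and the p-value, and `scipy.stats.f.ppf` gives \(F_\alpha\). The greenhouse-style data below is invented for illustration:

```python
# Sketch of Steps 5-7 for a one-way ANOVA on invented plant-height data.
from scipy import stats

control = [21.0, 19.5, 20.5, 21.5, 20.0, 21.0]
f1 = [32.0, 30.5, 25.0, 27.5, 28.0, 28.6]
f2 = [22.5, 26.0, 28.0, 27.0, 26.5, 25.2]
f3 = [28.0, 27.5, 31.0, 29.5, 30.0, 29.2]

alpha = 0.05  # Step 3: set the significance level a priori

# Step 5: calculate the test statistic (and its p-value)
f_calculated, p_value = stats.f_oneway(control, f1, f2, f3)

# Step 6: the critical value F_alpha, with T - 1 and N - T degrees of
# freedom for T = 4 treatment levels and N = 24 observations
T, N = 4, 24
f_critical = stats.f.ppf(1 - alpha, T - 1, N - T)

# Step 7: draw a conclusion about H0
print(f"F_calculated = {f_calculated:.2f}, F_critical = {f_critical:.2f}, "
      f"p-value = {p_value:.4g}")
if p_value < alpha:
    print("Reject H0: the treatment means are not all equal")
else:
    print("Fail to reject H0")
```

The decision rule appears twice here on purpose: comparing \(F_{\text{calculated}}\) with \(F_{\text{critical}}\) and comparing the p-value with \(\alpha\) always give the same conclusion.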

1.3 - Lesson 1 Summary

This lesson reinforced the basics of ANOVA, which you may have seen in other courses. Using the greenhouse example, the seven important steps of hypothesis testing in a single-factor ANOVA setting were explored. Step 2 highlighted the correct way to state and interpret the alternative hypothesis \((H_A)\), Step 3 discussed the truth table of possible errors in hypothesis testing, and Step 6 detailed the rejection region for the null hypothesis (\(H_0\)).

The lesson also introduced some basics of ANOVA-related exploratory data analysis (EDA). Graphics such as side-by-side boxplots and means plots are useful tools for producing a visual summary of the raw data and ANOVA results. These will serve as stepping stones to the more elaborate graphical techniques we will learn throughout the course.

The straightforward concepts and methodology learned in this lesson will help us navigate more complex topics addressed in future lessons. The keywords and phrases learned in this lesson were: null and alternative hypotheses (\(H_0\) and \(H_A\)), Type I and Type II errors, significance level (\(\alpha\)), rejection region, F statistic, and its critical and calculated values.

