10.1 - Introduction to Analysis of Variance

Let's use the following example to look at the logic behind what an analysis of variance is after.

Application: Tar Content Comparisons Section

We want to see whether the tar contents (in milligrams) for three different brands of cigarettes are different. Two different labs took samples, Lab Precise and Lab Sloppy.

Lab Precise

Lab Precise took six samples from each of the three brands and got the following measurements:

Sample Brand A Brand B Brand C
1 10.21 11.32 11.60
2 10.25 11.20 11.90
3 10.24 11.40 11.80
4 9.80 10.50 12.30
5 9.77 10.68 12.20
6 9.73 10.90 12.20
Average \(\bar{y}_1= 10.00\) \(\bar{y}_2= 11.00\) \(\bar{y}_3= 12.00\)
Lab Precise Dotplot

Dotplot of the 3 brands for lab precise

Lab Sloppy

Lab Sloppy also took six samples from each of the three brands and got the following measurements:

Sample Brand A Brand B Brand C
1 9.03 9.56 10.45
2 10.26 13.40 9.64
3 11.60 10.68 9.59
4 11.40 11.32 13.40
5 8.01 10.68 14.50
6 9.70 10.36 14.42
Average \(\bar{y}_1= 10.00\) \(\bar{y}_2= 11.00\) \(\bar{y}_3= 12.00\)
Lab Sloppy Dotplot

Dotplot of the six samples taken from each brand for the Sloppy Lab

The sample means from the two labs turned out to be the same and thus the differences in the sample means from the two labs are zero.

From which data set can you draw more conclusive evidence that the means from the three populations are different?

We need to compare the between-sample-variation to the within-sample-variation. Since the between-sample-variation from Lab Sloppy is large compared to the within-sample-variation for data from Lab Precise, we will be more inclined to conclude that the three population means are different using the data from Lab Precise. Since such analysis is based on the analysis of variances for the data set, we call this statistical method the Analysis of Variance (or ANOVA).