10.2.2 - The ANOVA Table

In this section, we present the analysis of variance (ANOVA) table for a completely randomized design, such as the tar content example.

Data Table

Random samples of size \(n_1, \ldots, n_t\) are drawn from the respective \(t\) populations. The data have the following format:

| Population | Data | Mean |
|---|---|---|
| 1 | \(y_{11},\ y_{12},\ \ldots,\ y_{1n_1}\) | \(\bar{y}_{1.}\) |
| 2 | \(y_{21},\ y_{22},\ \ldots,\ y_{2n_2}\) | \(\bar{y}_{2.}\) |
| \(\vdots\) | \(\vdots\) | \(\vdots\) |
| \(t\) | \(y_{t1},\ y_{t2},\ \ldots,\ y_{tn_t}\) | \(\bar{y}_{t.}\) |

Notation

\(t\): The total number of groups (populations).

\(y_{ij}\): The \(j^{th}\) observation from the \(i^{th}\) population.

\(n_i\): The sample size from the \(i^{th}\) population.

\(n_T\): The total sample size: \(n_T=\sum_{i=1}^t n_i\).

\(\bar{y}_{i.}\): The mean of the sample from the \(i^{th}\) population.

\(\bar{y}_{..}\): The mean of the combined data. Also called the overall mean.

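Recall that we want to compare the between-group variation with the within-group variation. Each is measured by a sum of squares, where \(\sum_{i,j}\) denotes the sum over all groups \(i=1,\ldots,t\) and all observations \(j=1,\ldots,n_i\) within each group: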

Sum of Squares for Treatment or the Between Group Sum of Squares
\(\text{SST}=\sum_{i=1}^t n_i(\bar{y}_{i.}-\bar{y}_{..})^2\)
Sum of Squares for Error or the Within Group Sum of Squares
\(\text{SSE}=\sum_{i, j} (y_{ij}-\bar{y}_{i.})^2\)
Total Sum of Squares
\(\text{TSS}=\sum_{i,j} (y_{ij}-\bar{y}_{..})^2\)
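
The three formulas above translate almost line for line into code. The following is a minimal sketch in plain Python; the data values and the helper name `sums_of_squares` are hypothetical and chosen only for illustration (they are not the tar content data).

```python
# Hypothetical data (not from the text): one list of observations per group,
# with unequal sample sizes n_1 = 3, n_2 = 4, n_3 = 3.
groups = [
    [10, 12, 14],
    [15, 17, 19, 21],
    [20, 22, 24],
]

def sums_of_squares(groups):
    """Return (SST, SSE, TSS) for a list of samples, one list per group."""
    all_obs = [y for g in groups for y in g]
    grand_mean = sum(all_obs) / len(all_obs)          # overall mean, y-bar_..
    group_means = [sum(g) / len(g) for g in groups]   # group means, y-bar_i.

    # Between-group: squared distance of each group mean from the overall
    # mean, weighted by that group's sample size.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group: squared distance of each observation from its own group mean.
    sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    # Total: squared distance of each observation from the overall mean.
    tss = sum((y - grand_mean) ** 2 for y in all_obs)
    return sst, sse, tss

sst, sse, tss = sums_of_squares(groups)
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, TSS = {tss:.2f}")
```

For this made-up data set the group means are 12, 18, and 22 and the overall mean is 17.4, which gives SST of about 152.4, SSE of 36, and TSS of about 188.4.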

It can be shown that \(\text{TSS} = \text{SST} + \text{SSE}\).
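
As a sketch of why this decomposition holds (a supplementary derivation using only the notation defined above), add and subtract the group mean \(\bar{y}_{i.}\) inside each squared deviation and expand:

\[
\begin{aligned}
\text{TSS} &= \sum_{i=1}^t \sum_{j=1}^{n_i} (y_{ij}-\bar{y}_{..})^2
= \sum_{i=1}^t \sum_{j=1}^{n_i} \left[(y_{ij}-\bar{y}_{i.}) + (\bar{y}_{i.}-\bar{y}_{..})\right]^2 \\
&= \sum_{i,j} (y_{ij}-\bar{y}_{i.})^2
+ 2\sum_{i=1}^t (\bar{y}_{i.}-\bar{y}_{..}) \sum_{j=1}^{n_i} (y_{ij}-\bar{y}_{i.})
+ \sum_{i=1}^t n_i (\bar{y}_{i.}-\bar{y}_{..})^2 \\
&= \text{SSE} + 0 + \text{SST}.
\end{aligned}
\]

The middle term vanishes because the deviations from a group's own mean sum to zero within each group: \(\sum_{j=1}^{n_i} (y_{ij}-\bar{y}_{i.}) = 0\).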

We can set up the ANOVA table to help us find the F-statistic.

The ANOVA Table

| Source | Df | SS | MS | F | P-value |
|---|---|---|---|---|---|
| Treatment | \(t-1\) | \(\text{SST}\) | \(\text{MST}=\dfrac{\text{SST}}{t-1}\) | \(\dfrac{\text{MST}}{\text{MSE}}\) | |
| Error | \(n_T-t\) | \(\text{SSE}\) | \(\text{MSE}=\dfrac{\text{SSE}}{n_T-t}\) | | |
| Total | \(n_T-1\) | \(\text{TSS}\) | | | |

The p-value is found using the F-statistic and the F-distribution with \(t-1\) and \(n_T-t\) degrees of freedom. We will not ask you to find the p-value for this test. You will only need to know how to interpret it. If the p-value is less than our predetermined significance level, we reject the null hypothesis that all the means are equal.
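
To make the layout of the ANOVA table concrete, the sketch below fills in every column up to the F-statistic from the formulas in the table. It uses plain Python and the same kind of made-up data as before; the function names `anova_table` and `fmt` are hypothetical helpers, not part of any library.

```python
# Hypothetical data (not from the text): one list of observations per group.
groups = [
    [10, 12, 14],
    [15, 17, 19, 21],
    [20, 22, 24],
]

def anova_table(groups):
    """Return the ANOVA table rows as (source, df, SS, MS, F) tuples."""
    t = len(groups)                                   # number of groups
    n_T = sum(len(g) for g in groups)                 # total sample size
    all_obs = [y for g in groups for y in g]
    grand_mean = sum(all_obs) / n_T
    means = [sum(g) / len(g) for g in groups]

    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    tss = sum((y - grand_mean) ** 2 for y in all_obs)

    mst = sst / (t - 1)        # mean square for treatment
    mse = sse / (n_T - t)      # mean square for error
    return [
        ("Treatment", t - 1, sst, mst, mst / mse),
        ("Error", n_T - t, sse, mse, None),
        ("Total", n_T - 1, tss, None, None),
    ]

def fmt(x):
    """Format a number to 3 decimals; leave empty table cells blank."""
    return "" if x is None else f"{x:.3f}"

print(f"{'Source':<10}{'Df':>4}{'SS':>10}{'MS':>10}{'F':>10}")
for source, df, ss, ms, f_stat in anova_table(groups):
    print(f"{source:<10}{df:>4}{fmt(ss):>10}{fmt(ms):>10}{fmt(f_stat):>10}")
```

For these made-up numbers, the F-statistic is about 14.8 on 2 and 7 degrees of freedom.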

The ANOVA table is easily obtained with statistical software; computing these quantities by hand is very tedious.
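
As one illustration of the software route, the sketch below assumes Python with the SciPy library installed (the text does not prescribe a particular package). It uses `scipy.stats.f_oneway` to get the F-statistic and p-value in one call, then reproduces the p-value as the upper-tail area of the F-distribution. The data and the significance level of 0.05 are hypothetical.

```python
from scipy import stats

# Hypothetical data (not from the text): one list of observations per group.
groups = [
    [10, 12, 14],
    [15, 17, 19, 21],
    [20, 22, 24],
]

# One-way ANOVA: returns the F-statistic and its p-value.
result = stats.f_oneway(*groups)
f_stat, p_value = result.statistic, result.pvalue

# The same p-value from the F-distribution directly: the area to the right
# of the F-statistic with (t - 1, n_T - t) degrees of freedom.
t = len(groups)
n_T = sum(len(g) for g in groups)
p_from_f = stats.f.sf(f_stat, t - 1, n_T - t)

alpha = 0.05  # assumed significance level, chosen for illustration
print(f"F = {f_stat:.3f}, p-value = {p_value:.4f}")
print("Reject H0: all means equal" if p_value < alpha else "Fail to reject H0")
```

With this made-up data the p-value falls well below 0.05, so the null hypothesis of equal means would be rejected at that level.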