As we work with ANOVA, it will become increasingly apparent that we start with the variability in the response variable and divide, or "partition", that total variability into different parts. We partition the total variability in the response into variability that is due to our treatment (which of course we hope is significantly large) and variability that is left over (you can think of this as the nuisance, "error", or "residual" variability). To help you picture this, think about the data storage capacity of a computer. If you have 8 GB of storage in total, you can ask your computer to show the types of files that are occupying the storage. In the same way, the ANOVA model is (in a very elementary fashion) going to compare the variability due to the treatment to the variability left over.
From elementary statistics, when we think of computing a variance of a random variable (say X), we use the expression:
\(\text{variance }=\dfrac{\sum(X_i-\bar{X})^2}{n-1}=\dfrac{SS}{df}\)
The numerator of this expression is referred to as the Sum of Squares, or Sum of Squared deviations from the mean, or simply SS. (If you don't recognize this, then we suggest you sharpen your introductory statistics skills!) The denominator is the degrees of freedom, (n - 1), or df.
The ANOVA table is set up to generate quantities analogous to the simple variance calculation above.
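As a quick check of the SS/df idea, here is a minimal Python sketch. The five values in `x` are made up purely for illustration and are not part of the example data; the result is compared against the standard library's `statistics.variance`, which uses the same n - 1 denominator.

```python
import statistics

# Hypothetical sample, invented for illustration only
x = [4.0, 7.0, 6.0, 5.0, 8.0]

n = len(x)
x_bar = sum(x) / n

# Numerator: sum of squared deviations from the mean (SS)
ss = sum((xi - x_bar) ** 2 for xi in x)

# Denominator: degrees of freedom
df = n - 1

variance = ss / df
print(f"SS = {ss}, df = {df}, variance = {variance}")

# The standard library computes the same sample variance
print(statistics.variance(x))
```

Every row of the ANOVA table repeats this pattern: a sum of squares divided by its degrees of freedom.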

We start by considering the TOTAL variability in the response variable. This is done by calculating \(SS_{Total}\):
\(\text{Total SS }=\sum_{i}\sum_{j}(Y_{ij}-\bar{Y}_{..})^2 = 312.47\)
The degrees of freedom for the Total SS is N - 1 = 24 - 1 = 23.

Our next step determines how much of the variability in Y is accounted for by our treatment. We now calculate \(SS_{Treatment}\), or \(SS_{Trt}\):
\(\text{Treatment SS }=\sum_{i}n_i(\bar{Y}_{i.}-\bar{Y}_{..})^2\)
Note that the treatment sum of squares is built from the deviation of each group mean from the grand mean. In some sense we are "aggregating" all of the responses from a group and representing the "group effect" by the group mean. For our example:
\begin{aligned}\text{SSTrt }= 6\times(21.0-26.1667)^2 + 6\times(28.6-26.1667)^2 +\\ 6\times(25.8667-26.1667)^2 + 6\times(29.2-26.1667)^2 = 251.44\end{aligned}
Note that in this case we have equal numbers of observations (6) per treatment level, and it is, therefore, a balanced ANOVA.
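The treatment sum of squares can be reproduced from the four group means and the common group size given in the example. The sketch below is only an illustration of the formula; the variable names are our own, and the grand mean is recomputed from the group means, which is valid here only because the design is balanced.

```python
# Group means and per-group sample size taken from the worked example
group_means = [21.0, 28.6, 25.8667, 29.2]
n_per_group = 6

# With equal group sizes, the grand mean equals the mean of the group means
grand_mean = sum(group_means) / len(group_means)

# Treatment SS: squared deviation of each group mean from the grand mean,
# weighted by the number of observations in that group
ss_trt = sum(n_per_group * (m - grand_mean) ** 2 for m in group_means)

print(f"grand mean = {grand_mean:.4f}")
print(f"SS_Trt = {ss_trt:.2f}")
```

In an unbalanced design the grand mean would have to be computed from all of the observations (or as a weighted mean of the group means) instead.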

Finally, we need to determine how much variability is "left over". This is the Error or Residual sum of squares, which we obtain by subtraction:
\(\text{Error SS }=\sum_{i}\sum_{j}(Y_{ij}-\bar{Y}_{i.})^2 = \text{Total SS} - \text{Treatment SS}\)
\(SS_{Error} = 312.47 - 251.44 = \mathbf{61.033}\)
Note here that the "left over" is really the deviation of any score from its group mean.
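To see why subtraction works, it can help to compute all three sums of squares directly and confirm that they add up. The raw observations for the example are not listed here, so the sketch below uses a small made-up data set (three hypothetical groups labeled A, B, and C); only the partitioning identity carries over to the example.

```python
# Hypothetical responses for three groups (invented for illustration;
# these are NOT the data behind the worked example)
groups = {
    "A": [10.0, 12.0, 11.0],
    "B": [15.0, 14.0, 16.0],
    "C": [20.0, 19.0, 21.0],
}

all_values = [y for ys in groups.values() for y in ys]
grand_mean = sum(all_values) / len(all_values)
group_means = {g: sum(ys) / len(ys) for g, ys in groups.items()}

# Total SS: every observation against the grand mean
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

# Treatment SS: each group mean against the grand mean, weighted by group size
ss_trt = sum(len(ys) * (group_means[g] - grand_mean) ** 2
             for g, ys in groups.items())

# Error SS: every observation against its own group mean
ss_error = sum((y - group_means[g]) ** 2
               for g, ys in groups.items() for y in ys)

print(f"Total SS     = {ss_total:.3f}")
print(f"Treatment SS = {ss_trt:.3f}")
print(f"Error SS     = {ss_error:.3f}")
print(f"Trt + Error  = {ss_trt + ss_error:.3f}")
```

The last two printed lines show the partition: the treatment and error pieces together account for all of the total variability.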
We can now fill in the table:
Source     df            SS       MS  F
Treatment  k - 1 = 3     251.44
Error      23 - 3 = 20   61.033
Total      N - 1 = 23    312.47
We have k treatment levels, so we use k - 1 for the df for the Treatment. In our example there are 4 treatment levels (the control and the 3 fertilizers), so k = 4 and k - 1 = 4 - 1 = 3. Finally, we obtain the Error df by subtraction, as we did with the SS.
The Mean Squares (MS) can now be calculated as:
\(MS_{Trt}=\dfrac{SS_{Trt}}{df_{Trt}}=\dfrac{251.44}{3}=83.813\)
and
\(MS_{Error}=\dfrac{SS_{Error}}{df_{Error}}=\dfrac{61.033}{20}=3.052\)
(We don’t need to calculate \(MS_{\text{Total}}\).)
Source     df  SS      MS      F
Treatment  3   251.44  83.813
Error      20  61.033  3.052
Total      23  312.47
Finally, we can compute the F statistic for our ANOVA. Conceptually, we are forming the ratio of the variability due to our treatment (remember, we want this to be relatively large) to the variability left over, or due to error (which, being error, we want to be small). Following this logic, we want our F to be a large number. If we go back and think about the computer storage space, we can picture most of the storage space taken up by our treatment, and less of it taken up by error. In our example, the F is calculated as:
\(F=\dfrac{MS_{Trt}}{MS_{Error}}=\dfrac{83.813}{3.052}=27.46\)
Source     df  SS      MS      F
Treatment  3   251.44  83.813  27.46
Error      20  61.033  3.052
Total      23  312.47
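The arithmetic behind the completed table can be sketched in a few lines of Python. The SS and df values are the ones from the example; the `fmt` helper and the column widths are our own choices for printing, not part of any standard ANOVA routine.

```python
# Sums of squares and degrees of freedom from the worked example
ss_trt, df_trt = 251.44, 3
ss_error, df_error = 61.033, 20

# Mean squares: each SS divided by its own df
ms_trt = ss_trt / df_trt
ms_error = ss_error / df_error

# F statistic: treatment mean square over error mean square
f_stat = ms_trt / ms_error

rows = [
    ("Treatment", df_trt, ss_trt, ms_trt, f_stat),
    ("Error", df_error, ss_error, ms_error, None),
    ("Total", df_trt + df_error, ss_trt + ss_error, None, None),
]


def fmt(value, spec):
    """Format a cell, leaving it blank when the table has no entry."""
    return "" if value is None else format(value, spec)


print(f"{'Source':<10}{'df':>4}{'SS':>10}{'MS':>10}{'F':>8}")
for source, df, ss, ms, f in rows:
    print(f"{source:<10}{df:>4}{fmt(ss, '.3f'):>10}"
          f"{fmt(ms, '.3f'):>10}{fmt(f, '.2f'):>8}")
```

Note that the Total row is obtained by adding the Treatment and Error rows, which mirrors the partitioning of both the SS and the df.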
So how do we know if the F is large enough to say we have a significant amount of variability due to our treatment? We look up the critical value of F and compare it to the value we calculated. Specifically, the critical F from a table is \(F_\alpha = F_{(0.05, 3, 20)} = 3.10\).
The table in the text actually indexes this value as \(1 - \alpha = 0.95\).
The \(F_{\text{calculated}}\) > \(F_\alpha\), so we reject \(H_0\) and accept the alternative \(H_A\). The p-value (which we don't typically calculate by hand) is the area under the curve to the right of the \(F_{\text{calculated}}\), and it is how the result is reported in statistical software. Note that in the unlikely event that the \(F_{\text{calculated}}\) is exactly equal to the \(F_{\alpha}\), then the \(\text{p-value} = \alpha\). As the calculated F statistic increases beyond the \(F_{\alpha}\) and we go further into the rejection region, the area under the curve (hence the p-value) gets smaller and smaller. This leads us to the decision rule: if the p-value is \(<\alpha\), then we reject \(H_0\).
If this seems a bit confusing, think about the magnitude of the F needing to be large because we want the numerator to be larger than the denominator. When it is, we conclude that the group effect is large, so we reject the hypothesis that all the group means are the same. Since the p-value represents the probability, when the null hypothesis is true, of getting an \(F_{\text{calculated}}\) at least as large as the one you actually observed, a large F provides evidence that the null is NOT true; hence small p-values (less than 0.05 in this case) lead us to reject the null.
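In practice the p-value comes from statistical software, but the "area to the right" idea can be sketched with the Python standard library alone. The helper `f_upper_tail` below is our own illustrative function, not a library routine. It relies on the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 f), where I_x is the regularized incomplete beta function, and approximates the integral with Simpson's rule. Treat it as a rough numerical sketch that assumes f > 0 and df2 >= 2.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F(df1, df2) distribution.

    Illustrative sketch only: assumes f_value > 0, df2 >= 2, and an
    even number of Simpson steps.
    """
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_value)

    def integrand(t):
        # Kernel of the beta density with parameters (a, b)
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule on [0, x]
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    incomplete_beta = total * h / 3

    # Complete beta function B(a, b), computed through log-gamma
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return incomplete_beta / math.exp(log_beta)


alpha = 0.05
f_calculated = 27.46  # from the ANOVA table
f_critical = 3.10     # tabled value for alpha = 0.05 with 3 and 20 df

p_value = f_upper_tail(f_calculated, 3, 20)
tail_at_critical = f_upper_tail(f_critical, 3, 20)
reject = p_value < alpha

print(f"area to the right of the critical F: {tail_at_critical:.4f}")
print(f"p-value for the calculated F: {p_value:.2e}")
print("Reject H0" if reject else "Fail to reject H0")
```

The area to the right of the critical value comes out close to \(\alpha\), while the area to the right of the much larger calculated F is far smaller, which is exactly the relationship between the two decision rules described above.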