5.1 - Factorial Designs with Two Treatment Factors

For now we will just consider two treatment factors of interest. It looks almost the same as the randomized block design model only now we are including an interaction term:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}\)

where \(i = 1, \dots, a, j = 1, \dots, b, \text{ and } k = 1, \dots, n\). Thus we have two factors in a factorial structure with n observations per cell. As usual, we assume the \(e_{ijk} ∼ N(0, \sigma^2)\), i.e. independently and identically distributed with the normal distribution. Although it looks like a multiplication, the interaction term need not imply multiplicative interaction.

The Effects Model vs. the Means Model Section

The cell means model is written:

\(Y_{ijk}=\mu_{ij} + e_{ijk}\)

Here the cell means are: \(\mu_{11}, \dots , \mu_{1b}, \dots , \mu_{a1} \dots \mu_{ab}\). Therefore we have a × b cell means, μij. We will define our marginal means as the simple average over our cell means as shown below:

\(\bar{\mu}_{i.}=\frac{1}{b} \sum\limits_j \mu_{ij}\), \(\bar{\mu}_{.j}=\frac{1}{a} \sum\limits_i \mu_{ij}\)

From the cell means structure we can talk about marginal means and row and column means. But first we want to look at the effects model and define more carefully what the interactions are We can write the cell means in terms of the full effects model:

\(\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}\)

It follows that the interaction terms \((\alpha \beta)_{ij}\)are defined as the difference between our cell means and the additive portion of the model:

\((\alpha\beta)_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j) \)

If the true model structure is additive then the interaction terms\((\alpha \beta)_{ij}\) are equal to zero. Then we can say that the true cell means, \(\mu_{ij} = (\mu + \alpha_i + \beta_j)\), have additive structure.

Example 1 Section

Let's illustrate this by considering the true means \(\mu_{ij} \colon\)

A     1 2 \(\bar{\mu}_{i.}\) \(\alpha_i\)
  1 5 11 8 -2
  2 9 15 12 2
  \(\bar{\mu}_{.j}\) 7 13 10  
  \(\beta_j\) -3 3    

Note that both a and b are 2, thus our marginal row means are 8 and 12, and our marginal column means are 7 and 13. Next, let's calculate the \(\alpha\) and the \(\beta\) effects; since the overall mean is 10, our \(\alpha\) effects are -2 and 2 (which sum to 0), and our \(\beta\) effects are -3 and 3 (which also sum to 0). If you plot the cell means you get two lines that are parallel.


The difference between the two means at the first \(\beta\) factor level is 9 - 5 = 4. The difference between the means for the second \(\beta\) factor level is 15 - 11 = 4. We can say that the effect of \(\alpha\) at the first level of \(\beta\) is the same as the effect of \(\alpha\) at the second level of \(\beta\). Therefore we say that there is no interaction and as we will see the interaction terms are equal to 0.

This example simply illustrates that the cell means, in this case, have additive structure. A problem with data that we actually look at is that you do not know in advance whether the effects are additive or not. Because of random error, the interaction terms are seldom exactly zero. You may be involved in a situation that is either additive or non-additive, and the first task is to decide between them.

Now consider the non-additive case. We illustrate this with Example 2 which follows.

Example 5.2 Section

This example was constructed so that the marginal means and the overall means are the same as in Example 1. However, it does not have additive structure.


Using the definition of interaction:

\((\alpha \beta)_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j)\)

which gives us \((\alpha \beta)_{ij}\) interaction terms that are -2, 2, 2, -2. Again, by the definition of our interaction effects, these \((\alpha \beta)_{ij}\) terms should sum to zero in both directions.


We generally call the \(\alpha_i\) terms the treatment effects for treatment factor A and the \(\beta_j\) terms for treatment factor B, and the \((\alpha \beta)_{ij}\) terms the interaction effects.

The model we have written gives us a way to represent in a mathematical form a two-factor design, whether we use the means model or the effects model, i.e.,

\(Y_{ijk} = \mu_{ij} + e_{ijk}\)


\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}\)

There is really no benefit to the effects model when there is interaction, except that it gives us a mechanism for partitioning the variation due to the two treatments and their interactions. Both models have the same number of distinct parameters. However, when there is no interaction then we can remove the interaction terms from the model and use the reduced additive model.

Now, we'll take a look at the strategy for deciding whether our model fits, whether the assumptions are satisfied and then decide whether we can go forward with an interaction model or an additive model. This is the first decision. When you can eliminate the interactions because they are not significantly different from zero, then you can use the simpler additive model. This should be the goal whenever possible because then you have fewer parameters to estimate, and a simpler structure to represent the underlying scientific process.

Before we get to the analysis, however, we want to introduce another definition of effects - rather than defining the \(\alpha_i\) effects as deviation from the mean, we can look at the difference between the high and the low levels of factor A. These are two different definitions of effects that will be introduced and discussed in this chapter and the next, the \(\alpha_i\) effects and the difference between the high and low levels, which we will generally denote as the A effect.

Factorial Designs with 2 Treatment Factors, cont'd Section

For a completely randomized design, which is what we discussed for the one-way ANOVA, we need to have n × a × b = N total experimental units available. We randomly assign n of those experimental units to each of the a × b treatment combinations. For the moment we will only consider the model with fixed effects and constant experimental random error.

The model is:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}\)

\(i = 1, \dots , a\)
\(j = 1, \dots , b\)
\(k = 1, \dots , n\)

Read the text section 5.3.2 for the definitions of the means and the sum of squares.

Testing Hypotheses Section

We can test the hypotheses that the marginal means are all equal, or in terms of the definition of our effects that the \(\alpha_i\)'s are all equal to zero, and the hypothesis that the \(\beta_j\)'s are all equal to zero. And, we can test the hypothesis that the interaction effects are all equal to zero. The alternative hypotheses are that at least one of those effects is not equal to zero.

How we do this, in what order, and how do we interpret these tests?

One of the purposes of a factorial design is to be efficient about estimating and testing factors A and B in a single experiment. Often we are primarily interested in the main effects. Sometimes, we are also interested in knowing whether the factors interact. In either case, the first test we should do is the test on the interaction effects.

The Test of H0: \((\alpha\beta)_{ij}=0\) Section

If there is interaction and it is significant, i.e. the p-value is less than your chosen cut off, then what do we do? If the interaction term is significant that tells us that the effect of A is different at each level of B. Or you can say it the other way, the effect of B differs at each level of A. Therefore, when we have significant interaction, it is not very sensible to even be talking about the main effect of A and B, because these change depending on the level of the other factor. If the interaction is significant then we want to estimate and focus our attention on the cell means. If the interaction is not significant, then we can test the main effects and focus on the main effect means.

The estimates of the interaction and main effects are given in the text in section 5.3.4.

Note that the estimates of the marginal means for A are the marginal means:

\(\bar{y}_{i..}=\dfrac{1}{bn} \sum\limits_j \sum\limits_k y_{ijk}\), with \(var(\bar{y}_{i..})=\dfrac{\sigma^2}{bn}\)

A similar formula holds for factor B, with


Just the form of these variances tells us something about the efficiency of the two-factor design. A benefit of a two factor design is that the marginal means have n × b number of replicates for factor A and n × a for factor B. The factorial structure, when you do not have interactions, gives us the efficiency benefit of having additional replication, the number of observations per cell times the number of levels of the other factor. This benefit arises from factorial experiments rather than single factor experiments with n observations per cell. An alternative design choice could have been to do two one-way experiments, one with a treatments and the other with b treatments, with n observations per cell. However, these two experiments would not have provided the same level of precision, nor the ability to test for interactions.

Another practical question: If the interaction test is not significant what should we do?

Do we get remove the interaction term in the model? You might consider dropping that term from the model. If n is very small and your df for error are small, then this may be a critical issue. There is a 'rule of thumb' that I sometimes use in these cases. If the p-value for the interaction test is greater than 0.25 then you can drop the interaction term. This is not an exact cut off but a general rule. Remember, if you drop the interaction term, then a variation accounted for by SSab would become part of the error and increasing the SSE, however your error df would also become larger in some cases enough to increase the power of the tests for the main effects. Statistical theory shows that in general dropping the interaction term increases your false rejection rate for subsequent tests. Hence we usually do not drop nonsignificant terms when there are adequate sample sizes. However, if we are doing an independent experiment with the same factors we might not include interaction in the model for that experiment.

What if n = 1, and we have only 1 observation per cell? If n = 1 then we have 0 df for SSerror and we cannot estimate the error variance with MSE. What should we do in order to test our hypothesis? We obviously cannot perform the test for interaction because we have no error term.

If you are willing to assume, and if it is true that there is no interaction, then you can use the interaction as your F-test denominator for testing the main effects. It is a fairly safe and conservative thing to do. If it is not true then the MSab will tend to be larger than it should be, so the F-test is conservative. You're not likely to reject a main effect if it is not true. You won't make a Type I error, however you could more likely make a Type II error.

Extension to a 3 Factor Model Section

The factorial model with three factors can be written as:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha \beta)_{ij} + (\alpha \gamma)_{ik} + (\beta \gamma)_{jk} + (\alpha \beta \gamma)_{ijk} + e_{ijkl}\)

where \(i = 1, \dots , a, j = 1 , \dots , b, k = 1 , \dots , c, l = 1 , \dots , n\)

We extend the model in the same way. Our analysis of variance has three main effects, three two-way interactions, a three-way interaction and error. If this were conducted as a Completely Randomized Design experiment, each of the a × b × c treatment combinations would be randomly assigned to n of the experimental units.

Sample Size Determination [Section 5.3.5] Section

We first consider the two-factor case where N = a × b × n, (n = the number of replicates per cell). The non-centrality parameter for calculating sample size for the A factor is:

\(\phi^2 = ( nb \times D^{2}) / ( 2a \times \sigma^2)\)

where D is the difference between the maximum of \(\bar{\mu_{i.}}\) and the minimum of \(\bar{\mu_{i.}}\), and where b is the number of observations in each level of factor A.

Actually, at the beginning of our design process, we should decide how many observations we should take, if we want to find a difference of D, between the maximum and the minimum of the true means for the factor A. There is a similar equation for factor B.

\(\phi^{2} = ( na \times D^{2} ) / ( 2b \times \sigma^{2})\)

where na is the number of observations in each level of factor B.

In the two factor case, this is just an extension of what we did in the one-factor case. But now we have the marginal means benefiting from a number of observations per cell and the number of levels of the other factor. In this case, we have n observations per cell, and we have b cells. So, we have nb observations.