In multifactor experiments, combinations of treatments are applied to experimental units. In our greenhouse example we have so far worked with a single factor, fertilizer, and examined differences among the fertilizer types. However, the researcher is also interested in the growth of different species of plant. Species is a second factor, making this a multifactor experiment. But, as those of you with green thumbs know, different fertilizers can be more effective on different species of plants! This brings us to the idea of a factorial design, in which each level of every factor is combined with each level of all other factors. For example, if Factor A has 4 levels and Factor B has 3 levels, then one complete replication of the experiment requires 12 experimental units to accommodate the 12 treatment combinations. As crossed factors are added, the number of experimental units required grows very quickly, so tough decisions have to be made about the number of factors and the number of levels of each.

For a two-factor factorial, say Factor A at two levels and Factor B at two levels, we have 4 treatment combinations: A1B1, A1B2, A2B1, and A2B2, which would be applied to the experimental units. If we had 5 replications, we would have a total of 20 observations and could construct the following ANOVA table.

Note that the d.f. for each factor by itself is still the number of levels of that factor minus 1. The treatment combination (interaction) d.f. are the product of the d.f. for the two factors (which will become more intuitive as we think about what the treatment combination term represents):

| Source | d.f. |
| --- | --- |
| Factor A | \(a - 1 = 1\) |
| Factor B | \(b - 1 = 1\) |
| Factor A × Factor B | \((a - 1)(b - 1) = 1\) |
| Error | \(19 - 3 = 16\) |
| Total | \(N - 1 = 19\) |
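The degrees-of-freedom bookkeeping in the table generalizes to any balanced two-factor factorial. A minimal sketch (the helper function `factorial_df` is mine, not from the text):

```python
# Degrees of freedom for an a x b factorial with n replications.
def factorial_df(a, b, n):
    """Return the ANOVA degrees of freedom for a two-factor factorial."""
    N = a * b * n                      # total observations
    df = {
        "FactorA": a - 1,
        "FactorB": b - 1,
        "AxB": (a - 1) * (b - 1),      # product of the main-effect d.f.
        "Total": N - 1,
    }
    # Error d.f. is what remains after the treatment terms are removed
    df["Error"] = df["Total"] - (df["FactorA"] + df["FactorB"] + df["AxB"])
    return df

print(factorial_df(2, 2, 5))
# matches the table above: A = 1, B = 1, AxB = 1, Error = 16, Total = 19
```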

So what is going on? If we look at our single-factor model,

\(Y_{ij}=\mu+\tau_i+\epsilon_{ij}\)

we are now effectively replacing the \(\tau_i\) with \(\alpha_i+\beta_j+(\alpha\beta)_{ij}\),

where \(\alpha_i\) is the main effect of factor A, \(\beta_j\) is the main effect of factor B, and \( (\alpha \beta)_{ij} \) is an interaction effect.

This reflects the following partitioning of treatment deviations from the grand mean:

\(\underbrace{\bar{Y}_{ij.}-\bar{Y}_{...}}_{\substack{\text{Deviation of estimated treatment} \\ \text{mean around overall mean}}} = \underbrace{\bar{Y}_{i..}-\bar{Y}_{...}}_{\text{A main effect}} + \underbrace{\bar{Y}_{.j.}-\bar{Y}_{...}}_{\text{B main effect}} +\underbrace{\bar{Y}_{ij.}-\bar{Y}_{i..}-\bar{Y}_{.j.}+\bar{Y}_{...}}_{\text{AB interaction effect}}\)
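This partitioning is an algebraic identity, so it holds exactly for any data set. A quick numerical check on simulated data (the layout and random values here are hypothetical, chosen only to verify the identity):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, n = 2, 3, 4                      # hypothetical factorial layout
Y = rng.normal(size=(a, b, n))         # Y[i, j, k]

grand = Y.mean()                       # Y-bar ...
cell = Y.mean(axis=2)                  # Y-bar ij. (treatment means)
abar = Y.mean(axis=(1, 2))             # Y-bar i.. (Factor A level means)
bbar = Y.mean(axis=(0, 2))             # Y-bar .j. (Factor B level means)

lhs = cell - grand                                     # treatment deviation
A_eff = (abar - grand)[:, None]                        # A main effect
B_eff = (bbar - grand)[None, :]                        # B main effect
AB_eff = cell - abar[:, None] - bbar[None, :] + grand  # AB interaction

# The three pieces reassemble the treatment deviation exactly
assert np.allclose(lhs, A_eff + B_eff + AB_eff)
```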

And so we have the following statistical model:

\(Y_{ijk}=\mu_{..}+ \alpha_i+\beta_j+(\alpha\beta)_{ij} +\epsilon_{ijk}\)

The main effects for Factor A and Factor B are straightforward to interpret, but what is an interaction?

An interaction can be defined as **the failure of the response to one factor to be the same at different levels of another factor**. Notice that the interaction term in the model is written in multiplicative notation, \( (\alpha\beta)_{ij} \), and the interaction may have a large and important impact on the response variable. When we include interactions in our model statement in SAS (or in the Model Box in Minitab), we simply use multiplication of the main effects to specify the interaction (now does the d.f. term become more apparent?).

Interactions go by different names in various fields. In medicine, for example, doctors always ask what medication you are on before prescribing a new medication. They do this out of a concern for interaction effects of either *interference* (a canceling effect) or *synergism* (a compounding effect).

Graphically, interactions can be seen as non-parallel lines connecting means when we are working with the simple two-factor factorial with 2 levels of each main effect (adapted from Zar, J. H. *Biostatistical Analysis*, 5th Ed., 1999). Remember, an interaction refers to the failure of the response to one factor to be the same at different levels of another factor. So when the lines are parallel, the response is the same. In the plots below you will see parallel lines as a consistent feature in all of the plots with no interaction. In plots depicting interactions, you will notice that the lines cross (or would cross if the lines kept going).

1. Graph 1 shows no effect of Factor A, a small effect of Factor B (if there were no effect of Factor B, the two lines would coincide), and no interaction between Factor A and Factor B.
2. Graph 2 shows a large effect of Factor A, a small effect of Factor B, and no interaction.
3. Graph 3 shows no effect of Factor A, a larger effect of Factor B, and no interaction.
4. Graph 4 shows a large effect of Factor A, a large effect of Factor B, and no interaction.
5. Graph 5 shows no effect of Factor A and no effect of Factor B, but an interaction between A and B.
6. Graph 6 shows a large effect of Factor A, no effect of Factor B, and a slight interaction.
7. Graph 7 shows no effect of Factor A, a large effect of Factor B, and a very large interaction.
8. Graph 8 shows an effect of Factor A, a large effect of Factor B, and a large interaction.
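For a 2 × 2 layout, "non-parallel lines" has a simple arithmetic expression: the lines are parallel exactly when the difference in the B response is the same at both levels of A. A small sketch with hypothetical cell means (the helper and the numbers are mine, for illustration only):

```python
# Difference-of-differences for a 2 x 2 table of cell means.
# Zero means the two lines in the interaction plot are parallel.
def interaction_contrast(means):
    """means[i][j] = mean response at level i of A, level j of B."""
    (m11, m12), (m21, m22) = means
    return (m11 - m12) - (m21 - m22)

parallel = [[10, 14], [16, 20]]       # same B response at both A levels
crossing = [[10, 20], [20, 10]]       # lines cross: interaction

print(interaction_contrast(parallel))  # 0 -> no interaction
print(interaction_contrast(crossing))  # -20 -> strong interaction
```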

Now, with multiple factors AND interactions, we need to consider multiple hypotheses. Basically, each term in the model can have a corresponding hypothesis. The hypotheses to be tested are:

Main Effect of Factor A:

\(H_0 \colon \mu_{1.} =\mu_{2.}= \cdots =\mu_{a.}\)

\(H_A \colon \text{ not all }\mu_{i.} \text{ are equal }\)

Main Effect of Factor B:

\(H_0 \colon \mu_{.1} =\mu_{.2}= \cdots =\mu_{.b}\)

\(H_A \colon \text{ not all }\mu_{.j} \text{ are equal }\)

A × B Interaction:

\(H_0 \colon \text{ there is no interaction }\)

\(H_A \colon \text{ an interaction exists }\)

When testing these hypotheses, the important thing to remember is that we have to evaluate the **significance of the interaction as our first step** in looking at the output. If the interaction is significant, the main effects cannot be meaningfully interpreted on their own. We will see this clearly as we consider extending our greenhouse example to include a second (crossed) factor.
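The three F-tests can be sketched by hand for a balanced design. Below is a minimal numpy version on simulated data with a built-in crossing pattern (the cell means and noise level are hypothetical, not the greenhouse data), so the interaction F-statistic should dominate, illustrating why we check it first:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, n = 2, 2, 5                      # matches the 2 x 2, 5-rep example
cell_means = np.array([[10.0, 14.0],   # crossing pattern: a true A x B
                       [14.0, 10.0]])  # interaction with no main effects
Y = cell_means[:, :, None] + rng.normal(0, 1, size=(a, b, n))

grand = Y.mean()
abar = Y.mean(axis=(1, 2))             # Factor A level means
bbar = Y.mean(axis=(0, 2))             # Factor B level means
cell = Y.mean(axis=2)                  # treatment-combination means

# Sums of squares from the partitioning of treatment deviations
SS_A = b * n * ((abar - grand) ** 2).sum()
SS_B = a * n * ((bbar - grand) ** 2).sum()
SS_AB = n * ((cell - abar[:, None] - bbar[None, :] + grand) ** 2).sum()
SS_E = ((Y - cell[:, :, None]) ** 2).sum()

# Mean squares use the d.f. from the ANOVA table above
MSE = SS_E / (a * b * (n - 1))
F = {"A": (SS_A / (a - 1)) / MSE,
     "B": (SS_B / (b - 1)) / MSE,
     "AB": (SS_AB / ((a - 1) * (b - 1))) / MSE}

# Inspect F["AB"] first: when it is large (significant), the A and B
# main effects should not be interpreted on their own.
print(F)
```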