5.1 - Factorial or Crossed Treatment Design

In multi-factor experiments, combinations of factor levels are applied to experimental units. To illustrate this idea, consider again the single-factor greenhouse experiment discussed in previous lessons. Suppose there is suspicion that the different fertilizer types may be more effective for certain species of plant. To accommodate this, the experiment can be extended to a multi-factor study by including plant species as an additional factor along with fertilizer type. This makes it possible to assess whether the optimal height growth is attained by a particular combination of fertilizer type and plant species. A treatment design that enables analysis of treatment combinations is a factorial design. Within this design, responses are observed at every combination of the levels of the factors. In this setting the factors are said to be "crossed"; thus the design is also sometimes referred to as a crossed design.

A factorial design with \(t\) factors can be defined using the notation “\(l_1 \times l_2 \times ... \times l_t\)”, where \(l_i\) is the number of levels in the \(i^{th}\) factor for \(i=1,2,...,t\). For example, a factorial design with 2 factors, A and B, where A has 4 levels and B has 3 levels, would be a \(4 \times 3\) factorial design.

One complete replication of a factorial design with \(t\) factors requires \(l_1 \times l_2 \times ... \times l_t\) experimental units and this quantity is called the replicate size. If \(r\) is the number of complete replicates, then the total number of observations \(N\) is equal to \(r \times (l_1 \times l_2 \times... \times l_t)\). It is easy to see that with the addition of more and more crossed factors, the replicate size will increase rapidly and design modifications may have to be made to make the experiment more manageable (discussed more in later lessons).
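As a rough illustration of the replicate-size arithmetic above, the following Python sketch enumerates the treatment combinations of a crossed design and counts the observations. The function names `replicate_size` and `total_observations` are chosen here for illustration only.

```python
from itertools import product
from math import prod


def replicate_size(levels):
    """Number of treatment combinations in one complete replicate."""
    return prod(levels)


def total_observations(levels, r):
    """Total number of observations N for r complete replicates."""
    return r * replicate_size(levels)


# The 4 x 3 design from the text: factor A has 4 levels, factor B has 3.
levels = (4, 3)

# Crossing the factors: every level of A paired with every level of B.
combos = list(product(*(range(1, n + 1) for n in levels)))

print(replicate_size(levels))           # one replicate of the 4 x 3 design
print(total_observations((2, 2), r=5))  # the 2 x 2 design with 5 replicates
print(replicate_size((4, 3, 5, 2)))     # adding factors grows the size quickly
```

The last line shows how quickly the replicate size grows: adding two more factors with 5 and 2 levels takes a 12-unit replicate to 120 units.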

In a factorial experiment it is important to differentiate between the lone (or main) effects of a factor on the response and the combined effects of a group of factors on the response. The main effect of factor A is the effect of A on the response ignoring the effect of all other factors. The main effect of a given factor is equivalent to the factor effect associated with the single-factor experiment using only that particular factor. The combined effect of a specific combination of \(t\) different factors is called the interaction effect (more details later). Typically, the interaction effect of most interest is the two-way interaction effect between only two of the \(t\) possible factors. Two-way interactions are typically denoted by the product of the two letters assigned to the two factors. For example, in a factorial design with 3 factors A, B, C, the two-way interaction effects are denoted \(A\times B\), \(A\times C\), and \(B\times C\) (or just \(AB\), \(AC\), and \(BC\)). Likewise, the three-way interaction effect of these 3 factors is denoted by \(A\times B\times C\).
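The naming convention for interaction effects can be sketched in a few lines of Python. The helper `factorial_effects` is a hypothetical name used only for this illustration; it lists the main effects and every interaction, grouped by order.

```python
from itertools import combinations


def factorial_effects(factors):
    """Group effect labels by order: 1 = main effects, 2 = two-way, and so on."""
    effects = {}
    for order in range(1, len(factors) + 1):
        effects[order] = ["×".join(c) for c in combinations(factors, order)]
    return effects


effects = factorial_effects(["A", "B", "C"])
for order, labels in effects.items():
    print(order, labels)
```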

Let us now examine how the degrees of freedom (df) values of a single-factor ANOVA can be extended to the ANOVA of a two-factor factorial design. Note that the interaction effects are additional terms that need to be included in a multi-factor ANOVA, but the ANOVA rules studied in Lesson 2 for single-factor situations still apply for the main effect of each factor. If the two factors of the design are denoted by A and B with \(a\) and \(b\) as their number of levels respectively, then the df values of the two main effects are \((a-1)\) and \((b-1)\). The df value for the two-way interaction effect is \((a-1)(b-1)\), the product of df values for A and B. The ANOVA table below gives the layout of the df values for a \(2\times 2\) factorial design with 5 complete replications. Note that in this experiment, \(r\) equals 5, and \(N\) is equal to 20.

Source	d.f.
Factor A	(a - 1) = 1
Factor B	(b - 1) = 1
Factor A × Factor B	(a - 1)(b - 1) = 1
Error	ab(r - 1) = 19 - 3 = 16
Total	\(N - 1 = (r a b) - 1 = 19\)
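The df bookkeeping in the table can be reproduced with a short Python sketch. The function name `two_factor_df` is an illustrative choice, and the error df is written as \(ab(r-1)\), which equals the total df minus the three effect df values for a balanced design.

```python
def two_factor_df(a, b, r):
    """Degrees of freedom for a balanced a x b factorial with r replicates."""
    n = r * a * b
    df = {
        "A": a - 1,
        "B": b - 1,
        "A×B": (a - 1) * (b - 1),
        "Error": a * b * (r - 1),
        "Total": n - 1,
    }
    # The component df values must add up to the total df.
    assert df["A"] + df["B"] + df["A×B"] + df["Error"] == df["Total"]
    return df


df_table = two_factor_df(a=2, b=2, r=5)
for source, value in df_table.items():
    print(f"{source}\t{value}")
```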

If in the single-factor model,

\(Y_{ij}=\mu+\tau_i+\epsilon_{ij}\)

\(\tau_i\) is effectively replaced with \(\alpha_i+\beta_j+(\alpha\beta)_{ij}\), then the resulting equation shown below will represent the model equation of a two-factor factorial design.

\(Y_{ijk}=\mu+ \alpha_i+\beta_j+(\alpha\beta)_{ij} +\epsilon_{ijk}\)

where \(\alpha_i\) is the main effect of factor A, \(\beta_j\) is the main effect of factor B, and \((\alpha\beta)_{ij}\) is the interaction effect \((i=1,2,...,a;\ j=1,2,...,b;\ k=1,2,...,r)\).

This reflects the following partitioning of treatment deviations from the grand mean:

\(\underbrace{\bar{Y}_{ij.}-\bar{Y}_{...}}_{\substack{\text{Deviation of estimated treatment mean} \\ \text{ around overall mean}}} = \underbrace{\bar{Y}_{i..}-\bar{Y}_{...}}_{\text{A main effect}} + \underbrace{\bar{Y}_{.j.}-\bar{Y}_{...}}_{\text{B main effect}} +\underbrace{\bar{Y}_{ij.}-\bar{Y}_{i..}-\bar{Y}_{.j.}+\bar{Y}_{...}}_{\text{AB interaction effect}}\)
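To see the partition at work, the sketch below checks it numerically on a small balanced \(2\times 2\) data set with \(r=3\). The numbers are invented for illustration and do not come from the greenhouse experiment; the names `cell`, `row`, `col`, and `partition` are likewise assumptions of this example.

```python
from statistics import mean

# Hypothetical balanced data: data[(i, j)] holds the r = 3 responses
# observed at level i of factor A and level j of factor B.
data = {
    (1, 1): [10.0, 12.0, 11.0],
    (1, 2): [14.0, 15.0, 16.0],
    (2, 1): [20.0, 19.0, 21.0],
    (2, 2): [30.0, 31.0, 29.0],
}
a_levels = sorted({i for i, _ in data})
b_levels = sorted({j for _, j in data})

cell = {ij: mean(ys) for ij, ys in data.items()}                      # Ybar_ij.
grand = mean(cell.values())                                           # Ybar_... (balanced case)
row = {i: mean(cell[i, j] for j in b_levels) for i in a_levels}       # Ybar_i..
col = {j: mean(cell[i, j] for i in a_levels) for j in b_levels}       # Ybar_.j.


def partition(i, j):
    """Return the (A main effect, B main effect, AB interaction) pieces for cell (i, j)."""
    a_eff = row[i] - grand
    b_eff = col[j] - grand
    ab_eff = cell[i, j] - row[i] - col[j] + grand
    return a_eff, b_eff, ab_eff


for ij in data:
    pieces = partition(*ij)
    # The three pieces add up to the treatment deviation from the grand mean.
    print(ij, cell[ij] - grand, pieces, sum(pieces))
```

Because the design is balanced, the grand mean can be taken as the mean of the cell means; with unequal cell sizes this shortcut would not hold.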

The main effects for Factor A and Factor B are straightforward to interpret, but what exactly is an interaction effect? Delving in further, an interaction can be defined as a difference in the response to one factor across the levels of another factor. Notice that \((\alpha\beta)_{ij}\), the interaction term in the model, is a single parameter indexed by both \(i\) and \(j\), not the product of \(\alpha_i\) and \(\beta_j\); it captures the part of the treatment mean that the two main effects do not account for, and it may have a large impact on the response variable. Interactions go by different names in various fields. In medicine, for example, physicians commonly ask about current medications before prescribing a new medication. They do this out of concern for interaction effects of either interference (a canceling effect) or synergism (a compounding effect).

Graphically, in a two-factor factorial design with each factor having 2 levels, the interaction can be represented by two non-parallel lines connecting means (adapted from Zar, H. Biostatistical Analysis, 5th Ed., 1999). This is because the interaction reflects how the difference in response between the two levels of one factor changes across the two levels of the other factor. So, if there is no interaction, this difference in response will be the same at both levels, which results in two parallel lines. Examples of several interaction plots can be seen below. Notice that parallel lines are a consistent feature in all settings with no interaction, whereas in plots depicting interaction the lines cross (or would cross if they were extended).

Graph 1

In graph 1 there is no effect of Factor A, a small effect of Factor B (and if there were no effect of Factor B the two lines would coincide), and no interaction between Factor A and Factor B.

Graph 2

Graph 2 shows a large effect of Factor A, a small effect of Factor B, and no interaction.

Graph 3

No effect of Factor A, larger effect of Factor B, and no interaction in graph 3.

Graph 4

In graph 4 there is a large effect of Factor A, a large effect of Factor B, and no interaction.

Graph 5

Graph 5 shows no effect of Factor A and no effect of Factor B, but an interaction between A and B.

Graph 6

Large effect of Factor A, no effect of Factor B with a slight interaction in graph 6.

Graph 7

No effect of Factor A, a large effect of Factor B, with a very large interaction in graph 7.

Graph 8

A small effect of Factor A, a large effect of Factor B with a large interaction in graph 8.
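The parallel-lines criterion for a \(2\times 2\) design can also be checked numerically as a "difference of differences" of the four cell means. The sketch below uses made-up cell means, not values read from the graphs, and the function name `interaction_contrast` is hypothetical.

```python
def interaction_contrast(means):
    """Difference of differences for a 2 x 2 table of cell means.

    means[(i, j)] is the mean response at level i of Factor A and
    level j of Factor B. A value of zero corresponds to parallel lines.
    """
    b_diff_at_a1 = means[1, 2] - means[1, 1]
    b_diff_at_a2 = means[2, 2] - means[2, 1]
    return b_diff_at_a2 - b_diff_at_a1


# Hypothetical cell means: B raises the response by 4 at both levels of A.
parallel = {(1, 1): 10.0, (1, 2): 14.0, (2, 1): 20.0, (2, 2): 24.0}

# Hypothetical cell means: the effect of B reverses between levels of A.
crossing = {(1, 1): 10.0, (1, 2): 14.0, (2, 1): 14.0, (2, 2): 10.0}

print(interaction_contrast(parallel))  # zero: lines are parallel
print(interaction_contrast(crossing))  # nonzero: lines cross
```

In the second table the row means and column means are all equal to 12, so neither factor shows a main effect even though the interaction is large, which is the pattern described for graph 5.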

In the presence of multiple factors and their interactions, multiple hypotheses can be tested. For a two-factor factorial design, those hypotheses are the following.

Main effect of Factor A:

\(H_0 \colon \alpha_{1} =\alpha_{2}= \cdots =\alpha_{a} = 0\)
\(H_A \colon \text{ not all }\alpha_{i} \text{ are equal to zero}\)

Main effect of Factor B:

\(H_0 \colon \beta_{1} =\beta_{2}= \cdots =\beta_{b} = 0\)
\(H_A \colon \text{ not all }\beta_{j} \text{ are equal to zero}\)

A × B interaction effect:

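\(H_0 \colon (\alpha\beta)_{ij} = 0 \text{ for all } i, j \text{ (there is no interaction)}\)
\(H_A \colon \text{ not all }(\alpha\beta)_{ij} \text{ are equal to zero (an interaction exists)}\)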

When testing these hypotheses, it is important to test for the significance of the interaction effect first. If the interaction is significant, the main effects are of no consequence; rather, the differences among the factor level combinations should be examined. The greenhouse example, extended to include a second (crossed) factor, will illustrate the steps.
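As a preview of the computations behind these tests, here is a minimal sketch of the balanced two-factor ANOVA sums of squares and F statistics in plain Python. The data are invented for illustration and are not the greenhouse data; `two_way_anova` is a hypothetical helper name. P-values are omitted because they require the F distribution, which is not in the Python standard library.

```python
from statistics import mean


def two_way_anova(data):
    """Sums of squares, df, and F statistics for a balanced two-factor design.

    data[(i, j)] is the list of r responses at level i of A and level j of B.
    """
    a_levels = sorted({i for i, _ in data})
    b_levels = sorted({j for _, j in data})
    a, b = len(a_levels), len(b_levels)
    r = len(next(iter(data.values())))  # assumes every cell has r observations

    cell = {ij: mean(ys) for ij, ys in data.items()}
    grand = mean(cell.values())
    row = {i: mean(cell[i, j] for j in b_levels) for i in a_levels}
    col = {j: mean(cell[i, j] for i in a_levels) for j in b_levels}

    ss = {
        "A": r * b * sum((row[i] - grand) ** 2 for i in a_levels),
        "B": r * a * sum((col[j] - grand) ** 2 for j in b_levels),
        "A×B": r * sum(
            (cell[i, j] - row[i] - col[j] + grand) ** 2
            for i in a_levels
            for j in b_levels
        ),
        "Error": sum((y - cell[ij]) ** 2 for ij, ys in data.items() for y in ys),
    }
    df = {"A": a - 1, "B": b - 1, "A×B": (a - 1) * (b - 1), "Error": a * b * (r - 1)}
    ms = {k: ss[k] / df[k] for k in ss}
    # Interaction is listed first because it is the first effect to be tested.
    f_stats = {k: ms[k] / ms["Error"] for k in ("A×B", "A", "B")}
    return ss, df, f_stats


# Hypothetical 2 x 2 data with r = 3 observations per treatment combination.
data = {
    (1, 1): [10.0, 12.0, 11.0],
    (1, 2): [14.0, 15.0, 16.0],
    (2, 1): [20.0, 19.0, 21.0],
    (2, 2): [30.0, 31.0, 29.0],
}
ss, df, f_stats = two_way_anova(data)
for effect, f_value in f_stats.items():
    print(f"{effect}: SS = {ss[effect]:.1f}, df = {df[effect]}, F = {f_value:.1f}")
```

Each F statistic would be compared against an F distribution with the effect df in the numerator and the error df in the denominator, starting with the interaction.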