4.2 - Nested Treatment Design

When setting up a multi-factor study, it is sometimes not possible to cross the factor levels. In other words, because of the logistics of the situation, we may not be able to combine each level of one treatment with each level of another. Our textbook presents a nice example of this, and we can work through it here. A manufacturing company has three training schools located in different cities, and each school employs its own instructors. The company wants to evaluate the effect of the school (Factor A) and the instructor (Factor B) on learning achievement (the response variable), with each instructor teaching two classes. We can set up a table to illustrate the treatment design, using the subscript (i) to identify the schools and the subscript (j) to identify the instructors:

Factor A (School), i    Factor B (Instructor), j
                        1                         2                          Average
Atlanta                 25                        14
                        29                        11
  Average               \(\bar{Y}_{11.}=27\)      \(\bar{Y}_{12.}=12.5\)     \(\bar{Y}_{1..}=19.75\)
Chicago                 11                        22
                        6                         18
  Average               \(\bar{Y}_{21.}=8.5\)     \(\bar{Y}_{22.}=20\)       \(\bar{Y}_{2..}=14.25\)
San Francisco           17                        5
                        20                        2
  Average               \(\bar{Y}_{31.}=18.5\)    \(\bar{Y}_{32.}=3.5\)      \(\bar{Y}_{3..}=11.00\)
Overall average                                                              \(\bar{Y}_{...}=15\)

The table shows the data obtained, the grand mean, the marginal means (the treatment-level means for School), and the cell means. Each cell mean is the average of the two classes taught by one instructor at one school.
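As a quick check on the averages reported above, the following minimal Python sketch recomputes the cell means, the school means, and the grand mean from the raw scores. The variable names (`scores`, `cell_means`, `school_means`, `grand_mean`) are illustrative choices, not part of the textbook example.

```python
from statistics import mean

# Scores copied from the table: school -> instructor label -> two class scores.
scores = {
    "Atlanta": {1: [25, 29], 2: [14, 11]},
    "Chicago": {1: [11, 6], 2: [22, 18]},
    "San Francisco": {1: [17, 20], 2: [5, 2]},
}

# Cell means: average of the two classes for each school/instructor pair.
cell_means = {
    (school, instructor): mean(ys)
    for school, by_instructor in scores.items()
    for instructor, ys in by_instructor.items()
}

# Marginal (school) means: average over all classes taught at that school.
school_means = {
    school: mean([y for ys in by_instructor.values() for y in ys])
    for school, by_instructor in scores.items()
}

# Grand mean: average over all 12 classes.
grand_mean = mean(
    [y for by_instructor in scores.values() for ys in by_instructor.values() for y in ys]
)

print(cell_means)
print(school_means)
print(grand_mean)
```

The printed values should agree with the averages listed in the table.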

I think this example drives home the point that the levels of the second factor (Instructor) can’t practically be crossed with the levels of the School factor. An instructor would likely not be flying around to teach in each of the three schools in different cities. Rather, the instructors are unique to each school. Note that the instructors are identified as 1 or 2 within each school. We have to be careful in this situation because a computer software program won’t ‘know’ that Instructor 1 in Atlanta is not the same instructor as Instructor 1 in Chicago, for example. We use parentheses to indicate the nesting of factor levels clearly, so we write: Instructor(School).
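To illustrate the labeling issue, here is a small Python sketch. It assumes the data arrive in a long format with one record per class; the `records` layout is a hypothetical choice for illustration. Counting the raw instructor labels alone finds two levels, while pairing each label with its school recovers the six distinct instructors.

```python
# Hypothetical long-format records: (school, instructor label within school, score).
records = [
    ("Atlanta", 1, 25), ("Atlanta", 1, 29), ("Atlanta", 2, 14), ("Atlanta", 2, 11),
    ("Chicago", 1, 11), ("Chicago", 1, 6), ("Chicago", 2, 22), ("Chicago", 2, 18),
    ("San Francisco", 1, 17), ("San Francisco", 1, 20),
    ("San Francisco", 2, 5), ("San Francisco", 2, 2),
]

# The raw label alone suggests only two instructors, as if they were crossed with school.
raw_labels = {instructor for _, instructor, _ in records}

# Pairing the label with its school gives one identifier per actual instructor,
# which is what the notation Instructor(School) expresses.
nested_labels = {(school, instructor) for school, instructor, _ in records}

print(len(raw_labels))     # number of distinct raw labels
print(len(nested_labels))  # number of distinct Instructor(School) levels
```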

[Figure: nesting diagram. Schools 1 to 3 (i = 1, 2, 3); Instructors 1 to 6, with j = 1, 2 within each school; Classes 1 to 12, with k = 1, 2 within each instructor.]

We can partition the deviations as before into the following components:

\(\underbrace{Y_{ijk}-\bar{Y}_{...}}_{\text{Total deviation}} = \underbrace{\bar{Y}_{i..}-\bar{Y}_{...}}_{\text{A main effect}} + \underbrace{\bar{Y}_{ij.}-\bar{Y}_{i..}}_{\text{Specific B effect when A is at the } i^{\text{th}} \text{ level}} + \underbrace{Y_{ijk}-\bar{Y}_{ij.}}_{\text{Residual}}\)
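Squaring each component and summing over all observations turns this partition into sums of squares; in a balanced design such as this one, the cross-product terms vanish, so the three component sums of squares add up to the total. The sketch below is a minimal check of that identity on the training-school data. The variable names are illustrative.

```python
# Scores from the example: school -> instructor label -> two class scores.
scores = {
    "Atlanta": {1: [25, 29], 2: [14, 11]},
    "Chicago": {1: [11, 6], 2: [22, 18]},
    "San Francisco": {1: [17, 20], 2: [5, 2]},
}
b = 2  # instructors per school
n = 2  # classes per instructor

all_y = [y for cells in scores.values() for ys in cells.values() for y in ys]
grand = sum(all_y) / len(all_y)

ss_school = ss_instructor = ss_error = 0.0
for cells in scores.values():
    school_y = [y for ys in cells.values() for y in ys]
    school_mean = sum(school_y) / len(school_y)
    # A main effect: school mean minus grand mean, counted once per class.
    ss_school += b * n * (school_mean - grand) ** 2
    for ys in cells.values():
        cell_mean = sum(ys) / len(ys)
        # Nested B effect: cell mean minus its own school's mean.
        ss_instructor += n * (cell_mean - school_mean) ** 2
        # Residual: each class score minus its cell mean.
        ss_error += sum((y - cell_mean) ** 2 for y in ys)

# Total: each class score minus the grand mean.
ss_total = sum((y - grand) ** 2 for y in all_y)

print(ss_school, ss_instructor, ss_error, ss_total)
```

The first three printed values should sum to the fourth.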

Our ANOVA table then will look like this:

Source                d.f.
School                (a - 1) = 2
Instructor(School)    a(b - 1) = 3
Error                 ab(n - 1) = 6
Total                 N - 1 = 11

The statistical model follows as:

\(Y_{ijk}=\mu_{..}+\alpha_{i}+\beta_{j(i)}+\epsilon_{ijk}\)

where:

\(\mu_{..}\) is a constant
\(\alpha_{i}\) are constants subject to the restriction \(\sum\alpha_i=0\)
\(\beta_{j(i)}\) are constants subject to the restriction \(\sum_j\beta_{j(i)}=0\) for all i
\(\epsilon_{ijk}\) are independent N(0, \(\sigma^2\))
i = 1, ... , a; j = 1, ... , b; k = 1, ... , n
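
As a worked illustration (a sketch using the standard least squares estimators under the restrictions above, with a = 3, b = 2, n = 2 and the means from the data table), the parameter estimates for this example are:

\(\hat{\mu}_{..}=\bar{Y}_{...}=15\)
\(\hat{\alpha}_{i}=\bar{Y}_{i..}-\bar{Y}_{...}\), giving \(\hat{\alpha}_{1}=4.75\), \(\hat{\alpha}_{2}=-0.75\), \(\hat{\alpha}_{3}=-4\)
\(\hat{\beta}_{j(i)}=\bar{Y}_{ij.}-\bar{Y}_{i..}\), giving \(\hat{\beta}_{1(1)}=7.25\), \(\hat{\beta}_{2(1)}=-7.25\), \(\hat{\beta}_{1(2)}=-5.75\), \(\hat{\beta}_{2(2)}=5.75\), \(\hat{\beta}_{1(3)}=7.5\), \(\hat{\beta}_{2(3)}=-7.5\)

Note that the \(\hat{\alpha}_{i}\) sum to zero, and the \(\hat{\beta}_{j(i)}\) sum to zero within each school, as the restrictions require.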

We will want to test the following Null Hypotheses:

For Factor A

\(H_0 \colon \mu_{\text{Atlanta}}=\mu_{\text{Chicago}}=\mu_{\text{San Francisco}} \text{ vs. } H_A \colon \text{ Not all equal }\)

For Factor B

When it comes to stating the Null Hypothesis for Factor B, the nested effect, we usually use an alternative notation. To this point we have stated Null Hypotheses in terms of the means (e.g. \(H_0 \colon \mu_{1}=\mu_{2}= ... =\mu_{k}\)), but we can alternatively state a Null Hypothesis in terms of the model parameters for that treatment. For example, for Factor A above we could also state the Null Hypothesis as

\(H_0 \colon \alpha_{\text{Atlanta}}=\alpha_{\text{Chicago}}=\alpha_{\text{San Francisco}}=0\),

or equivalently \(H_0 \colon \text{ all } \alpha_i=0\).

For a nested factor this is the better way to express the Null Hypothesis because we are evaluating the nested factor within the levels of the first factor.

So for the nested factor (Factor B, nested within Schools) we have the Null Hypothesis:

\(H_0 \colon \text{ all }\beta_{j(i)} =0\) vs. \(H_A \colon \text{ not all }\beta_{j(i)} =0\)

The F-tests proceed as usual.
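
To make the tests concrete, the sketch below computes the mean squares and F statistics for the training-school data. It assumes the fixed-effects model stated above, in which both School and Instructor(School) are tested against the error mean square; if instructors were instead treated as a random factor, the denominator for the School test would differ. The function name `nested_anova` and the table layout are illustrative, and the code assumes a balanced design.

```python
scores = {
    "Atlanta": {1: [25, 29], 2: [14, 11]},
    "Chicago": {1: [11, 6], 2: [22, 18]},
    "San Francisco": {1: [17, 20], 2: [5, 2]},
}

def nested_anova(scores):
    """Return {source: (df, SS, MS, F)} for a balanced two-factor nested design.

    Assumes fixed effects, so both F ratios use the error mean square.
    """
    a = len(scores)                              # levels of A (schools)
    first_school = next(iter(scores.values()))
    b = len(first_school)                        # levels of B within each A
    n = len(next(iter(first_school.values())))   # replicates per cell

    all_y = [y for cells in scores.values() for ys in cells.values() for y in ys]
    grand = sum(all_y) / len(all_y)

    ss_a = ss_b = ss_e = 0.0
    for cells in scores.values():
        school_mean = sum(sum(ys) for ys in cells.values()) / (b * n)
        ss_a += b * n * (school_mean - grand) ** 2
        for ys in cells.values():
            cell_mean = sum(ys) / n
            ss_b += n * (cell_mean - school_mean) ** 2
            ss_e += sum((y - cell_mean) ** 2 for y in ys)

    df_a, df_b, df_e = a - 1, a * (b - 1), a * b * (n - 1)
    ms_a, ms_b, ms_e = ss_a / df_a, ss_b / df_b, ss_e / df_e
    return {
        "School": (df_a, ss_a, ms_a, ms_a / ms_e),
        "Instructor(School)": (df_b, ss_b, ms_b, ms_b / ms_e),
        "Error": (df_e, ss_e, ms_e, None),
    }

table = nested_anova(scores)
for source, (df, ss, ms, f) in table.items():
    f_text = "" if f is None else f"{f:.2f}"
    print(f"{source:<20}{df:>4}{ss:>10.2f}{ms:>10.2f}{f_text:>8}")
```

Each F statistic would then be compared with an F distribution on its own numerator degrees of freedom and the 6 error degrees of freedom.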