4.3 - The Latin Square Design

Latin square designs are probably not used as much as they should be, given how efficient they are. Latin square designs allow for two blocking factors. In other words, these designs are used to simultaneously control (or eliminate) two sources of nuisance variability. For instance, the fertility of a plot of land might change in both directions, North-South and East-West, due to soil or moisture gradients, so both rows and columns can be used as blocking factors. Latin squares are not limited to agricultural settings, however. As we shall see, they can be used as widely as the RCBD in industrial experimentation as well as in other experiments.

Whenever you have two blocking factors, a Latin square design allows you to remove the variation due to both sources from the error variation. Returning to the plot of land, we might block it in rows and columns, i.e., each row is a level of the row factor and each column is a level of the column factor. We can remove the variation from our measured response in both directions if we include both rows and columns as factors in our design.

The Latin Square Design gets its name from the fact that we can write it as a square with Latin letters corresponding to the treatments. The treatment factor levels are the Latin letters in the Latin square design. The number of rows and the number of columns must both equal the number of treatment levels. So, if we have four treatments, then we need four rows and four columns in order to create a Latin square. This gives us a design in which each treatment appears in each row and in each column.

[Figure: a 4×4 Latin square with labeled rows and columns and treatments A, B, C and D. Each treatment occurs in every column and row.]

This is just one of many 4×4 squares that you could create. In fact, you can make a square of any size, for any number of treatments, provided it has the following property: each treatment occurs exactly once in each row and exactly once in each column.
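The defining property can be checked mechanically. The following is a minimal sketch, not part of the original notes: the function names `cyclic_latin_square` and `is_latin_square` are hypothetical, and the cyclic-shift construction is just one convenient way to build a valid square of any size.

```python
def cyclic_latin_square(t):
    """Build a t x t Latin square by cyclically shifting the letters A, B, C, ..."""
    letters = [chr(ord("A") + k) for k in range(t)]
    return [[letters[(i + j) % t] for j in range(t)] for i in range(t)]


def is_latin_square(square):
    """Return True if every symbol occurs exactly once in each row and each column."""
    t = len(square)
    symbols = set(square[0])
    if len(symbols) != t:
        return False
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(t)} == symbols for j in range(t))
    return rows_ok and cols_ok


square = cyclic_latin_square(4)
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```

For t = 4 this prints the rows A B C D, B C D A, C D A B and D A B C, followed by True.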

Consider another example in an industrial setting: the rows are the batches of raw material, the columns are the operators of the equipment, and the treatments (A, B, C and D) are industrial processes or protocols for producing a particular product.

What is the model? We let:

\(y_{ijk} = \mu + \rho_i + \beta_j + \tau_k + e_{ijk}\)

i = 1, ... , t
j = 1, ... , t
k = 1, ... , t, where k = d(i, j)

The total number of observations is \(N = t^2\) (the number of rows times the number of columns), where t is the number of treatments.

Note that a Latin Square is an incomplete design, which means that it does not include observations for all possible combinations of i, j and k: only \(t^2\) of the \(t^3\) combinations are observed. This is why we use the notation \(k = d(i, j)\). Once we know the row and column of the design, the treatment is specified. In other words, if we know i and j, then k is determined by the Latin Square design.

This property has an impact on how we calculate means and sums of squares. For this reason, we cannot use the balanced ANOVA command in Minitab, even though the design looks perfectly balanced. As we will see later, the design has the property of orthogonality, but it is not complete.

An assumption that we make when using a Latin square design is that the three factors (treatments, and two nuisance factors) do not interact. If this assumption is violated, the Latin Square design error term will be inflated.

The randomization procedure for assigning treatments in a Latin Square is somewhat restricted, because the structure of the square must be preserved. The ideal randomization would be to select a square at random from the set of all possible Latin squares of the specified size. However, a more practical randomization scheme is to select a standardized Latin square at random (these are tabulated) and then:

  1. randomly permute the columns,
  2. randomly permute the rows, and then
  3. assign the treatments to the Latin letters in a random fashion.
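The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the function name `randomize_latin_square` and the treatment labels are hypothetical, and a single fixed standardized square is used instead of drawing one at random from a table.

```python
import random

# Assumption: one fixed 4 x 4 standardized square (first row and first column
# in alphabetical order). In practice it would be drawn at random from a table.
STANDARD_4X4 = ["ABCD", "BCDA", "CDAB", "DABC"]


def randomize_latin_square(standard, treatments, rng):
    """Permute columns and rows, then assign treatments to letters at random."""
    t = len(standard)
    col_order = rng.sample(range(t), t)   # step 1: random column permutation
    row_order = rng.sample(range(t), t)   # step 2: random row permutation
    letters = sorted({letter for row in standard for letter in row})
    shuffled = rng.sample(list(treatments), t)
    assignment = dict(zip(letters, shuffled))  # step 3: letters -> treatments
    return [[assignment[standard[i][j]] for j in col_order] for i in row_order]


rng = random.Random(2024)  # seeded only so that the layout is reproducible
# Hypothetical treatment labels, used purely for illustration
layout = randomize_latin_square(STANDARD_4X4, ["P1", "P2", "P3", "P4"], rng)
for row in layout:
    print(" ".join(row))
```

Permuting whole rows and whole columns, and relabeling the letters, cannot break the Latin square property, so the randomized layout still has each treatment exactly once in every row and column.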

Consider a factory setting where you are producing a product with 4 operators and 4 machines. We call the columns the operators and the rows the machines. Then you can randomly assign the specific operators to the columns and the specific machines to the rows. The treatment is one of four protocols for producing the product, and our interest is in the average time needed to produce the product under each protocol. If both the machine and the operator have an effect on the time to produce, then by using a Latin Square Design the variation due to machines and operators will be effectively removed from the error variation.

The following table gives the degrees of freedom for the terms in the model.

AOV           df            df for the example
Rows          t-1           3
Cols          t-1           3
Treatments    t-1           3
Error         (t-1)(t-2)    6
Total         t^2 - 1       15
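
As a quick check, the degrees of freedom for rows, columns, treatments and error add up to the total:

\(3(t-1) + (t-1)(t-2) = (t-1)(t+1) = t^2 - 1\)

For the example with t = 4 this gives 3 + 3 + 3 + 6 = 15.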

A Latin Square design is actually easy to analyze. Because of the restricted layout, one observation per treatment in each row and column, the model is orthogonal.

If the row, \(\rho_i\), and column, \(\beta_j\), effects are random with expectations zero, the expected value of \(Y_{ijk}\) is \(\mu + \tau_k\). In other words, the treatment effects and treatment means are orthogonal to the row and column effects. We can also write the sums of squares, as seen in Table 4.10 in the text.
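
For reference, here is a sketch of the standard sums of squares partition, written in the notation of the model above. The dot-subscript totals are notation introduced here for illustration: \(y_{i..}\), \(y_{.j.}\) and \(y_{..k}\) denote the row, column and treatment totals, and \(y_{...}\) denotes the grand total.

\(SS_{Total} = SS_{Rows} + SS_{Cols} + SS_{Treatments} + SS_{Error}\)

\(SS_{Total} = \sum_{i=1}^{t}\sum_{j=1}^{t} y_{ijk}^2 - \dfrac{y_{...}^2}{N}\)

\(SS_{Rows} = \dfrac{1}{t}\sum_{i=1}^{t} y_{i..}^2 - \dfrac{y_{...}^2}{N}\)

\(SS_{Cols} = \dfrac{1}{t}\sum_{j=1}^{t} y_{.j.}^2 - \dfrac{y_{...}^2}{N}\)

\(SS_{Treatments} = \dfrac{1}{t}\sum_{k=1}^{t} y_{..k}^2 - \dfrac{y_{...}^2}{N}\)

with \(SS_{Error}\) obtained by subtraction.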

We can test for row and column effects, but our focus of interest in a Latin square design is on the treatments. Just as in RCBD, the row and column factors are included to reduce the error variation but are not typically of interest. And, depending on how we've conducted the experiment they often haven't been randomized in a way that allows us to make any reliable inference from those tests.

Note: if you have missing data then you need to use the general linear model and test the effect of treatment after fitting the model that would account for the row and column effects.

In general, the General Linear Model tests the hypothesis that:

\(H_0 \colon \tau_k = 0\) for all k vs. \(H_A \colon \tau_k \ne 0\) for at least one k

To test this hypothesis we will look at the F-ratio which is written as:

\(F=\dfrac{MS(\tau_k|\mu,\rho_i,\beta_j)}{MSE(\mu,\rho_i,\beta_j,\tau_k)}\sim F((t-1),(t-1)(t-2))\)

To get this in Minitab you would use GLM and fit the three terms: rows, columns and treatments. The F statistic is based on the adjusted MS for treatment.

The Rocket Propellant Problem – A Latin Square Design

Batches of                          Operators
Raw Material    1         2         3         4         5
1               A = 24    B = 20    C = 19    D = 24    E = 24
2               B = 17    C = 24    D = 30    E = 27    A = 36
3               C = 18    D = 38    E = 26    A = 27    B = 21
4               D = 26    E = 31    A = 26    B = 23    C = 22
5               E = 22    A = 30    B = 20    C = 29    D = 31
Latin Square Design for the Rocket Propellant

Statistical Analysis of the Latin Square Design

The statistical (effects) model is:

\(Y_{ijk}=\mu +\rho_i+\beta_j+\tau_k+\varepsilon_{ijk}
\left\{\begin{array}{c}
i=1,2,\ldots,p \\
j=1,2,\ldots,p\\
k=1,2,\ldots,p
\end{array}\right. \)

Here \(k = d(i, j)\) again indicates that the treatment k in cell (i, j) is determined by the design layout, and p = t is the number of treatment levels.

The statistical analysis (ANOVA) is much like the analysis for the RCBD.

The analysis for the rocket propellant example is presented in Example 4.3.
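
As an illustration of the computations involved, the following sketch works out the sums of squares and the treatment F-ratio for the rocket propellant data by hand in plain Python. It assumes complete data and uses the usual totals-based formulas. It is not the Minitab GLM procedure, and the variable names are hypothetical.

```python
# Rocket propellant data: rows are batches of raw material, columns are operators.
design = ["ABCDE", "BCDEA", "CDEAB", "DEABC", "EABCD"]  # treatment k = d(i, j)
y = [
    [24, 20, 19, 24, 24],
    [17, 24, 30, 27, 36],
    [18, 38, 26, 27, 21],
    [26, 31, 26, 23, 22],
    [22, 30, 20, 29, 31],
]

p = len(y)                      # number of treatments, rows and columns
N = p * p                       # total number of observations
grand_total = sum(sum(row) for row in y)
correction = grand_total ** 2 / N

# Totals for each row, column and treatment
row_totals = [sum(row) for row in y]
col_totals = [sum(y[i][j] for i in range(p)) for j in range(p)]
trt_totals = {}
for i in range(p):
    for j in range(p):
        k = design[i][j]
        trt_totals[k] = trt_totals.get(k, 0) + y[i][j]

# Sums of squares; the error SS is obtained by subtraction
ss_total = sum(v * v for row in y for v in row) - correction
ss_rows = sum(v * v for v in row_totals) / p - correction
ss_cols = sum(v * v for v in col_totals) / p - correction
ss_trt = sum(v * v for v in trt_totals.values()) / p - correction
ss_error = ss_total - ss_rows - ss_cols - ss_trt

df_trt = p - 1
df_error = (p - 1) * (p - 2)
f_ratio = (ss_trt / df_trt) / (ss_error / df_error)

print("SS batches   :", ss_rows)
print("SS operators :", ss_cols)
print("SS treatments:", ss_trt)
print("SS error     :", ss_error)
print("SS total     :", ss_total)
print("F for treatments: %.2f on %d and %d df" % (f_ratio, df_trt, df_error))
```

For these data the sums of squares are 68 for batches, 150 for operators, 330 for treatments, 128 for error and 676 in total, giving an F-ratio of about 7.73 on 4 and 12 degrees of freedom.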