6.1 - The Simplest Case

The simplest case is \(2^k\) where \(k = 2\). We will introduce a notation known as Yates notation. We refer to the factors by the letters A, B, C, D, and so on, which serve as arbitrary labels. In the chemical process example, A is the concentration of the reactant and B is the amount of catalyst, both of which are quantitative. The yield of the process is our response variable.

Since there are two levels of each of two factors, \(2^k\) equals four. Therefore, there are four treatment combinations and the data are given below:

You can see that we have 3 observations at each of \(4 = 2^k\) combinations for \(k = 2\). So we have \(n = 3\) replicates.

A	B	Yates Notation
-	-	(1)
+	-	a
-	+	b
+	+	ab

The table above gives the data with the factors coded for each of the four combinations and below is a plot of the region of experimentation in two dimensions for this case.

The Yates notation used for denoting the factor combinations is as follows:

We use "(1)" to denote that both factors are at the low level, "a" for when A is at its high level and B is at its low level, "b" for when B is at its high level and A is at its low level, and "ab" when both A and B factors are at their high level.

Yates notation indicates that a factor is at its high level simply by including the lowercase letter of that factor. The notation serves two purposes. One is to denote the total of the observations at that treatment combination. In this example, \(b = 60\) is the sum of the three observations at treatment combination b.

The other purpose is as a shortcut: the presence or absence of each lowercase letter shows which level each of our k factors is at.
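The labeling rule can be sketched in a few lines of Python. This is only an illustration: the helper names `yates_label` and `standard_order` are hypothetical, and the levels are assumed to be coded as -1 (low) and +1 (high).

```python
from itertools import product

def yates_label(levels, factors="abcdefgh"):
    """Return the Yates label for one treatment combination.

    `levels` holds one -1/+1 value per factor, in the order A, B, C, ...
    A factor's lowercase letter appears only when that factor is high.
    """
    label = "".join(f for f, lv in zip(factors, levels) if lv == +1)
    return label or "(1)"  # all factors low

def standard_order(k):
    """List all 2^k runs with factor A changing fastest."""
    # product() varies its last position fastest, so reverse each tuple.
    return [tuple(reversed(run)) for run in product((-1, +1), repeat=k)]

labels = [yates_label(run) for run in standard_order(2)]
print(labels)  # ['(1)', 'a', 'b', 'ab']
```

Calling `standard_order(3)` and labeling each run in the same way gives the eight labels of a three-factor design.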

We will also connect this to our previous notation for the two-factor treatment design:

\(Y_{ijk} = \mu + \alpha_{i} + \beta_{j} + (\alpha \beta)_{ij} + e_{ijk}\)

What is the primary goal of these screening experiments?

The goal is to decide which of these factors are important. After determining which factors are important, we will typically plan a second experiment whose goal is to decide which levels of those factors give the optimal response. Thus the screening \(2^k\) experiment is generally the first stage of an experimental sequence. In the second stage, one looks for a response surface, or runs an experiment, to find the optimal level of the important factors.

Estimation of Factor Effects (in the Yates tradition)

The definition of an effect in the \(2^k\) context is the difference in the means between the high and the low level of a factor. In this notation, the effect A is the average of the observations at the high level of A minus the average of the observations at the low level of A.

Therefore, \(A=\bar{y}_{A^+}-\bar{y}_{A^-}\), in the example above:

\(A = 190/6 - 140/6 = 50/6 = 8.33\)

Similarly, \(B=\bar{y}_{B^+}-\bar{y}_{B^-}\) is defined the same way, only comparing the levels of B instead of A. In our example:

\(B = 150/6 - 180/6 = 25 - 30 = -5\)

and finally, \(AB=\dfrac{ab+(1)}{2n}-\dfrac{a+b}{2n}\)

\(AB = [(90 + 80)/6 - (100 + 60)/6] = 10/6 = 1.67\)
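The three calculations above can be checked with a short script. The treatment totals used here, (1) = 80, a = 100, b = 60, and ab = 90, are the ones implied by the arithmetic in this section, with \(n = 3\) replicates. The variable names are illustrative only.

```python
# Treatment totals implied by the worked arithmetic (n = 3 replicates each).
totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}
n = 3

# Each marginal mean averages 2n observations (two treatment combinations).
A = (totals["a"] + totals["ab"]) / (2 * n) - (totals["(1)"] + totals["b"]) / (2 * n)
B = (totals["b"] + totals["ab"]) / (2 * n) - (totals["(1)"] + totals["a"]) / (2 * n)
AB = (totals["ab"] + totals["(1)"]) / (2 * n) - (totals["a"] + totals["b"]) / (2 * n)

print(round(A, 2), round(B, 2), round(AB, 2))  # 8.33 -5.0 1.67
```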

Therefore, in the Yates notation we define an effect as the difference in the means between the high and the low levels of a factor, whereas in previous models we defined an effect as a coefficient of the model, which is the difference between a marginal mean and the overall mean. To restate this in terms of A: the A effect is the mean at the high level of A minus the mean at the low level of A, whereas the coefficient \(\alpha_i\) in the model is the difference between the marginal mean at level i and the overall mean. So the Yates "effect" is twice the size of the estimated coefficient \(\alpha_i\) in the model, which is also usually called the effect of factor A.
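A small numerical check of the factor of two, again using the treatment totals implied by the arithmetic above ((1) = 80, a = 100, b = 60, ab = 90, with \(n = 3\)). The variable names are illustrative, and this is a sketch of the comparison rather than a model fit.

```python
totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}
n = 3

grand_mean = sum(totals.values()) / (4 * n)               # 330 / 12 = 27.5
mean_A_high = (totals["a"] + totals["ab"]) / (2 * n)      # 190 / 6
mean_A_low = (totals["(1)"] + totals["b"]) / (2 * n)      # 140 / 6

# Model coefficients: marginal mean minus overall mean.
alpha_high = mean_A_high - grand_mean
alpha_low = mean_A_low - grand_mean

# Yates effect: high-level mean minus low-level mean.
yates_A = mean_A_high - mean_A_low

print(round(alpha_high, 3), round(alpha_low, 3), round(yates_A, 3))
```

The two coefficients are equal in size and opposite in sign, so their difference, the Yates effect, is twice the coefficient at the high level.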

The confusion is all in the notation used in the definition.

Let's look at another example in order to reinforce your understanding of the notation for these types of designs. Here is an example in three dimensions, with factors A, B and C. Below is a figure of the factors and levels as well as the table representing this experimental space.

Figure 6-4 The \(2^3\) factorial design

Factor
Run	A	B	C
1	-	-	-
2	+	-	-
3	-	+	-
4	+	+	-
5	-	-	+
6	+	-	+
7	-	+	+
8	+	+	+
(b) The design matrix

In the table you can see the eight points coded by the factor levels +1 and -1. This example has two replicates so n = 2. Notice that the Yates notation is included as the total of the two replicates.

One nice feature of the Yates notation is that every column has an equal number of pluses and minuses so these columns are contrasts of the observations. For instance, take a look at the A column. This column has four pluses and four minuses, therefore, the A effect is a contrast.

This is the principle that gives us all sorts of useful characterizations in these \(2^k\) designs.

In the example above, A, B, and C are each defined by a contrast of the treatment totals. You can then define the contrast AB as the elementwise product of the A and B contrast coefficients, the contrast AC as the product of the A and C coefficients, and so forth.

Therefore all the two-way and three-way interaction effects are defined by these contrasts. The product of any two distinct contrast columns gives another contrast column in the matrix.
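As an illustration, the following sketch builds the \(2^3\) design matrix with A changing fastest, forms every interaction column as a product of main-effect columns, and checks the two properties just described. The dictionary `columns` and the other names are hypothetical choices for this example.

```python
from itertools import combinations, product
from math import prod

k = 3
factors = "ABC"

# Runs with A changing fastest, coded -1 (low) and +1 (high).
runs = [tuple(reversed(r)) for r in product((-1, 1), repeat=k)]

# One coefficient column per main effect and per interaction.
columns = {}
for size in range(1, k + 1):
    for combo in combinations(range(k), size):
        name = "".join(factors[i] for i in combo)
        columns[name] = [prod(run[i] for i in combo) for run in runs]

# Every column has as many pluses as minuses, so it is a contrast.
assert all(sum(col) == 0 for col in columns.values())

# The product of two distinct columns is another column: AB x AC = BC.
ab_times_ac = [x * y for x, y in zip(columns["AB"], columns["AC"])]
assert ab_times_ac == columns["BC"]

print(columns["AB"])  # [1, -1, -1, 1, 1, -1, -1, 1]
```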

From these contrasts we can define the effect of A, B, and C, using these coefficients. The general form of an effect for k factors is:

\(\text{Effect} = \dfrac{1}{2^{k-1}n}\,[\text{contrast of the totals}]\)

The sum of the products of the contrast coefficients times the totals will give us an estimate of the effects.
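The general formula can be sketched as a pair of small functions that work directly from totals keyed by Yates labels. The function names `contrast` and `effect` are hypothetical. The sign attached to each total is the product, over the factors in the effect, of +1 if that factor's letter appears in the label and -1 otherwise.

```python
def contrast(totals, term):
    """Contrast of the treatment totals for an effect such as 'A' or 'AB'."""
    value = 0
    for label, total in totals.items():
        present = set() if label == "(1)" else set(label)
        sign = 1
        for factor in term.lower():
            sign *= 1 if factor in present else -1
        value += sign * total
    return value

def effect(totals, term, k, n):
    """Effect = contrast of the totals / (2^(k-1) * n)."""
    return contrast(totals, term) / (2 ** (k - 1) * n)

# The 2^2 example from earlier in this section (n = 3 replicates).
totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}
for term in ("A", "B", "AB"):
    print(term, contrast(totals, term), round(effect(totals, term, k=2, n=3), 2))
```

For the \(2^2\) example this reproduces the contrasts 50, -30, and 10, and the effects 8.33, -5, and 1.67 computed earlier.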

We can also write the variance of the effect using the general form used previously. This would be:

\(\begin{eqnarray}
\text{Var}(\text{Effect}) &=& \dfrac{1}{(2^{k-1}n)^2}\, V(\text{contrast}) \nonumber\\
&=& \dfrac{1}{(2^{k-1}n)^2}\, 2^k n \sigma^2 \nonumber\\
&=& \dfrac{\sigma^2}{2^{k-2}n} \nonumber
\end{eqnarray}\)
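The middle step uses \(V(\text{contrast}) = 2^k n \sigma^2\). Assuming independent observations with common variance \(\sigma^2\), each treatment total is a sum of n observations and so has variance \(n\sigma^2\), and the contrast adds or subtracts \(2^k\) such totals with coefficients of ±1. The sketch below checks the algebra with exact fractions. The function name is hypothetical.

```python
from fractions import Fraction

def effect_variance_factor(k, n):
    """Return c such that Var(effect) = c * sigma^2."""
    # Var(contrast) / sigma^2: 2^k totals, each with variance n * sigma^2.
    contrast_var = 2 ** k * n
    # The effect divides the contrast by 2^(k-1) * n, so square that divisor.
    return Fraction(contrast_var, (2 ** (k - 1) * n) ** 2)

for k, n in [(2, 3), (3, 2)]:
    simplified = Fraction(1, 1) / (Fraction(2) ** (k - 2) * n)  # 1 / (2^(k-2) n)
    print(k, n, effect_variance_factor(k, n), simplified)
```

For \(k = 2\), \(n = 3\) the factor is 1/3, and for \(k = 3\), \(n = 2\) it is 1/4, in agreement with the simplified form.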

Also, we can write the sum of squares for the effects which looks like:

\(SS(\text{effect}) = \dfrac{(\text{contrast})^2}{2^{k}n}\)
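Applied to the \(2^2\) example, the contrasts of the totals are 50 for A, -30 for B, and 10 for AB (these follow from the totals implied by the earlier arithmetic). A minimal sketch, with a hypothetical function name:

```python
def sum_of_squares(contrast_value, k, n):
    """SS(effect) = contrast^2 / (2^k * n)."""
    return contrast_value ** 2 / (2 ** k * n)

# Contrasts of the totals in the 2^2 example with n = 3 replicates.
contrasts = {"A": 50, "B": -30, "AB": 10}
ss = {term: sum_of_squares(c, k=2, n=3) for term, c in contrasts.items()}

for term, value in ss.items():
    print(term, round(value, 2))  # A 208.33, B 75.0, AB 8.33
```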

To summarize what we have learned in this lesson thus far: we can write a contrast of the totals that defines an effect, we can estimate the variance of this effect, and we can write the sum of squares for the effect. Yates notation lets us do all of this very simply, which historically has been its value.