8.2 - Analyzing a Fractional Factorial Design

We have discussed how to design these experiments; now let's discuss how to analyze them. We return to an example we saw before. The response Y is filtration rate in a chemical pilot plant, and the four factors are A = temperature, B = pressure, C = concentration, and D = stirring rate. (Example 2 from Chapter 6, Ex6-2.mwx | Ex6-2.csv)

This experimental design has 16 observations, a \(2^4\) with one complete replicate. This is the example we looked at with one observation per cell when we introduced a normal scores plot.

plot

Our final model ended up with three factors, A, C and D, and two of their interactions, AC and AD. This was based on one complete replicate of this design. What might we have learned if we had done an experiment half this size, N = 8? Consider the one-half fraction of this design with generator D = ABC, or equivalently with defining relation I = ABCD. This fraction has 8 observations.
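The construction of this half fraction can be sketched in a few lines of Python. This is only an illustration of the idea, not Minitab's procedure: write down the full \(2^3\) design in A, B and C in coded units (-1, +1) and set D equal to the product ABC. The function name `half_fraction` is our own.

```python
from itertools import product


def half_fraction():
    """Return the 8 runs of the 2^(4-1) design with generator D = ABC.

    Each run is a tuple (A, B, C, D) in coded units (-1 or +1).
    """
    runs = []
    # Full 2^3 factorial in the base factors A, B, C (A varies fastest).
    for c, b, a in product((-1, 1), repeat=3):
        d = a * b * c  # generator D = ABC
        runs.append((a, b, c, d))
    return runs


for run in half_fraction():
    print(run)
```

Every run produced this way satisfies ABCD = +1, which is the defining relation I = ABCD written in coded units.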

Fractional Factorial Design

Factors: 4 Base Design: 4, 8 Resolution: IV
Runs: 8 Replicates: 1 Fraction: 1/2
Blocks: 1 Center pts (total): 0  

Design Generators: D = ABC

Alias Structure

I + ABCD

A + BCD

B + ACD

C + ABD

D + ABC

AB + CD

AC + BD

AD + BC

The defining relation I = ABCD contains a single four-letter word, therefore this is a Resolution IV design. A, B, C and D are each aliased with a 3-way interaction (so we can no longer estimate the main effects separately from those interactions), and the two-way interactions are aliased with each other in pairs.
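The alias of any effect is found by multiplying it by the defining word ABCD and cancelling every squared letter. A minimal Python sketch of that rule follows; the helper name `alias` is hypothetical, and the letter-set trick assumes a single defining word.

```python
def alias(effect, word="ABCD"):
    """Multiply an effect by the defining word; squared letters cancel."""
    # Symmetric difference keeps the letters that appear exactly once,
    # which is the product with exponents reduced mod 2.
    letters = set(effect) ^ set(word)
    return "".join(sorted(letters)) or "I"


for effect in ["A", "B", "C", "D", "AB", "AC", "AD"]:
    print(f"{effect} + {alias(effect)}")
```

Run as written, this lists the same seven alias pairs that Minitab reports above, for example A + BCD and AB + CD.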

Suppose we analyze this 1/2 fractional factorial design with all of the terms in the model (of course, some of these are aliased with each other) and look at the normal scores plot. What do we get? (The data are in Ex6_2Half.MTW)

plot

We only get seven effects plotted, since there were eight observations and one degree of freedom goes to the overall mean, which does not show up here. The points are labeled, but because these seven effects use up all of the remaining degrees of freedom, there is no estimate of error. Let's look at another plot that we haven't used much yet, the Pareto plot. This type of plot orders the effects from largest to smallest, showing you their relative sizes. Although we do not know what is significant and what is not, this can still be a helpful plot for understanding the data.

Pareto plot

This Pareto plot shows us that the three main effects A, C, and D that were most significant in the full design are still important, as are the two interactions AD and AC. However, B and AB are clearly not as large. (You can do this in Minitab using Stat > DOE > Factorial > Analyze and clicking on Graph.)
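The ordering behind a Pareto plot can be reproduced by hand. The sketch below computes each effect as the mean response at the high level of its contrast minus the mean at the low level, then sorts by absolute size. The eight responses are an assumption: they are the values commonly published for this half fraction of the filtration-rate example, and they should be checked against Ex6_2Half.MTW. Each two-way label also stands for its alias (AB for AB + CD, and so on).

```python
from math import prod

# Runs of the half fraction, (A, B, C, D) in coded units, with responses.
# ASSUMPTION: responses are the commonly published values for this example;
# verify them against Ex6_2Half.MTW before relying on them.
runs = [
    ((-1, -1, -1, -1), 45),
    ((1, -1, -1, 1), 100),
    ((-1, 1, -1, 1), 45),
    ((1, 1, -1, -1), 65),
    ((-1, -1, 1, 1), 75),
    ((1, -1, 1, -1), 60),
    ((-1, 1, 1, -1), 80),
    ((1, 1, 1, 1), 96),
]


def effect(term):
    """Effect = mean(y where contrast is +1) - mean(y where contrast is -1)."""
    idx = ["ABCD".index(ch) for ch in term]
    contrast = sum(prod(x[i] for i in idx) * y for x, y in runs)
    return contrast / (len(runs) / 2)


terms = ["A", "B", "C", "D", "AB", "AC", "AD"]
effects = {t: effect(t) for t in terms}

# Pareto ordering: largest absolute effect first.
ordered = sorted(terms, key=lambda t: abs(effects[t]), reverse=True)
for t in ordered:
    print(f"{t:>2} {effects[t]:7.2f}")
```

With these responses, B and AB land at the bottom of the ordering, well below the other five effects.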

What can we learn from this? Let's try to fit a reduced model from the information that we gleaned from this first step. We will include all the main effects and the AC and AD interactions.

In the analysis, we have four main effects ...

Factorial Fit: Y-Rate versus A, B, C, D

Estimated Effects and Coefficients for Y-Rate (coded units)

Term       Effect     Coef  SE Coef       T      P
Constant            70.750   0.5000  141.50  0.004
A          19.000    9.500   0.5000   19.00  0.033
B           1.500    0.750   0.5000    1.50  0.374
C          14.000    7.000   0.5000   14.00  0.045
D          16.500    8.250   0.5000   16.50  0.039
A*C       -18.500   -9.250   0.5000  -18.50  0.034
A*D        19.000    9.500   0.5000   19.00  0.033

S = 1.41421   R-Sq = 99.93%   R-Sq(adj) = 99.54%

Analysis of Variance for Y-Rate (coded units)

Source              DF   Seq SS   Adj SS   Adj MS       F      P
Main Effects         4  1663.00  1663.00  415.750  207.88  0.052
2-Way Interactions   2  1406.50  1406.50  703.250  351.63  0.038
Residual Error       1     2.00     2.00    2.000
Total                7  3071.50

... overall the main effects are almost significant (p = 0.052) and the two-way interactions are significant (p = 0.038), but we have only one degree of freedom for error, so this is a very low-power test. This is the price that you pay with a fractional factorial. Looking above at the individual effects, B appears to be unimportant, just as we saw on the plot, which is further evidence that we should drop it from the analysis.
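The SE Coef, T and P columns in this output follow directly from the residual mean square. Here is a minimal sketch using only numbers reported above (N = 8 runs, residual MS = 2.000 on 1 degree of freedom). The closed-form p-value is specific to 1 degree of freedom, where the t distribution is the Cauchy distribution; the helper name `p_value_1df` is our own.

```python
from math import atan, pi, sqrt

N = 8        # runs in the half fraction
mse = 2.000  # residual mean square from the ANOVA table (1 df)

# Coef column from the output (Coef = Effect / 2).
coefs = {"A": 9.500, "B": 0.750, "C": 7.000, "D": 8.250,
         "A*C": -9.250, "A*D": 9.500}

# In an orthogonal two-level design every coefficient has the same SE.
se = sqrt(mse / N)


def p_value_1df(t):
    """Two-sided p-value for a t statistic with 1 df (Cauchy distribution)."""
    return 1 - 2 / pi * atan(abs(t))


for term, coef in coefs.items():
    t = coef / se
    print(f"{term:<4} T = {t:7.2f}  P = {p_value_1df(t):.3f}")
```

With a single error degree of freedom, even a T value of 19 only reaches a p-value of about 0.03, which is what makes this test so weak.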

Back in Minitab, let's drop the B term, because it shows up neither as a significant main effect nor as part of any of the important interactions.

Factorial Fit: Y-Rate versus A, C, D

Estimated Effects and Coefficients for Y-Rate (coded units)

Term       Effect     Coef  SE Coef       T      P
Constant            70.750   0.6374   111.0  0.000
A          19.000    9.500   0.6374   14.90  0.004
C          14.000    7.000   0.6374   10.98  0.008
D          16.500    8.250   0.6374   12.94  0.006
A*C       -18.500   -9.250   0.6374  -14.51  0.005
A*D        19.000    9.500   0.6374   14.90  0.004

S = 1.80278   R-Sq = 99.79%   R-Sq(adj) = 99.26%

Analysis of Variance for Y-Rate (coded units)

Source              DF   Seq SS   Adj SS   Adj MS       F      P
Main Effects         3  1658.50  1658.50  552.833  170.10  0.006
2-Way Interactions   2  1406.50  1406.50  703.250  216.38  0.005
Residual Error       2     6.50     6.50    3.250
Total                7  3071.50

Now the overall main effects and 2-way interactions are significant. Residual error still only has 2 degrees of freedom, but this gives us an estimate at least and we can also look at the individual effects.
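The change in the error term can be checked by hand. In a two-level design with N runs, each effect carries a sum of squares of N × Coef², so dropping B moves its sum of squares into the residual. The sketch below uses values from the two outputs above; the p-value expression is the closed form for a t distribution with 2 degrees of freedom, and the helper name `p_value_2df` is our own.

```python
from math import sqrt

N = 8
ss_resid_old = 2.00  # residual SS with B in the model (1 df)
coef_b = 0.750       # coefficient of B from the first fit

ss_b = N * coef_b ** 2              # sum of squares carried by B
ss_resid_new = ss_resid_old + ss_b  # B pooled into error
df_resid_new = 2
mse_new = ss_resid_new / df_resid_new
se_new = sqrt(mse_new / N)          # new SE Coef


def p_value_2df(t):
    """Two-sided p-value for a t statistic with 2 df (closed form)."""
    return 1 - abs(t) / sqrt(2 + t * t)


t_a = 9.500 / se_new  # T for factor A in the reduced model
print(ss_resid_new, mse_new, round(se_new, 4),
      round(t_a, 2), round(p_value_2df(t_a), 3))
```

This reproduces the residual sum of squares of 6.50, the mean square of 3.250 and the SE Coef of 0.6374 reported in the reduced fit. The T value for A falls from 19.00 to 14.90, yet its p-value improves because the test now has 2 error degrees of freedom instead of 1.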

So, fractional factorials are useful when you hope or expect that not all of the factors are going to be significant. You are screening for factors to drop out of the study. In this example, we started with a \(2^{4 - 1}\) design but when we dropped B we ended up with a \(2^3\) design with 1 observation per cell.
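This projection property is easy to verify. The sketch below rebuilds the half fraction from D = ABC, ignores the B column, and checks that the remaining (A, C, D) combinations are exactly the eight runs of a full \(2^3\) design. The variable names are our own.

```python
from itertools import product

# Half fraction with D = ABC, runs as (A, B, C, D) in coded units.
runs = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

# Project onto A, C, D by ignoring the B column.
projected = {(a, c, d) for a, b, c, d in runs}

# Every combination of three two-level factors.
full_2_3 = set(product((-1, 1), repeat=3))

print(projected == full_2_3)
```

The same check succeeds whichever single factor is dropped, because a Resolution IV design projects onto a full factorial in any three of its factors.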

This is a typical scenario: you begin by screening a large number of factors and end up with a smaller set. We still don't know much about the factors, and this is still a pretty thin or weak design, but it gives you the information that you need to take the next step. You can now do a more complete experiment on fewer factors.