4.7 - Incomplete Block Designs

In an incomplete block design the block size k is smaller than the number of treatments t, so you cannot assign all of the treatments in each block. We will use the following notation:

t = # of treatments,
k = block size,
b = # of blocks,
\(r_i\) = # of replicates for treatment i, in the entire design.

Remember that an equal number of replications is the best way to be sure that you have minimum variance if you're looking at all possible pairwise comparisons. If \(r_i = r\) for all treatments, the total number of observations in the experiment is N where:

\(N = tr = bk\)

The incidence matrix, which defines the design of the experiment, gives the number of observations \(n_{ij}\) for the \(i^{th}\) treatment in the \(j^{th}\) block. In general it looks like this:

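\[
\begin{array}{c|cccc}
\text{treatment} \backslash \text{block} & 1 & 2 & \cdots & b \\
\hline
1 & n_{11} & n_{12} & \cdots & n_{1b} \\
2 & n_{21} & n_{22} & \cdots & n_{2b} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
t & n_{t1} & n_{t2} & \cdots & n_{tb}
\end{array}
\]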

Here we have treatments 1, 2, up to t and the blocks 1, 2, up to b. For a complete block design, we would have each treatment occurring one time within each block, so all entries in this matrix would be 1's. For an incomplete block design, the incidence matrix would be 0's and 1's simply indicating whether or not that treatment occurs in that block.

Example 1

The example that we will look at is Table 4.22 (4.21 in 7th ed). Here is the incidence matrix for this example:

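\[
\begin{array}{c|cccc}
\text{treatment} \backslash \text{block} & 1 & 2 & 3 & 4 \\
\hline
1 & 1 & 1 & 0 & 1 \\
2 & 0 & 1 & 1 & 1 \\
3 & 1 & 1 & 1 & 0 \\
4 & 1 & 0 & 1 & 1
\end{array}
\]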

Here we have t = 4 and b = 4 (four rows and four columns) and k = 3, so each block holds only three of the four treatments and leaves one treatment out. In this case the row sums (\(r_i\)) and the column sums (k) are all equal to 3.

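\[
\begin{array}{c|cccc|c}
\text{treatment} \backslash \text{block} & 1 & 2 & 3 & 4 & r_i \\
\hline
1 & 1 & 1 & 0 & 1 & 3 \\
2 & 0 & 1 & 1 & 1 & 3 \\
3 & 1 & 1 & 1 & 0 & 3 \\
4 & 1 & 0 & 1 & 1 & 3 \\
\hline
k & 3 & 3 & 3 & 3 &
\end{array}
\]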
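The row and column sums can be checked with a few lines of code. This is only a sketch: the helper name `incidence_from_blocks` is an illustrative choice, and the block contents are those of Example 1 (block 1 holds treatments 1, 3, 4, and so on).

```python
# Blocks of Example 1: each inner list holds the treatments run in that block.
blocks = [[1, 3, 4], [1, 2, 3], [2, 3, 4], [1, 2, 4]]
t = 4  # number of treatments


def incidence_from_blocks(blocks, t):
    """Return the t x b incidence matrix n[i][j] (treatment i+1, block j+1)."""
    return [[block.count(i) for block in blocks] for i in range(1, t + 1)]


n = incidence_from_blocks(blocks, t)
row_sums = [sum(row) for row in n]          # replicates r_i for each treatment
col_sums = [sum(col) for col in zip(*n)]    # block size k for each block

for row in n:
    print(row)
print("r_i:", row_sums, " k:", col_sums)

# With equal replication and equal block size, N = t*r = b*k.
total = sum(row_sums)
print("N =", total, "=", t * row_sums[0], "=", len(blocks) * col_sums[0])
```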

In general, we are faced with a situation where the number of treatments is specified and the block size, or number of experimental units per block (k), is given. This is usually a constraint imposed by the experimental situation. The researcher must then decide how many blocks to run, and how many replicates that provides, in order to achieve the desired precision or power for the test.

Example 2

Here is another example of an incidence matrix for allocating treatments and replicates in an incomplete block design. Let's take an example where k = 2, still with t = 4 and b = 4. That gives us r = bk / t = 2. In this case we could design our incidence matrix so that it looks like this:

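\[
\begin{array}{c|cccc|c}
\text{treatment} \backslash \text{block} & 1 & 2 & 3 & 4 & r_i \\
\hline
1 & 1 & 0 & 1 & 0 & 2 \\
2 & 0 & 1 & 0 & 1 & 2 \\
3 & 1 & 0 & 1 & 0 & 2 \\
4 & 0 & 1 & 0 & 1 & 2 \\
\hline
k & 2 & 2 & 2 & 2 &
\end{array}
\]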

This example has two observations per block so k = 2 in each case and for all treatments r = 2.

Balanced Incomplete Block Design (BIBD)

A BIBD is an incomplete block design where all pairs of treatments occur together within a block an equal number of times ( \(\lambda\) ). In general, we will specify \(\lambda_{ii^\prime}\) as the number of times treatment \(i\) occurs with \(i^\prime\), in a block.

Let's look at the previous cases. How many times do treatments 1 and 2 occur together in the first example design?

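\[
\begin{array}{c|cccc|c}
\text{treatment} \backslash \text{block} & 1 & 2 & 3 & 4 & r_i \\
\hline
1 & 1 & 1 & 0 & 1 & 3 \\
2 & 0 & 1 & 1 & 1 & 3 \\
3 & 1 & 1 & 1 & 0 & 3 \\
4 & 1 & 0 & 1 & 1 & 3 \\
\hline
k & 3 & 3 & 3 & 3 &
\end{array}
\]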

They occur together in block 2 and again in block 4, so \(\lambda_{12} = 2\). Treatments 1 and 3 occur together in block 1 and in block 2, therefore \(\lambda_{13} = 2\). In this design, you can check all possible pairs in the same way: 1 and 4 occur together twice, 2 and 3 occur together twice, 2 and 4 twice, and 3 and 4 twice. For this design \(\lambda_{ii^\prime} = 2\) for all treatment pairs \(i \ne i^\prime\), which defines the concept of balance in this incomplete block design.

If the number of times treatments occur together within a block is equal across the design for all pairs of treatments then we call this a balanced incomplete block design (BIBD).

Now look at the incidence matrix for the second example.

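\[
\begin{array}{c|cccc|c}
\text{treatment} \backslash \text{block} & 1 & 2 & 3 & 4 & r_i \\
\hline
1 & 1 & 0 & 1 & 0 & 2 \\
2 & 0 & 1 & 0 & 1 & 2 \\
3 & 1 & 0 & 1 & 0 & 2 \\
4 & 0 & 1 & 0 & 1 & 2 \\
\hline
k & 2 & 2 & 2 & 2 &
\end{array}
\]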

We can see that:

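\(\lambda_{12} = 0\): treatments 1 and 2 occur together 0 times.

\(\lambda_{13} = 2\): treatments 1 and 3 occur together 2 times.

\(\lambda_{14} = 0\): treatments 1 and 4 occur together 0 times.

\(\lambda_{23} = 0\): treatments 2 and 3 occur together 0 times.

\(\lambda_{24} = 2\): treatments 2 and 4 occur together 2 times.

\(\lambda_{34} = 0\): treatments 3 and 4 occur together 0 times.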

Here we have two pairs occurring together 2 times and the other four pairs occurring together 0 times. Therefore, this is not a balanced incomplete block design (BIBD).
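The pair counts for both examples can be tallied mechanically. The following is a minimal sketch; the function names `pair_counts` and `is_balanced` are illustrative, and the check assumes equal block sizes with no treatment repeated inside a block.

```python
from itertools import combinations


def pair_counts(blocks, t):
    """Count, for every treatment pair (i, i'), the blocks containing both."""
    counts = {pair: 0 for pair in combinations(range(1, t + 1), 2)}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


def is_balanced(blocks, t):
    """True when every pair of treatments occurs together equally often."""
    return len(set(pair_counts(blocks, t).values())) == 1


example1 = [[1, 3, 4], [1, 2, 3], [2, 3, 4], [1, 2, 4]]
example2 = [[1, 3], [2, 4], [1, 3], [2, 4]]

print(pair_counts(example1, 4), is_balanced(example1, 4))
print(pair_counts(example2, 4), is_balanced(example2, 4))
```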

What else is there about BIBD?

We can define \(\lambda\) in terms of our design parameters when we have equal block size k, and equal replication \(r_i = r\). For a given set of t, k, and r we define \(\lambda\) as:

\(\lambda = r(k-1) / (t-1)\)

So, for the first example that we looked at earlier - let's plug in the values and calculate \(\lambda\):

\(\lambda = 3 (3 - 1) / (4 -1) = 2\)

Here is the key: \(\lambda\) must be an integer for a balanced incomplete block design to exist. (An integer \(\lambda\) is a necessary condition; it does not by itself guarantee that a design can be constructed.) Let's plug in the values for the second example. For \(t = 4\), \(k = 2\), \(r = 2\) and \(b = 4\), we have:

\(\lambda = 2 (2 - 1) / (4 - 1) = 2/3 \approx 0.667\)

Since \(\lambda\) is not an integer, there does not exist a balanced incomplete block design for this experiment. We would need either more replicates or a larger block size. Since the block size in this case is fixed, we can achieve a balanced incomplete block design only by adding more replicates so that \(\lambda\) equals at least 1 (here r = 3 gives \(\lambda = 1\)). It needs to be a whole number in order for the design to be balanced.
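The arithmetic for \(r\) and \(\lambda\) can be done exactly with fractions. This sketch uses hypothetical helper names (`bibd_parameters`, `has_integer_lambda`) and only evaluates the two formulas \(r = bk/t\) and \(\lambda = r(k-1)/(t-1)\) from this section.

```python
from fractions import Fraction


def bibd_parameters(t, k, b):
    """Return (r, lambda) as exact fractions for t treatments, block size k, b blocks."""
    r = Fraction(b * k, t)            # from N = t*r = b*k
    lam = r * (k - 1) / (t - 1)       # lambda = r(k - 1) / (t - 1)
    return r, lam


def has_integer_lambda(t, k, b):
    """True when both r and lambda come out as whole numbers."""
    r, lam = bibd_parameters(t, k, b)
    return r.denominator == 1 and lam.denominator == 1


for t, k, b in [(4, 3, 4), (4, 2, 4), (4, 2, 6)]:
    r, lam = bibd_parameters(t, k, b)
    print(f"t={t} k={k} b={b}: r={r}, lambda={lam}, integer={has_integer_lambda(t, k, b)}")
```

The third case (t = 4, k = 2, b = 6) is the smallest number of blocks of size 2 for which \(\lambda\) becomes a whole number.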

We will talk about partially balanced designs later. But since a balanced design doesn't exist in this case, what would be the best partially balanced design? That is the question you would ask if you could only afford four blocks and the block size is two. Given this situation, is the design in Example 2 the best design we can construct? In the best partially balanced design, each \(\lambda_{ii^\prime}\) is one of the integers nearest to the \(\lambda\) that we calculated. In our case each \(\lambda_{ii^\prime}\) should be either 0 or 1, the integers nearest 0.667. Example 2, with pair counts of 0 and 2, is not as close to balanced as it could be. In fact, it is not even a connected design, that is, one where you can get from any treatment to any other treatment through treatments that share a block. More about this later...
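Connectedness can be checked by walking from treatment to treatment through shared blocks. The sketch below is illustrative only: `is_connected` is a hypothetical helper, and the `cycle` design is an assumed alternative with four blocks of size two, not a design taken from the text.

```python
def is_connected(blocks, t):
    """True if every treatment can be reached from treatment 1 by moving
    between treatments that share a block."""
    reached = {1}
    frontier = [1]
    while frontier:
        current = frontier.pop()
        for block in blocks:
            if current in block:
                for other in block:
                    if other not in reached:
                        reached.add(other)
                        frontier.append(other)
    return reached == set(range(1, t + 1))


example2 = [[1, 3], [2, 4], [1, 3], [2, 4]]
cycle = [[1, 2], [2, 3], [3, 4], [1, 4]]   # hypothetical alternative design

print("Example 2 connected:", is_connected(example2, 4))
print("Cycle design connected:", is_connected(cycle, 4))
```

In the assumed cycle design every pair occurs together 0 or 1 times, which matches the target of staying within the integers nearest 0.667.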

How do you construct a BIBD?

In some situations it is easy to construct the best IBD. In other cases it can be quite difficult, and we will look the design up in a reference.

Let's say that we want six blocks, we still want 4 treatments, and our block size is still 2. Then r = bk / t = 3, and we calculate \(\lambda = r(k - 1) / (t - 1) = 1\). Because \(\lambda\) is equal to one, we want every possible pair of treatments to appear exactly once. We do this by looking at all possible combinations of four treatments taking two at a time. We could set up the incidence matrix for the design, or we could represent it like this, where the entries in the table are treatment labels from {1, 2, 3, 4}:

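\[
\begin{array}{c|cc}
\text{block} & 1 & 2 \\
\hline
1 & 1 & 2 \\
2 & 1 & 3 \\
3 & 1 & 4 \\
4 & 2 & 3 \\
5 & 2 & 4 \\
6 & 3 & 4
\end{array}
\]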
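The "all possible combinations" construction is easy to generate. This is a small sketch using the standard library; the variable names are illustrative.

```python
from itertools import combinations

t, k = 4, 2
# Every combination of t treatments taken k at a time becomes one block.
blocks = list(combinations(range(1, t + 1), k))

for number, block in enumerate(blocks, start=1):
    print(f"block {number}: treatments {block}")

b = len(blocks)
r = b * k // t                  # replicates per treatment
lam = r * (k - 1) // (t - 1)    # times each pair occurs together
print("b =", b, " r =", r, " lambda =", lam)
```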

However, constructing a BIBD from all possible combinations is not always practical. If the number of combinations is too large then you need to find a suitable subset, which is not always easy to do. Sometimes you can use Latin squares to construct a BIBD. As an example, take any 3 columns from a 4 × 4 Latin square and treat each row as a block. This subset of columns from the whole Latin square creates a BIBD. However, not every subset of columns of a Latin square is a BIBD.

Let's look at an example. In this example we have t = 7, b = 7, and k = 3. This means that r = bk / t = 3. Here is the 7 × 7 Latin square:

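\[
\begin{array}{ccccccc}
A & B & C & D & E & F & G \\
B & C & D & E & F & G & A \\
C & D & E & F & G & A & B \\
D & E & F & G & A & B & C \\
E & F & G & A & B & C & D \\
F & G & A & B & C & D & E \\
G & A & B & C & D & E & F
\end{array}
\]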

We want to select (k = 3) three columns out of this design where each treatment occurs once with every other treatment because \(\lambda = 3(3 - 1) / (7 - 1) = 1\).

We could select the first three columns; let's see if this will work. Using each row as a block gives ABC, BCD, CDE, DEF, EFG, FGA, and GAB. The pair A, B appears in both ABC and GAB, so some treatment pairs are repeated.

Since the first three columns contain some pairs more than once, let's try columns 1 and 2 together with the fourth column. The rows then give the blocks ABD, BCE, CDF, DEG, EFA, FGB, and GAC. If you look at all possible combinations in each row, each treatment pair occurs only one time.
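The two column choices can be compared by counting pairs. This sketch rebuilds the cyclic 7 × 7 Latin square and uses each row as a block; the helper names are illustrative.

```python
from collections import Counter
from itertools import combinations

letters = "ABCDEFG"
# Cyclic 7 x 7 Latin square: row i is the alphabet shifted left by i places.
square = [letters[i:] + letters[:i] for i in range(7)]


def blocks_from_columns(columns):
    """Use each row as a block, keeping only the chosen (1-based) columns."""
    return ["".join(row[c - 1] for c in columns) for row in square]


def pair_count_values(blocks):
    """Return the set of distinct co-occurrence counts over all treatment pairs."""
    counts = Counter()
    for block in blocks:
        counts.update(combinations(sorted(block), 2))
    return {counts[pair] for pair in combinations(letters, 2)}


for columns in [(1, 2, 3), (1, 2, 4)]:
    blocks = blocks_from_columns(columns)
    print(columns, blocks, "pair counts seen:", sorted(pair_count_values(blocks)))
```

A single value in the final set means every pair occurs together equally often, which is the balance condition.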

What if we could afford a block size of 4 instead of 3? Here t = 7, b = 7, k = 4, and so r = 4. We calculate \(\lambda = r(k - 1) / (t - 1) = 2\), an integer, and for these parameters a BIBD does exist. For this design with a block size of 4 we can select 4 columns (or rows) from a Latin square. Let's look at columns again... can you select the correct 4?
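One way to answer this question is a brute-force search over all choices of 4 columns. The sketch below assumes the cyclic 7 × 7 Latin square shown earlier with each row used as a block; `is_bibd` is a hypothetical helper.

```python
from collections import Counter
from itertools import combinations

letters = "ABCDEFG"
square = [letters[i:] + letters[:i] for i in range(7)]   # cyclic Latin square


def is_bibd(columns, lam):
    """Check whether the chosen 1-based columns give every pair exactly lam times."""
    counts = Counter()
    for row in square:
        block = sorted(row[c - 1] for c in columns)
        counts.update(combinations(block, 2))
    return all(counts[pair] == lam for pair in combinations(letters, 2))


# Try every set of 4 columns out of 7 and keep those with lambda = 2.
good = [cols for cols in combinations(range(1, 8), 4) if is_bibd(cols, lam=2)]
print(len(good), "column sets work, for example", good[0])
```

The columns left over after removing 1, 2, and 4 (that is, columns 3, 5, 6, and 7) are among the sets the search finds.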

Now consider the case with 8 treatments and a block size of 4. The number of possible combinations of 8 treatments taking 4 at a time is 70. Choosing 14 blocks from these 70 sets of 4 is a big job! At this point, we should simply look at an appropriate reference, such as the catalog of designs in Cochran & Cox, Experimental Designs, p. 469-482.
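To get a sense of the size of this search, the counts can be computed directly. This is a sketch of the arithmetic only; it does not construct the design, and the variable names are illustrative.

```python
from math import comb

t, k, b = 8, 4, 14
candidates = comb(t, k)         # distinct blocks of size 4 from 8 treatments
r = b * k // t                  # replicates per treatment in a 14-block design
lam = r * (k - 1) / (t - 1)     # lambda = r(k - 1) / (t - 1)
ways = comb(candidates, b)      # unordered choices of 14 blocks, before any balance check

print("candidate blocks:", candidates)
print("r =", r, " lambda =", lam)
print("ways to pick 14 of them:", ways)
```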

Analysis of BIBD's

When we have missing data, that is, when complete data do not exist for each row, the averages of the remaining treatments in that row are affected, and so are the means. When we have complete data, block effects and treatment effects are orthogonal, so each drops out of the analysis of the other. With missing data or with IBDs, even a BIBD, orthogonality does not hold, and the analysis requires us to use GLM, which codes the data like we did previously. Here the GLM fits first the block and then the treatment.

Because of this, the sequential sum of squares (Seq SS) for block is not the same as the adjusted sum of squares (Adj SS).

We have the following:

Seq SS

\(SS(\beta | \mu) = 55.0\)

\(SS(\tau | \mu, \beta) = 22.75\)

Adj SS

\(SS(\beta | \mu, \tau) = 66.08\)

\(SS(\tau | \mu, \beta) = 22.75\)

Now switch them around: first fit treatments and then the blocks.

Seq SS

\(SS(\tau | \mu) = 11.67\)

\(SS(\beta | \mu, \tau) = 66.08\)

Adj SS

\(SS(\tau | \mu, \beta) = 22.75\)

\(SS(\beta | \mu, \tau) = 66.08\)
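These four sums of squares can be reproduced from the twelve observations listed in the SAS program later in this section. The sketch below is not how GLM computes them; it assumes the standard BIBD shortcut based on adjusted treatment totals \(Q_i\), with \(SS(\tau | \mu, \beta) = k \sum Q_i^2 / (\lambda t)\), and the variable names are illustrative.

```python
# (block, treatment, response) for the 12 observations in the SAS data step
data = [(1, 1, 73), (1, 3, 73), (1, 4, 75), (2, 1, 74), (2, 2, 75), (2, 3, 75),
        (3, 2, 67), (3, 3, 68), (3, 4, 72), (4, 1, 71), (4, 2, 72), (4, 4, 75)]
t, b, k, r, lam = 4, 4, 3, 3, 2
N = len(data)

grand = sum(y for _, _, y in data)
block_tot = {j: sum(y for blk, _, y in data if blk == j) for j in range(1, b + 1)}
trt_tot = {i: sum(y for _, trt, y in data if trt == i) for i in range(1, t + 1)}

correction = grand ** 2 / N
ss_block_unadj = sum(B ** 2 for B in block_tot.values()) / k - correction   # SS(beta | mu)
ss_trt_unadj = sum(T ** 2 for T in trt_tot.values()) / r - correction       # SS(tau | mu)

# Adjusted treatment totals: Q_i = T_i - (1/k) * (totals of the blocks containing i)
Q = {i: trt_tot[i] - sum(block_tot[blk] for blk, trt, _ in data if trt == i) / k
     for i in trt_tot}
ss_trt_adj = k * sum(q ** 2 for q in Q.values()) / (lam * t)                # SS(tau | mu, beta)
# Both fitting orders give the same model SS, so the fourth term follows by subtraction.
ss_block_adj = ss_block_unadj + ss_trt_adj - ss_trt_unadj                   # SS(beta | mu, tau)

print("blocks first:    ", round(ss_block_unadj, 2), round(ss_trt_adj, 2))
print("treatments first:", round(ss_trt_unadj, 2), round(ss_block_adj, 2))
```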

The 'least squares means' come from the fitted model. Regardless of the pattern of missing data or the design we can conceptually think of our design represented by the model:

\(Y_{ij}= \mu +\beta _{i}+\tau _{j}+e_{ij}\)

\(i = 1, \dots , b\), \(j = 1, \dots , t\)

You can obtain the 'least squares means' from the estimated parameters from the least squares fit of the model.
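For a BIBD the least squares fit has a closed form, so the least squares means can be sketched without a general model-fitting routine. The code below assumes sum-to-zero constraints on block and treatment effects and the standard intrablock estimate \(\hat{\tau}_j = k Q_j / (\lambda t)\), where \(Q_j\) is the adjusted treatment total; the data are the twelve observations from the SAS program in this section, and the variable names are illustrative.

```python
# (block, treatment, response) for the 12 observations in the SAS data step
data = [(1, 1, 73), (1, 3, 73), (1, 4, 75), (2, 1, 74), (2, 2, 75), (2, 3, 75),
        (3, 2, 67), (3, 3, 68), (3, 4, 72), (4, 1, 71), (4, 2, 72), (4, 4, 75)]
t, k, lam = 4, 3, 2

block_tot = {}
trt_tot = {}
for blk, trt, y in data:
    block_tot[blk] = block_tot.get(blk, 0) + y
    trt_tot[trt] = trt_tot.get(trt, 0) + y

grand_mean = sum(y for _, _, y in data) / len(data)   # estimate of mu

lsmeans = {}
for j in sorted(trt_tot):
    blocks_with_j = [blk for blk, trt, _ in data if trt == j]
    q_j = trt_tot[j] - sum(block_tot[blk] for blk in blocks_with_j) / k   # adjusted total
    tau_hat = k * q_j / (lam * t)                                         # intrablock effect
    lsmeans[j] = grand_mean + tau_hat   # block effects average to zero under the constraint

for j, value in lsmeans.items():
    print(f"treatment {j}: {value:.4f}")
```

These values can be compared with the Y LSMEAN column in the PROC GLM output shown later in this section.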

Optional Section

See the discussion in the text on Recovery of Interblock Information, p. 154. This refers to a procedure which allows us to extract additional information from a BIBD when the blocks are a random effect. Reading this section is optional. We illustrate the analysis using PROC Mixed in SAS (L03_sas_Ex_4_5.sas):

data; input blk trt Y; cards;
1 1 73
1 3 73
1 4 75
2 1 74
2 2 75
2 3 75
3 2 67
3 3 68
3 4 72
4 1 71
4 2 72
4 4 75
;;;;
/*This data is from Example 4-5 in Montgomery, Design and Analysis of experiments, 6th edition, */
/* Wiley, 2005, pages 147-154.  This demonstrates the recovery of interblock information when   */
/* the blocks are considered random.  */
proc glm; class trt blk;
model Y = blk trt;
lsmeans trt/ e stderr pdiff;

proc mixed; class trt blk;
model Y = trt;
random blk;
lsmeans trt/ e pdiff;
/* The next 4 estimate statements calculate the treatment effects from the solution*/
estimate "trt effect 1" trt +.75 -.25 -.25 -.25/e; 
estimate "trt effect 2" trt -.25 +.75 -.25 -.25/e;
estimate "trt effect 3" trt -.25 -.25 +.75 -.25/e;
estimate "trt effect 4" trt -.25 -.25 -.25 +.75/e;
/* The next 3 contrast statements show one set of orthogonal contrasts*/
contrast "trt1 vs trt2-4" trt 3 -1 -1 -1; 
contrast "trt2 vs trt3-4" trt 0 2 -1 -1 ; 
contrast "trt3 vs trt4" trt 0 0 1 -1 ;
run;
 The SAS System        12:49 Friday, August 15, 2008   1

                                       The GLM Procedure

                                    Class Level Information

                                Class         Levels    Values

                                trt                4    1 2 3 4

                                blk                4    1 2 3 4


                            Number of Observations Read          12
                            Number of Observations Used          12

                                       The GLM Procedure

Dependent Variable: Y

                                              Sum of
      Source                      DF         Squares     Mean Square    F Value    Pr > F

      Model                        6     77.75000000     12.95833333      19.94    0.0024

      Error                        5      3.25000000      0.65000000

      Corrected Total             11     81.00000000


                       R-Square     Coeff Var      Root MSE        Y Mean

                       0.959877      1.112036      0.806226      72.50000


      Source                      DF       Type I SS     Mean Square    F Value    Pr > F

      blk                          3     55.00000000     18.33333333      28.21    0.0015
      trt                          3     22.75000000      7.58333333      11.67    0.0107


      Source                      DF     Type III SS     Mean Square    F Value    Pr > F

      blk                          3     66.08333333     22.02777778      33.89    0.0010
      trt                          3     22.75000000      7.58333333      11.67    0.0107

                                       The GLM Procedure
                                      Least Squares Means

                            Coefficients for trt Least Square Means

                                                     trt Level
              Effect                                    1       2       3       4

              Intercept                                 1       1       1       1
              blk       1                            0.25    0.25    0.25    0.25
              blk       2                            0.25    0.25    0.25    0.25
              blk       3                            0.25    0.25    0.25    0.25
              blk       4                            0.25    0.25    0.25    0.25
              trt       1                               1       0       0       0
              trt       2                               0       1       0       0
              trt       3                               0       0       1       0
              trt       4                               0       0       0       1


                                             Standard                  LSMEAN
                  trt        Y LSMEAN           Error    Pr > |t|      Number

                  1        71.3750000       0.4868051      <.0001           1
                  2        71.6250000       0.4868051      <.0001           2
                  3        72.0000000       0.4868051      <.0001           3
                  4        75.0000000       0.4868051      <.0001           4


                               Least Squares Means for effect trt
                              Pr > |t| for H0: LSMean(i)=LSMean(j)

                                     Dependent Variable: Y

                  i/j              1             2             3             4

                     1                      0.7349        0.4117        0.0035
                     2        0.7349                      0.6142        0.0047
                     3        0.4117        0.6142                      0.0077
                     4        0.0035        0.0047        0.0077


NOTE: To ensure overall protection level, only probabilities associated with pre-planned
      comparisons should be used.

                                      The Mixed Procedure

                                       Model Information

                     Data Set                     WORK.DATA1
                     Dependent Variable           Y
                     Covariance Structure         Variance Components
                     Estimation Method            REML
                     Residual Variance Method     Profile
                     Fixed Effects SE Method      Model-Based
                     Degrees of Freedom Method    Containment


                                    Class Level Information

                       Class    Levels    Values

                       trt           4    1 2 3 4
                       blk           4    1 2 3 4


                                          Dimensions

                              Covariance Parameters             2
                              Columns in X                      5
                              Columns in Z                      4
                              Subjects                          1
                              Max Obs Per Subject              12


                                    Number of Observations

                          Number of Observations Read              12
                          Number of Observations Used              12
                          Number of Observations Not Used           0


                                       Iteration History

                  Iteration    Evaluations    -2 Res Log Like       Criterion

                          0              1        44.37333968
                          1              1        34.22046396      0.00000000


                                   Convergence criteria met.



                                      The Mixed Procedure

                                     Covariance Parameter
                                           Estimates

                                     Cov Parm     Estimate

                                     blk            8.0167
                                     Residual       0.6500


                                        Fit Statistics

                             -2 Res Log Likelihood            34.2
                             AIC (smaller is better)          38.2
                             AICC (smaller is better)         40.6
                             BIC (smaller is better)          37.0


                                 Type 3 Tests of Fixed Effects

                                       Num     Den
                         Effect         DF      DF    F Value    Pr > F

                         trt             3       5      11.41    0.0113


                                        Coefficients for
                                          trt effect 1

                                   Effect       trt      Row1

                                   Intercept
                                   trt          1        0.75
                                   trt          2       -0.25
                                   trt          3       -0.25
                                   trt          4       -0.25


                                        Coefficients for
                                          trt effect 2

                                   Effect       trt      Row1

                                   Intercept
                                   trt          1       -0.25
                                   trt          2        0.75
                                   trt          3       -0.25
                                   trt          4       -0.25

                                      The Mixed Procedure

                                        Coefficients for
                                          trt effect 3

                                   Effect       trt      Row1

                                   Intercept
                                   trt          1       -0.25
                                   trt          2       -0.25
                                   trt          3        0.75
                                   trt          4       -0.25


                                        Coefficients for
                                          trt effect 4

                                   Effect       trt      Row1

                                   Intercept
                                   trt          1       -0.25
                                   trt          2       -0.25
                                   trt          3       -0.25
                                   trt          4        0.75


                                           Estimates

                                          Standard
              Label           Estimate       Error      DF    t Value    Pr > |t|

              trt effect 1     -1.0869      0.4269       5      -2.55      0.0515
              trt effect 2     -0.8836      0.4269       5      -2.07      0.0932
              trt effect 3     -0.5000      0.4269       5      -1.17      0.2942
              trt effect 4      2.4705      0.4269       5       5.79      0.0022


                                           Contrasts

                                         Num     Den
                      Label               DF      DF    F Value    Pr > F

                      trt1 vs trt2-4       1       5       6.48    0.0515
                      trt2 vs trt3-4       1       5       9.58    0.0270
                      trt3 vs trt4         1       5      18.16    0.0080



                                      The Mixed Procedure

                            Coefficients for trt Least Squares Means

                    Effect       trt      Row1      Row2      Row3      Row4

                    Intercept                1         1         1         1
                    trt          1           1
                    trt          2                     1
                    trt          3                               1
                    trt          4                                         1


                                      Least Squares Means

                                           Standard
              Effect    trt    Estimate       Error      DF    t Value    Pr > |t|

              trt       1       71.4131      1.4968       5      47.71      <.0001
              trt       2       71.6164      1.4968       5      47.84      <.0001
              trt       3       72.0000      1.4968       5      48.10      <.0001
              trt       4       74.9705      1.4968       5      50.09      <.0001


                               Differences of Least Squares Means

                                               Standard
          Effect    trt    _trt    Estimate       Error      DF    t Value    Pr > |t|

          trt       1      2        -0.2033      0.6971       5      -0.29      0.7823
          trt       1      3        -0.5869      0.6971       5      -0.84      0.4382
          trt       1      4        -3.5574      0.6971       5      -5.10      0.0038
          trt       2      3        -0.3836      0.6971       5      -0.55      0.6058
          trt       2      4        -3.3541      0.6971       5      -4.81      0.0048
          trt       3      4        -2.9705      0.6971       5      -4.26      0.0080

Note that the least squares means for treatments from PROC Mixed correspond to the combined intra- and inter-block estimates of the treatment effects.

Random Effect Factor

So far we have discussed experimental designs with fixed factors, that is, the levels of the factors are fixed and constrained to some specific values. However, this is often not the case. In some cases, the levels of the factors are selected at random from a larger population. In this case, the inference made on the significance of the factor can be extended to the whole population but the factor effects are treated as contributions to variance.

Minitab’s General Linear Command handles random factors appropriately as long as you are careful to select which factors are fixed and which are random.

