34.4 - Creating Random Assignments

We now turn our focus from randomly sampling a subset of observations from a data set to generating a random assignment of treatments to experimental units in a randomized, controlled experiment. The good news is that the techniques used to sample without replacement can easily be extended to generate such random assignment plans.

It's probably a good time to remind you of the existence of the PLAN procedure. As I mentioned earlier, due to time constraints of the course and the complexity of the PLAN procedure, we will not use it to accomplish any of our random assignments. You should be aware, however, of its existence should you want to explore it on your own in the future.

Example 34.15

Suppose we are interested in conducting an experiment so that we can compare the effects of two drugs (A and B) and one placebo on headache pain. We have 30 subjects enrolled in our study but need to determine a plan for randomly assigning 10 of the subjects to treatment A, 10 of the subjects to treatment B, and 10 of the subjects to the placebo. The following program does just that for us. That is, it creates a random assignment for 30 subjects in a completely randomized design with one factor having 3 levels:

/* One observation per experimental unit */
DATA exper1;
    DO Unit = 1 to 30;
        OUTPUT;
    END;
RUN;
/* Attach a uniform(0,1) random number to each unit */
DATA random1;
    set exper1;
    random = ranuni(27407349);
RUN;
PROC SORT data=random1;
    by random;
RUN;
PROC FORMAT;
    value trtfmt 1 = 'Placebo'
                 2 = 'Drug A'
                 3 = 'Drug B';
RUN;
/* First ten sorted units -> group 1, next ten -> group 2, last ten -> group 3 */
DATA random1;
    set random1;
         if _N_ le 10               then group=1;
    else if _N_ gt 10 and _N_ le 20 then group=2;
    else if _N_ gt 20               then group=3;
    format group trtfmt.;
RUN;
PROC PRINT data = random1 NOOBS;
    title 'Random Assignment for CRD with One Factor';
RUN;

Random Assignment for CRD with One Factor
Unit	random	group
11	0.00602	Placebo
14	0.14366	Placebo
18	0.18030	Placebo
4	0.20396	Placebo
12	0.21271	Placebo
29	0.21515	Placebo
9	0.25440	Placebo
3	0.29567	Placebo
21	0.32816	Placebo
22	0.33889	Placebo
17	0.44446	Drug A
19	0.47514	Drug A
5	0.49087	Drug A
23	0.50231	Drug A
28	0.52765	Drug A
25	0.53381	Drug A
24	0.55448	Drug A
6	0.60245	Drug A
8	0.60772	Drug A
20	0.61191	Drug A
16	0.69616	Drug B
7	0.69824	Drug B
1	0.70305	Drug B
10	0.71145	Drug B
13	0.71217	Drug B
15	0.86676	Drug B
27	0.96330	Drug B
26	0.97864	Drug B
30	0.98660	Drug B
2	0.99081	Drug B

Okay, let's first launch and run the SAS program, so you can review the resulting output to convince yourself that the code did indeed generate the desired treatment plan. You should see that 10 of the subjects were randomly assigned to treatment A, 10 to treatment B, and 10 to the placebo.

Now, let's walk ourselves through the program to make sure we understand how it works. The first DATA step merely uses a simple DO loop to create a temporary data set called exper1 that contains one observation for each of the experimental units (in our case, the experimental units are subjects). The only variable in the data set, unit, contains an arbitrary label 1, 2, ..., 30 assigned to each of the experimental units.

The remainder of the code generates a random assignment. To do so, the code from Example 34.5 is simply extended. That is:

The second DATA step uses the ranuni function to generate a uniform random number between 0 and 1 for each observation in the exper1 data set. The result is stored in a temporary data set called random1.
The random1 data set is sorted in order of the random number.
A format (trtfmt) is defined with the FORMAT procedure to label the groups meaningfully.
The third DATA step uses an IF-THEN-ELSE construct to assign the first ten units in sorted order to Group 1, the second ten to Group 2, and the last ten to Group 3, and attaches the trtfmt format to the group variable.
The final randomization list is printed.
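
As a minimal sketch (not part of the program above), the three-branch IF-THEN-ELSE could be replaced by arithmetic on _N_, and the FREQ procedure could be used to confirm that each group received 10 units. The data set name random1b is made up for illustration, and the sketch assumes that random1 has already been sorted by random and that the trtfmt format has already been defined:

```sas
/* Sketch only: assumes random1 is sorted by random and trtfmt. exists */
DATA random1b;
    set random1;
    group = ceil(_N_ / 10);   /* rows 1-10 -> 1, rows 11-20 -> 2, rows 21-30 -> 3 */
    format group trtfmt.;
RUN;

/* Check: each of the three groups should show a frequency of 10 */
PROC FREQ data=random1b;
    tables group;
    title 'Check of Group Sizes';
RUN;
```

The arithmetic version is shorter, but the explicit IF-THEN-ELSE is easier to adapt when the groups are not all the same size.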

Note! The randomization list created here contains information that is potentially damaging to the success of the whole study if it ended up in the wrong hands. That is, blinding would be violated. It is a better (and more common) practice to keep separate master lists that associate unit with the subject's name, and group numbers with treatment names. In many national trials, it is common to have statisticians also blinded to the master list, producing a "triple-blind" trial. I formatted the treatment here for illustration purposes only.
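
One way to follow that practice is sketched below, assuming random1 was created as in the program above. The working list shows only the unit and the numeric group code, while the key that links codes to treatments lives in a separate data set. The name trtkey is hypothetical, not something the course program creates:

```sas
/* Working list: unit and numeric group code only */
PROC PRINT data=random1 NOOBS;
    var Unit group;
    format group;      /* no format named: group prints as 1, 2, 3 in this step */
    title 'Randomization List with Group Codes Only';
RUN;

/* Hypothetical master key, to be stored apart from the working list */
DATA trtkey;
    input group treatment $ 3-9;
    datalines;
1 Placebo
2 Drug A
3 Drug B
;
RUN;
```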

Example 34.16

To create a random assignment for a completely randomized design with two factors, you can just modify the IF statement in the previous example. The following program generates a random assignment of treatments to 30 subjects, in which Factor A has 2 levels and Factor B has 3 levels (and hence 6 treatments). The code is similar to the code from the previous example except the IF statement now divides the 30 subjects into 6 treatment groups and (arbitrarily) assigns the levels of factors A and B to the groups:

DATA random2;
    set exper1;
    random = ranuni(4901);
RUN;
PROC SORT data=random2;
    by random;
RUN;
DATA random2;
    set random2;
         if _N_ le 5 then
            do;  factorA = 1; factorB = 1;  end;
    else if _N_ gt  5 and _N_ le 10 then
            do;  factorA = 1; factorB = 2;  end;
    else if _N_ gt 10 and _N_ le 15 then
            do;  factorA = 1; factorB = 3;  end;
    else if _N_ gt 15 and _N_ le 20 then
            do;  factorA = 2; factorB = 1;  end;
    else if _N_ gt 20 and _N_ le 25 then
            do;  factorA = 2; factorB = 2;  end;
    else if _N_ gt 25 and _N_ le 30 then
            do;  factorA = 2; factorB = 3;  end;
RUN;
PROC PRINT data = random2;
    title 'Random Assignment for CRD with Two Factors';
RUN;

Random Assignment for CRD with Two Factors
Obs	Unit	random	factorA	factorB
1	9	0.04052	1	1
2	21	0.04733	1	1
3	17	0.07038	1	1
4	20	0.11335	1	1
5	19	0.12459	1	1
6	7	0.14093	1	2
7	29	0.23206	1	2
8	10	0.24267	1	2
9	14	0.27161	1	2
10	26	0.28117	1	2
11	4	0.31276	1	3
12	18	0.34512	1	3
13	15	0.37393	1	3
14	28	0.37724	1	3
15	2	0.40480	1	3
16	6	0.42829	2	1
17	23	0.47371	2	1
18	11	0.48031	2	1
19	13	0.48552	2	1
20	12	0.48943	2	1
21	1	0.50155	2	2
22	3	0.53892	2	2
23	16	0.54762	2	2
24	24	0.69272	2	2
25	30	0.74252	2	2
26	5	0.77423	2	3
27	27	0.80270	2	3
28	8	0.82113	2	3
29	25	0.84338	2	3
30	22	0.95571	2	3

First, my apologies for the formatting that makes the IF-THEN-ELSE statement a little difficult to read. I needed to format it as such so that I could easily capture the image of the program for you.

Again, it's probably best if you first launch and run the SAS program, so you can review the resulting output to convince yourself that the code did indeed generate the desired treatment plan. You should see that five of the subjects were randomly assigned to the A=1, B=1 group, five to the A=1, B=2 group, five to the A=1, B=3 group, and so on.

Then, if you compare the code to the code from the previous example, the only substantial difference you should see is the difference between the two IF statements. As previously mentioned, the IF statement here divides the 30 subjects into 6 treatment groups and (arbitrarily) assigns the levels of factors A and B to the groups.
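
If the six-branch IF statement feels long, here is a hedged sketch of an arithmetic alternative. It assumes random2 has already been sorted by random and contains exactly 30 observations; the names random2b and cell are made up for illustration. The FREQ procedure then cross-tabulates the two factors so you can check that every one of the 6 cells holds 5 subjects:

```sas
/* Sketch only: assumes random2 is sorted by random (30 rows, 6 cells of 5) */
DATA random2b;
    set random2;
    cell    = ceil(_N_ / 5);          /* treatment cell 1, 2, ..., 6 */
    factorA = ceil(cell / 3);         /* cells 1-3 -> 1, cells 4-6 -> 2 */
    factorB = mod(cell - 1, 3) + 1;   /* cycles 1, 2, 3 within each level of A */
    drop cell;
RUN;

/* Check: each factorA-by-factorB cell should have a frequency of 5 */
PROC FREQ data=random2b;
    tables factorA*factorB / norow nocol nopercent;
    title 'Check of Cell Sizes';
RUN;
```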

Example 34.17

Thus far, our random assignments have not involved dealing with a blocking factor. As you know, it is natural in some experiments to block some of the experimental units together in an attempt to reduce unnecessary variability in your measurements that might otherwise prevent you from making good treatment comparisons. Suppose, for example, that your workload would prevent you from making more than nine experimental measurements in a day. Then, it would be a good idea to treat the day as a blocking factor. The following program creates a random assignment for 27 subjects in a randomized block design with one factor having three levels.

DATA exper2 (drop = j);
    DO block = 1 to 3;
        DO j = 1 to 9;
                 if block = 1 then do;  unit = j;       output;  end;
            else if block = 2 then do;  unit = j + 9;   output;  end;
            else if block = 3 then do;  unit = j + 18;  output;  end;
        END;
    END;
RUN;
PROC PRINT data=exper2; title 'EXPER2: Definition of Experimental Units'; RUN;
DATA random3;
    set exper2;
    random = ranuni(7214508);
RUN;
PROC SORT data=random3;  by block random;  RUN;
DATA random3;
    set random3;
    by block;
    if first.block then k = 0;
    else k = k + 1;
         if k in (0,1,2) then trt=1;
    else if k in (3,4,5) then trt=2;
    else if k in (6,7,8) then trt=3;
    retain k;
RUN;
PROC PRINT data=random3 noobs;
    title 'Random Assignment for RBD: Sorted in BLOCK-TRT Order';
RUN;
PROC SORT data=random3;   by block unit;  RUN;
PROC PRINT data=random3 noobs;
    title 'Random Assignment for RBD: Sorted in BLOCK-UNIT Order';
RUN;

Again, my apologies about the formatting that makes the program a little more difficult than usual to read. I needed to format it as such so that I could easily capture the image of the program for you.

It's probably going to be best if you first launch and run the SAS program, so you can review the contents of the initial exper2 data set:

EXPER2: Definition of Experimental Units

Obs	block	unit
1	1	1
2	1	2
3	1	3
4	1	4
5	1	5
6	1	6
7	1	7
8	1	8
9	1	9
10	2	10
11	2	11
12	2	12
13	2	13
14	2	14
15	2	15
16	2	16
17	2	17
18	2	18
19	3	19
20	3	20
21	3	21
22	3	22
23	3	23
24	3	24
25	3	25
26	3	26
27	3	27

and then the resulting output that contains the desired treatment plan... first in block-treatment order:

Random Assignment for RBD: Sorted in BLOCK-TRT Order

block	unit	random	k	trt
1	5	0.17083	0	1
1	8	0.18781	1	1
1	6	0.19400	2	1
1	9	0.40043	3	2
1	7	0.58852	4	2
1	4	0.60226	5	2
1	3	0.65488	6	3
1	1	0.79768	7	3
1	2	0.79977	8	3
2	12	0.06810	0	1
2	13	0.08280	1	1
2	16	0.23191	2	1
2	15	0.27690	3	2
2	11	0.38198	4	2
2	14	0.667	5	2
2	18	0.84177	6	3
2	10	0.91906	7	3
2	17	0.93312	8	3
3	21	0.09791	0	1
3	23	0.11455	1	1
3	22	0.21569	2	1
3	20	0.30461	3	2
3	26	0.30534	4	2
3	27	0.32876	5	2
3	24	0.46627	6	3
3	25	0.74756	7	3
3	19	0.91401	8	3

and then in block-unit order:

Random Assignment for RBD: Sorted in BLOCK-UNIT Order

block	unit	random	k	trt
1	1	0.79768	7	3
1	2	0.79977	8	3
1	3	0.65488	6	3
1	4	0.60226	5	2
1	5	0.17083	0	1
1	6	0.19400	2	1
1	7	0.58852	4	2
1	8	0.18781	1	1
1	9	0.40043	3	2
2	10	0.91906	7	3
2	11	0.38198	4	2
2	12	0.06810	0	1
2	13	0.08280	1	1
2	14	0.667	5	2
2	15	0.27690	3	2
2	16	0.23191	2	1
2	17	0.93312	8	3
2	18	0.84177	6	3
3	19	0.91401	8	3
3	20	0.30461	3	2
3	21	0.09791	0	1
3	22	0.21569	2	1
3	23	0.11455	1	1
3	24	0.46627	6	3
3	25	0.74756	7	3
3	26	0.30534	4	2
3	27	0.32876	5	2

As you can see, the exper2 data set is created to contain one observation for each of the experimental units (27 subjects here). The variable unit contains an arbitrary label (1, 2, ..., 27) assigned to each of the experimental units. The variable block, which identifies the block number (1, 2, and 3), divides the experimental units into three equal-sized blocks of nine.
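
Because the blocks are all the same size, the unit label can also be computed directly from the block number. The following is just a sketch of that idea, not the program used above; the data set name exper2b is made up so that it does not overwrite exper2:

```sas
/* Sketch only: same units and blocks as exper2, built without IF-THEN-ELSE */
DATA exper2b (drop = j);
    DO block = 1 to 3;
        DO j = 1 to 9;
            unit = 9*(block - 1) + j;   /* block 1: 1-9, block 2: 10-18, block 3: 19-27 */
            OUTPUT;
        END;
    END;
RUN;
```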

Now, to create the random assignment:

We use the ranuni function to generate a uniform random number between 0 and 1 for each observation.
Then, within each block, we sort the data in order of the random number.
Then, we create a counter variable to count the number of observations within each block: for the first observation within each block ("if first.block"), we set the counter (k) to 0; otherwise, we increase the counter by 1 for each observation within the block. (For this to work, we must retain k from iteration to iteration.)
Finally, using an IF-THEN-ELSE construct, we assign, within each block, the first three units in sorted order (k=0,1,2) to treatment 1, the second three (k=3,4,5) to treatment 2, and the last three (k=6,7,8) to treatment 3.
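
As a hedged sketch of the last step, the treatment number can also be computed from the counter k, and the FREQ procedure can confirm that each treatment appears three times in every block. The sketch assumes random3 was created by the program above, so that it already contains block and k; the name random3b is made up for illustration:

```sas
/* Sketch only: assumes random3 already contains block and the counter k (0-8) */
DATA random3b;
    set random3;
    trt = floor(k / 3) + 1;   /* k = 0,1,2 -> 1;  k = 3,4,5 -> 2;  k = 6,7,8 -> 3 */
RUN;

/* Check: every block-by-trt cell should have a frequency of 3 */
PROC FREQ data=random3b;
    tables block*trt / norow nocol nopercent;
    title 'Check of Treatments within Blocks';
RUN;
```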

Depending on how the experiment will be conducted, you can print the random assignment in different orders:

First, the randomization is printed in order of treatment within each block. This will accommodate experiments for which it is natural to perform the treatments in groups on the randomized experimental units.
Then, the randomization is printed in order of units within the block. This will accommodate experiments for which it is natural to perform the treatments in random order on consecutive experimental units.