6.1 - Experimental Unit and Replication
First we need to define what is meant by an ‘experimental unit’. It may seem trivial, but when encountering an experimental situation this critical step is not always obvious. Consider a situation where someone wants to evaluate polluted stream water for its effect on fish lesions. They set up 2 aquaria, each with 50 fish. They randomly assign a water treatment (polluted vs. control) to each of the aquaria. After 30 days, they catch 10 fish from each aquarium and count the number of lesions. The treatment design is a single-factor design with 2 levels of water treatment, and a one-way ANOVA can be run on the data. But what is the experimental unit?
Going back to our definition, the experimental unit is that which receives the treatment. In this case, we have applied a water treatment to each aquarium. The fish are not the experimental units. For individual fish to be experimental units, the investigators would have to take one fish at a time and apply the treatment independently to each fish. This would be impractical from a logistics standpoint, and was not done. Instead, each water treatment level was applied to an entire aquarium, and so the experimental unit is an aquarium with 50 fish.
Now we can determine what constitutes a replication of the experiment. Each time the full set of treatment levels (2) is applied, we have a complete replication. In the experiment described here, there is only one replication, a situation often described as ‘an un-replicated study’.
The individual fish that were caught and counted for lesions are sampling units. Traditionally, to obtain a correct ANOVA, mean values of the sampling units have to be computed for each experimental unit before calculation of the treatment SS. Failure to recognize sampling units can result in a serious problem: pseudo-replication. Pseudo-replication results from treating each sampling unit as if it were an experimental unit, which inflates the Error d.f. By artificially increasing the Error d.f., we reduce the MSE and produce a larger (incorrect) F statistic.
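To make the degrees-of-freedom bookkeeping concrete, here is a minimal Python sketch of the aquarium example. The lesion counts are hypothetical numbers invented purely for illustration (the text reports no data), and the helper `one_way_anova` is an illustrative name, not part of any library. The sketch runs a one-way ANOVA twice: once incorrectly, treating the 20 sampled fish as experimental units, and once on the aquarium means.

```python
from statistics import mean

def one_way_anova(groups):
    """One-way ANOVA on a list of groups (each a list of observations).

    Returns (df_treatment, df_error, ms_treatment, ms_error, f_stat).
    ms_error and f_stat are None when there are no Error d.f.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Treatment SS: variation of group means around the grand mean
    ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error SS: variation of observations around their own group mean
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_treatment = len(groups) - 1
    df_error = len(all_values) - len(groups)
    ms_treatment = ss_treatment / df_treatment
    if df_error == 0:
        return df_treatment, df_error, ms_treatment, None, None
    ms_error = ss_error / df_error
    return df_treatment, df_error, ms_treatment, ms_error, ms_treatment / ms_error

# HYPOTHETICAL lesion counts for the 10 fish sampled from each aquarium
polluted = [4, 6, 5, 7, 3, 6, 5, 4, 6, 4]
control = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2]

# Pseudo-replication: each fish treated as if it were an experimental unit
pseudo = one_way_anova([polluted, control])

# Correct unit of analysis: one mean per aquarium (the experimental unit)
correct = one_way_anova([[mean(polluted)], [mean(control)]])

print("Fish as units:     Error d.f. =", pseudo[1], " F =", pseudo[4])
print("Aquaria as units:  Error d.f. =", correct[1], " F =", correct[4])
```

With fish treated as units, the Error d.f. is 20 − 2 = 18 and an F statistic is produced. With aquaria as units, there are only 2 observations for 2 treatment levels, so the Error d.f. is 0 and no F statistic can be computed. This is the practical meaning of an un-replicated study: a valid test of the water treatment would need more than one aquarium per treatment level.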