In experimental design terminology, the "experimental unit" is randomized to the treatment regimen and receives the treatment directly. The "observational unit" has measurements taken on it. In most clinical trials, the experimental units and the observational units are one and the same, namely, the individual patient
One exception to this is a community intervention trial in which communities, e.g., geographic regions, are randomized to treatments. For example, communities (experimental units) might be randomized to receive different formulations of a vaccine, whereas the effects are measured directly on the subjects (observational units) within the communities. The advantages here are strictly logistical: it is simply easier to implement the intervention in this fashion. Another example occurs in reproductive toxicology experiments in which female rodents are exposed to a treatment (experimental units) but measurements are taken on the pups (observational units).
In experimental design terminology, factors are variables that are controlled and varied during the course of the experiment. For example, treatment is a factor in a clinical trial with experimental units randomized to treatment. Another example is pressure and temperature as factors in a chemical experiment.
Most clinical trials are structured as one-way designs, i.e., only one factor, treatment, with a few levels.
Temperature and pressure in the chemical experiment are two factors that comprise a two-way design in which it is of interest to examine various combinations of temperature and pressure. Some clinical trials may have a two-way factorial design, such as in oncology where various combinations of doses of two chemotherapeutic agents comprise the treatments. An incomplete factorial design may be useful if it is inappropriate to assign subjects to some of the possible treatment combinations, such as no treatment (double placebo). We will study factorial designs in a later lesson.
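To make the structure of treatment combinations concrete, here is a minimal sketch, in Python, of how a full 2x2 factorial differs from an incomplete factorial that drops the double-placebo arm; the agents and dose labels are hypothetical and used only for illustration.

```python
from itertools import product

# Hypothetical dose levels for two chemotherapeutic agents (labels are illustrative only)
agent_a_doses = ["placebo A", "active A"]
agent_b_doses = ["placebo B", "active B"]

# Full 2x2 factorial: every combination of the two factors is a treatment arm
full_factorial = list(product(agent_a_doses, agent_b_doses))

# Incomplete factorial: drop the double-placebo arm if assigning
# no active treatment at all would be inappropriate
incomplete_factorial = [arm for arm in full_factorial
                        if arm != ("placebo A", "placebo B")]

print(full_factorial)        # 4 arms
print(incomplete_factorial)  # 3 arms; double placebo removed
```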
A parallel design refers to a study in which patients are randomized to a treatment and remain on that treatment throughout the course of the trial. This is a typical design. In contrast, with a crossover design patients are randomized to a sequence of treatments and they cross over from one treatment to another during the course of the trial. Each treatment occurs in a time period with a washout period in between. Crossover designs are of interest because each patient serves as his or her own control, which has the potential to reduce variability. However, there are potential problems with this type of design. There should be investigation into possible carry-over effects, i.e., the residual effects of the previous treatment affecting a subject's response in a later treatment period. In addition, only conditions that are likely to be similar in both treatment periods are amenable to crossover designs. Acute health problems that do not recur are not well suited to a crossover study. We will study crossover designs in a later lesson.
Randomization is used to remove systematic error (bias) and to justify Type I error probabilities in experiments. Randomization is recognized as an essential feature of clinical trials for removing selection bias.
Selection bias occurs when a physician decides treatment assignment and systematically selects a certain type of patient for a particular treatment. Suppose the trial consists of an experimental therapy and a placebo. If the physician assigns healthier patients to the experimental therapy and the less healthy patients to the placebo, the study could result in an invalid conclusion that the experimental therapy is very effective.
Blocking and stratification are used to control unwanted variation. For example, suppose a clinical trial is structured to compare treatments A and B in patients between the ages of 18 and 65. Suppose that the younger patients tend to be healthier. It would be prudent to account for this in the design by stratifying with respect to age. One way to achieve this is to construct age groups of 18-30, 31-50, and 51-65 and to randomize patients to treatment within each age group.
| Age     | Treatment A | Treatment B |
|---------|-------------|-------------|
| 18 - 30 | 12          | 13          |
| 31 - 50 | 23          | 23          |
| 51 - 65 | 6           | 7           |
It is not necessary to have the same number of patients within each age stratum. We do, however, want to have a balance in the number on each treatment within each age group. This is accomplished by blocking, in this case, within the age strata. Blocking is a restriction of the randomization process that results in a balance of the numbers of patients on each treatment after a prescribed number of randomizations. For example, blocks of 4 within these age strata would mean that after 4, 8, 12, etc. patients in a particular age group had entered the study, the numbers assigned to each treatment within that stratum would be equal.
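As a concrete illustration of stratified, blocked randomization, the following sketch generates permuted blocks of size 4 within each age stratum. The function names and the enrollment counts (taken from the example table above) are illustrative, and a real trial would use a validated randomization system rather than Python's random module.

```python
import random

def permuted_block(block_size=4, treatments=("A", "B")):
    """Return one randomly ordered block with equal numbers of each treatment."""
    block = list(treatments) * (block_size // len(treatments))
    random.shuffle(block)
    return block

def assign_within_stratum(n_patients, block_size=4):
    """Assign treatments to patients in one stratum using permuted blocks.

    After every complete block (4, 8, 12, ... patients), the numbers
    assigned to A and B within the stratum are equal."""
    assignments = []
    while len(assignments) < n_patients:
        assignments.extend(permuted_block(block_size))
    return assignments[:n_patients]

# Illustrative strata and enrollment counts (matching the example table)
strata = {"18-30": 25, "31-50": 46, "51-65": 13}
schedule = {age: assign_within_stratum(n) for age, n in strata.items()}
for age, assignment in schedule.items():
    print(age, "A:", assignment.count("A"), "B:", assignment.count("B"))
```

Within each stratum the counts on A and B can differ by at most a partial block, which is why the table above shows nearly, but not exactly, equal numbers per treatment.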
If the numbers are large enough within a stratum, a planned subgroup analysis may be performed. In the example, the smaller numbers of patients in the youngest and oldest age groups would require care in the analysis of those subgroups specifically. However, with the primary question being the effect of treatment regardless of age, the pooled data, in which each subgroup is represented in a balanced fashion, would be utilized for the main analysis.
Even ineffective treatments can appear beneficial in some patients. This may be due to random fluctuations, or variability in the disease. If, however, the improvement is due to the patient's expectation of a positive response, this is called a "placebo effect". This is especially problematic when the outcome is subjective, such as pain or symptom assessment. The placebo effect is widely recognized and must be accounted for in any clinical trial. For example, rather than constructing a nonrandomized trial in which all patients receive an experimental therapy, it is better to randomize patients to receive either the experimental therapy or a placebo. A true placebo is an inert or inactive treatment that mimics the route of administration of the real treatment, e.g., a sugar pill.
Placebos are not acceptable ethically in many situations, e.g., in surgical trials. (Although there have been instances where 'sham' surgical procedures served as the 'placebo' control.) When an accepted treatment already exists for a serious illness such as cancer, the control must be an active treatment. In other situations, a true placebo is not physically attainable. For example, a few trials investigating dimethyl sulfoxide (DMSO) for muscle pain relief were conducted in the 1970s and 1980s. DMSO is rubbed onto the area of muscle pain but leaves a garlicky taste in the mouth, so it was difficult to develop a convincing placebo.
Treatment masking or blinding is an effective way to ensure objectivity of the person measuring the outcome variables. Masking is especially important when the measurements are subjective or based on self-assessment. Double-masked trials refer to studies in which both investigators and patients are masked to the treatment. Single-masked trials refer to the situation in which only patients are masked. In some studies, statisticians are masked to treatment assignment when performing the initial statistical analyses, i.e., not knowing which group received the treatment and which is the control until the analyses have been completed. Even a safety-monitoring committee may be masked to the identity of treatments A and B until there is an observed trend or difference that should evoke a response from the monitors. In executing a masked trial, great care must be taken to keep the treatment allocation schedule securely hidden from all except those with a need to know which medications are active and which are placebo. This could be limited to the producers of the study medications and, possibly, the safety monitoring board before study completion. There is always a provision for breaking the blind for a particular patient in an emergency.
As with placebos, masking, although highly desirable, is not always possible. For example, one could not mask a surgeon to the procedure he is to perform. Even so, some have gone to great lengths to achieve masking. For example, a few trials with cardiac pacemakers have consisted of every eligible patient undergoing a surgical procedure to be implanted with the device. The device was "turned on" in patients randomized to the treatment group and "turned off" in patients randomized to the control group. The surgeon was not aware of which devices would be activated.
Investigators often underestimate the importance of masking as a design feature. This is because they believe that biases are small in relation to the magnitude of the treatment effects (when the converse is usually true), or that they can compensate for their own prejudice and subjectivity.
Confounding is the effect of other relevant factors on the outcome that may be incorrectly attributed to the difference between study groups.
Here is an example: An investigator plans to assign 10 patients to treatment and 10 patients to control, with a one-week follow-up on each patient. The first 10 patients will be assigned treatment on March 01 and the next 10 patients will be assigned control on March 15. The investigator may observe a significant difference between treatment and control, but is it due to the treatment or to different environmental conditions between early March and mid-March? The obvious way to correct this is to randomize 5 patients to treatment and 5 patients to control on March 01, followed by another 5 patients to treatment and 5 patients to control on March 15.
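A minimal sketch of that correction, assuming two hypothetical enrollment waves of 10 patients each, is shown below; balancing treatment and control within each date keeps calendar time from being confounded with treatment assignment.

```python
import random

def randomize_cohort(patient_ids, arms=("treatment", "control")):
    """Randomly split one enrollment cohort evenly between the two arms."""
    ids = list(patient_ids)
    random.shuffle(ids)
    half = len(ids) // 2
    return {pid: arms[0] for pid in ids[:half]} | {pid: arms[1] for pid in ids[half:]}

# Hypothetical enrollment waves: patients 1-10 on March 01, patients 11-20 on March 15
assignments = randomize_cohort(range(1, 11)) | randomize_cohort(range(11, 21))

# Each enrollment date now contributes 5 treatment and 5 control patients,
# so any difference between early and mid-March affects both arms equally.
print(assignments)
```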
Validity
A trial is said to possess internal validity if the observed difference in outcome between the study groups is real and not due to bias, chance, or confounding. Randomized, placebo-controlled, double-blinded clinical trials have high levels of internal validity.
External validity in a human trial refers to how well the study results can be generalized to a broader population. External validity is irrelevant if internal validity is low. External validity in randomized clinical trials is enhanced by using broad eligibility criteria when recruiting patients.
Large simple and pragmatic trials emphasize external validity. A large simple trial attempts to discover small advantages of a treatment that is expected to be used in a large population. Large numbers of subjects are enrolled in a study with simplified design and management. There is an implicit assumption that the treatment effect is similar for all subjects under the simplified data collection. In a similar vein, a pragmatic trial emphasizes the effect of a treatment as it would be delivered in routine clinical practice, outside academic medical centers, and involves a broad range of practice settings.
Studies of equivalence and noninferiority have different objectives than the usual trial, which is designed to demonstrate superiority of a new treatment over a control. A study to demonstrate noninferiority aims to show that a new treatment is not worse than an accepted treatment in terms of the primary response variable by more than a prespecified margin. A study to demonstrate equivalence has the objective of demonstrating that the response to the new treatment is within a prespecified margin of the accepted treatment in both directions. We will learn more about these studies when we explore sample size calculations.
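As a preview of how those margins enter the hypotheses, one common formulation (assuming larger responses are better, with $\mu_N$ and $\mu_S$ denoting the mean responses on the new and accepted treatments and $\delta > 0$ the prespecified margin) is sketched below.

```latex
\[
\begin{aligned}
\text{Superiority:}     &\quad H_0: \mu_N - \mu_S \le 0
                         \quad \text{vs.} \quad H_1: \mu_N - \mu_S > 0 \\
\text{Noninferiority:}  &\quad H_0: \mu_N - \mu_S \le -\delta
                         \quad \text{vs.} \quad H_1: \mu_N - \mu_S > -\delta \\
\text{Equivalence:}     &\quad H_0: |\mu_N - \mu_S| \ge \delta
                         \quad \text{vs.} \quad H_1: |\mu_N - \mu_S| < \delta
\end{aligned}
\]
```

In words, superiority requires the new treatment to beat the control, noninferiority tolerates a deficit no larger than $\delta$, and equivalence bounds the difference in both directions.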