The main disadvantage of a crossover design is that *carryover effects* may be aliased (confounded) with direct treatment effects, in the sense that these effects cannot be estimated separately. You may think you are estimating the effect of treatment A, but the estimate also contains a bias from the previous treatment. Substantial carryover effects can bias the results of the data analysis, so an investigator should proceed cautiously when considering a crossover design.

A **carryover effect** is defined as the effect of the treatment from the previous time period on the response at the current time period. In other words, if a patient receives treatment A during the first period and treatment B during the second period, then measurements taken during the second period could be a result of the direct effect of treatment B administered during the second period, and/or the carryover or residual effect of treatment A administered during the first period. These carryover effects yield statistical bias.

What can we do about this carryover effect?

The incorporation of lengthy *washout periods* in the experimental design can diminish the impact of carryover effects. A washout period is defined as the time between treatment periods. Instead of stopping the first treatment and immediately starting the new one, there is a period of time during which the drug from the first period is washed out of the patient's system.

The rationale for this is that the previously administered treatment is “washed out” of the patient and, therefore, it cannot affect the measurements taken during the current period. This may be true, but it is possible that the previously administered treatment *may have altered the patient in some manner* so that the patient will react differently to any treatment administered from that time onward. An example is when a pharmaceutical treatment causes permanent liver damage so that the patients metabolize future drugs differently. Another example occurs if the treatments are different types of educational tests. Then subjects may be affected permanently by what they learned during the first period.

How long should the washout period be?

In a trial involving pharmaceutical products, the length of the washout period usually is determined as some multiple of the half-life of the pharmaceutical product within the population of interest. For example, an investigator might implement a washout period equivalent to 5 (or more) times the length of the half-life of the drug concentration in the blood. The figure below depicts the half-life of a hypothetical drug.
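To make the "multiple of the half-life" rule concrete, the following minimal sketch computes the washout length and the fraction of drug still present at the end of it. It assumes simple first-order (exponential) elimination, which is an assumption added here and not a claim about any particular drug; the function names and the 8-hour half-life are hypothetical.

```python
def fraction_remaining(n_half_lives):
    """Fraction of the initial concentration left after n half-lives,
    assuming first-order (exponential) elimination."""
    return 0.5 ** n_half_lives


def washout_length(half_life, multiple=5):
    """Washout duration as a multiple of the half-life (same time units)."""
    return multiple * half_life


# Hypothetical drug with an 8-hour half-life and a 5-half-life washout.
half_life_hours = 8.0
print(washout_length(half_life_hours))   # 40.0 hours
print(fraction_remaining(5))             # 0.03125, i.e. about 3% remains
```

Under this assumption, each additional half-life halves what is left, so going from 5 to 7 half-lives reduces the residual from about 3% to under 1%. This only describes drug concentration; it says nothing about lasting changes to the patient of the kind described above.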

Actually, it is not the presence of carryover effects per se that leads to aliasing with direct treatment effects in the AB|BA crossover, but rather the presence of differential carryover effects, i.e., the carryover effect due to treatment A differs from the carryover effect due to treatment B. If the carryover effects for A and B are equivalent in the AB|BA crossover design, then this common carryover effect is not aliased with the treatment difference. So, for crossover designs, when the carryover effects are different from one another, this presents us with a significant problem.
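A short derivation may help show why only the *difference* in carryover matters. The notation is an assumption introduced for this sketch and is not necessarily the notation of the general model developed later: $\mu$ is an overall mean, $\pi_1, \pi_2$ are period effects, $\tau_A, \tau_B$ are direct treatment effects, and $\lambda_A, \lambda_B$ are carryover effects. Subject effects are omitted because they cancel in within-subject differences.

$$
\begin{aligned}
\text{Sequence AB:}\quad & E[Y_1] = \mu + \pi_1 + \tau_A, \qquad E[Y_2] = \mu + \pi_2 + \tau_B + \lambda_A \\
\text{Sequence BA:}\quad & E[Y_1] = \mu + \pi_1 + \tau_B, \qquad E[Y_2] = \mu + \pi_2 + \tau_A + \lambda_B \\[4pt]
d_{AB} &= E[Y_1 - Y_2 \mid AB] = (\pi_1 - \pi_2) + (\tau_A - \tau_B) - \lambda_A \\
d_{BA} &= E[Y_1 - Y_2 \mid BA] = (\pi_1 - \pi_2) - (\tau_A - \tau_B) - \lambda_B \\[4pt]
\tfrac{1}{2}\,(d_{AB} - d_{BA}) &= (\tau_A - \tau_B) - \tfrac{1}{2}\,(\lambda_A - \lambda_B)
\end{aligned}
$$

Under these assumptions, the usual within-subject contrast estimates the treatment difference minus half the carryover difference. The period effects cancel, and the bias term vanishes exactly when $\lambda_A = \lambda_B$, whatever their common size.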

In the example of the educational tests, differential carryover effects could occur if test A leads to more learning than test B. Another situation where differential carryover effects may occur is in clinical trials where an active drug (A) is compared to placebo (B) and the washout period is of inadequate length. The patients in the AB sequence might experience a strong A carryover during the second period, whereas the patients in the BA sequence might experience a weak B carryover during the second period.

The recommendation for crossover designs is to avoid the problems caused by differential carryover effects at all costs by employing lengthy washout periods and/or designs where treatment and carryover are not aliased or confounded with each other. It is always much more prudent to address a problem *a priori* by using a proper design rather than *a posteriori* by applying a statistical analysis that may require unreasonable assumptions and/or perform unsatisfactorily. You will see this later in this lesson.

For example, one approach for the statistical analysis of the 2 × 2 crossover is to conduct a preliminary test for differential carryover effects. If this is significant, then only the data from the first period are analyzed because the first period is free of carryover effects. Essentially you are throwing out half of your data!

If the preliminary test for differential carryover is not significant, then the data from both periods are analyzed in the usual manner. Recent work, however, has revealed that this two-stage analysis performs poorly because the unconditional Type I error rate is much higher than the nominal level. We won't go into the specific details here, but part of the reason for this is that the test for differential carryover and the test for treatment differences in the first period are highly correlated and do not act independently.

Even worse, this two-stage approach could lead to losing one-half of the data. If differential carryover effects are of concern, then a better approach would be to use a study design that can account for them.
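The behavior of the two-stage analysis can be explored with a small simulation. The sketch below is not the analysis from any cited work; it is a simplified illustration under stated assumptions: normally distributed data, known variances (so z-tests stand in for t-tests), no true treatment or carryover effect, a hypothetical 10% level for the carryover pretest and 5% for the treatment test, and arbitrary values for the subject and error standard deviations.

```python
import random
from statistics import NormalDist


def simulate_two_stage(n_per_seq=12, n_sims=10000, sd_subject=2.0,
                       sd_error=1.0, alpha_carry=0.10, alpha_treat=0.05,
                       seed=1):
    """Estimate the unconditional Type I error rate of the two-stage
    analysis when there is NO treatment effect and NO carryover."""
    rng = random.Random(seed)
    z_carry = NormalDist().inv_cdf(1 - alpha_carry / 2)
    z_treat = NormalDist().inv_cdf(1 - alpha_treat / 2)

    # Known per-subject variances of the three summaries used below.
    var_total = 4 * sd_subject**2 + 2 * sd_error**2   # Y1 + Y2
    var_diff = 2 * sd_error**2                        # Y1 - Y2
    var_first = sd_subject**2 + sd_error**2           # Y1 only
    scale = (2 / n_per_seq) ** 0.5   # SE factor for a difference of two means

    def mean_gap(values_ab, values_ba):
        return sum(values_ab) / n_per_seq - sum(values_ba) / n_per_seq

    rejections = 0
    for _ in range(n_sims):
        data = []
        for _sequence in range(2):            # index 0 = AB, index 1 = BA
            subjects = []
            for _subject in range(n_per_seq):
                s = rng.gauss(0, sd_subject)  # shared subject effect
                subjects.append((s + rng.gauss(0, sd_error),
                                 s + rng.gauss(0, sd_error)))
            data.append(subjects)
        ab, ba = data

        # Stage 1: pretest for differential carryover using subject totals.
        z_c = mean_gap([y1 + y2 for y1, y2 in ab],
                       [y1 + y2 for y1, y2 in ba]) / (scale * var_total**0.5)

        if abs(z_c) > z_carry:
            # Stage 2a: pretest significant, analyze first period only.
            z_t = mean_gap([y1 for y1, _ in ab],
                           [y1 for y1, _ in ba]) / (scale * var_first**0.5)
        else:
            # Stage 2b: usual crossover analysis of within-subject differences.
            z_t = mean_gap([y1 - y2 for y1, y2 in ab],
                           [y1 - y2 for y1, y2 in ba]) / (scale * var_diff**0.5)

        if abs(z_t) > z_treat:
            rejections += 1
    return rejections / n_sims


two_stage_rate = simulate_two_stage()
crossover_only_rate = simulate_two_stage(alpha_carry=1e-9)  # pretest never fires
print(two_stage_rate, crossover_only_rate)
```

With these settings the subject totals and the first-period responses are strongly correlated, so a falsely significant pretest tends to be followed by a falsely significant first-period test. The estimated rejection rate of the two-stage procedure should therefore come out noticeably above the nominal 5%, while the run in which the pretest is effectively disabled should stay close to 5%.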

Prior to the development of a general statistical model and investigations into its implications, we require more definitions.