Lesson 10: Missing Data and Intent-to-Treat


Data imperfections can be classified as protocol non-adherence, missing or incomplete observations, or methodologic errors. Two different perspectives have been applied to address the situation with imperfect data.

  • The explanatory approach corresponds to acquiring information and determining biological effects, which typically is done for efficacy studies.
  • The pragmatic approach corresponds to interpreting results according to general use, which typically is done for effectiveness studies.

Distinguishing between the two approaches is useful in determining how to deal with data imperfections.


Upon completion of this lesson, you should be able to:

  • Apply the terms evaluable and inevaluable to patients in a clinical trial appropriately.
  • Differentiate between a pragmatic approach and an explanatory approach and the conditions for which each is appropriate.
  • State the conditions under which data imputation is reasonable.
  • State the advantage of multiple imputation over simple data imputation.
  • Recognize the conditions under which the International Conference on Harmonization would allow a patient’s data to be excluded from statistical analysis.

10.1 - Specific Data Imperfections


Protocols sometimes contain improper plans that create or exacerbate imperfections in the data. One problem in this regard involves determining which patients received the correct treatment in the correct amounts. Patients who meet such criteria are said to be "evaluable"; those who do not are "inevaluable".

As an example, consider an SE trial in which \(N_E\) patients are considered evaluable and \(N_I\) patients are considered inevaluable. Suppose that the numbers of evaluable and inevaluable patients with favorable outcomes are \(R_E\) and \(R_I\), respectively. You may consider using one of the following estimates for the probability of a favorable outcome, namely,

\(P=\frac{R_E+R_I}{N_E+N_I} \text{ and }P_E=\frac{R_E}{N_E}\)

\(P\) (pragmatic approach, intention-to-treat) is based on all patients, whereas \(P_E\) (explanatory approach) is based on evaluable patients only. Usually \(R_I\) is close to zero, so that \(P_E > P\).
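To make the difference between the two estimates concrete, here is a minimal sketch that computes both from the four counts defined above. The function name and the counts are hypothetical, chosen only for illustration.

```python
def favorable_outcome_estimates(r_e, n_e, r_i, n_i):
    """Return (P, P_E): the pragmatic and explanatory estimates."""
    p_pragmatic = (r_e + r_i) / (n_e + n_i)  # all patients
    p_explanatory = r_e / n_e                # evaluable patients only
    return p_pragmatic, p_explanatory


# Hypothetical trial: 60 evaluable patients (30 favorable outcomes) and
# 20 inevaluable patients (2 favorable outcomes).
p, p_e = favorable_outcome_estimates(r_e=30, n_e=60, r_i=2, n_i=20)
print(p, p_e)  # P = 32/80 = 0.4, P_E = 30/60 = 0.5
```

With few favorable outcomes among the inevaluable patients, the explanatory estimate is noticeably larger than the pragmatic one.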

\(P_E\) may appear a more appropriate estimate of the clinical effect since it is obvious that the treatment cannot have an effect if it is not received. Some investigators will write a protocol to indicate that only data from those who received treatment for at least some number of doses or longer than a particular length of time will be used in the analysis. Do you recall a major difficulty with this explanatory approach? What about post-entry exclusion bias?

Since evaluability criteria define inclusion retroactively based on treatment adherence which is not determined until completion of the study, there is potential post-entry exclusion bias. Participant data should not be selected for inclusion in data analysis based on an outcome variable.

The pragmatic approach does not encounter such difficulties, although obviously it does not help elicit biological effects in an efficacy trial. It is prudent then, to select treatments and a protocol design that will result in a high level of treatment adherence with the hope that the pragmatic/intention-to-treat approach agrees as much as possible with the explanatory approach.

Missing data

Usually, unrecorded data imply that methodologic errors have occurred. If this happens frequently, there could be a fundamental problem with the design or conduct of the study. Some missing data are due to human error, such as forgetting to record or enter the data.

In longitudinal clinical trials, some patients may be lost to follow-up. If losses to follow-up occur for reasons not associated with the outcome, then they have little impact other than reducing precision, and the explanatory and pragmatic approaches are equivalent. Investigators, however, cannot assume that all losses to follow-up are random events and conduct analyses that ignore such losses. Being lost to follow-up may be associated with a higher chance of disease progression, recurrence, or death. If a patient has not withdrawn consent, then every effort should be made to recover lost information.

There are three generic approaches to handling missing data values:

  1. disregard the observations that contain missing values;
  2. disregard the outcome variable if it has a high proportion of missing values;
  3. replace the missing values by appropriate values (data imputation).

Data imputation is a reasonable approach under certain circumstances:

  1. the frequency of missingness is relatively small (say less than 10%);
  2. the outcome variable with the missing values is important clinically or biologically;
  3. reasonable strategies for the data imputation exist;
  4. the sensitivity of the conclusions to different data imputation strategies can be determined.

Simple data imputation involves substituting one data point for each missing value. Some substitution choices include the mean of the non-missing values or a predicted value from a linear regression model.

Another simple data imputation method is the last observation carried forward (LOCF) approach in longitudinal studies. With LOCF, the last observed value for a patient is substituted for all of that patient’s subsequent missing values.

The problems with simple data imputation methods are that they can yield a very biased result and they tend to underestimate variability during the data analysis.
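The two simple methods described above can be sketched in a few lines. This is only an illustration on made-up numbers (the helper names `locf` and `mean_impute` are hypothetical), but it shows how mean substitution shrinks the sample variance.

```python
from statistics import mean, variance


def locf(values):
    """Last observation carried forward; None marks a missing value.

    Missing values before the first observation are left as None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def mean_impute(values):
    """Replace each missing value with the mean of the observed values."""
    observed_mean = mean(v for v in values if v is not None)
    return [observed_mean if v is None else v for v in values]


# Hypothetical visits for one patient who dropped out after visit 2.
print(locf([10.0, 12.0, None, None]))  # [10.0, 12.0, 12.0, 12.0]

# Hypothetical cross-sectional sample with two missing values.
sample = [4.0, 6.0, 8.0, None, None]
observed = [v for v in sample if v is not None]
completed = mean_impute(sample)
print(variance(observed))   # 4.0
print(variance(completed))  # 2.0, variability is understated
```

Every imputed value sits exactly at the mean, so the completed data look less variable than the observed data, which is the underestimation the text warns about.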

Multiple imputation methods are preferred, in which

  1. imputations are generated, usually via a regression model, and random errors are added to the predicted values via random number generators,
  2. multiple imputed data sets are created in this manner (say 10-20 data sets), and
  3. the results are averaged across the multiple data sets.
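The three steps above can be sketched as follows, assuming a single fully observed covariate `x` and an outcome `y` with some missing values. All names and data are hypothetical, and the sketch is simplified: it adds only residual noise to the regression predictions, whereas full multiple imputation procedures also account for uncertainty in the fitted regression coefficients.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope, residual SD)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    resid_sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return intercept, slope, resid_sd


def multiply_impute_mean(xs, ys, n_imputations=20, seed=0):
    """Estimate the mean of y by averaging over imputed data sets."""
    rng = random.Random(seed)
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope, resid_sd = fit_line(
        [x for x, _ in complete], [y for _, y in complete]
    )
    estimates = []
    for _ in range(n_imputations):
        # Step 1: predicted value plus a random error for each missing y.
        completed = [
            y if y is not None
            else intercept + slope * x + rng.gauss(0.0, resid_sd)
            for x, y in zip(xs, ys)
        ]
        # Step 2: each pass through the loop is one imputed data set.
        estimates.append(mean(completed))
    # Step 3: average the results across the imputed data sets.
    return mean(estimates), estimates


# Hypothetical data: y is missing for the 5th and 7th patients.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, None, 11.8, None, 16.1]
pooled, estimates = multiply_impute_mean(xs, ys, n_imputations=20, seed=1)
print(pooled)
```

Because each data set receives different random errors, the per-data-set estimates vary. That spread carries the uncertainty due to imputation that a single substituted value hides.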

In most clinical trials, it is common to find errors that yield ineligible patients participating in the trial. Objective eligibility criteria are less susceptible to error than subjective criteria. Also, patients can fail to comply with nearly every aspect of treatment specification, such as reduced or missed doses and improper dose scheduling.

Ineligible patients in the study can be:

  1. included in the analysis of the cohort of eligible patients (pragmatic approach/intention-to-treat)
  2. excluded from the analysis (explanatory approach).

In a randomized trial, if the eligibility criteria are objective and assessed prior to randomization, then neither approach causes bias. The pragmatic approach, however, increases the external validity.

10.2 - Intention-to-Treat

Intention-to-treat (ITT) is the principle that patients in a randomized clinical trial should be analyzed according to the group to which they were assigned, even if they

  1. did not receive the intended treatment,
  2. did not adhere to the treatment regimen, or
  3. did not comply with the protocol in some other manner.

The ITT Principle is a generalization of the pragmatic approach. In contrast, a treatment received (TR) or per-protocol analysis follows the principle that patients should be analyzed according to the treatment they actually received.
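The contrast between the two principles comes down to which variable defines the comparison groups. The following minimal sketch uses hypothetical records with illustrative field names (`assigned`, `received`, `success`) and tabulates success rates both ways.

```python
from collections import defaultdict


def success_rate_by(records, key):
    """Success rate within each group defined by records[key]."""
    totals = defaultdict(lambda: [0, 0])  # group -> [successes, patients]
    for rec in records:
        totals[rec[key]][0] += rec["success"]
        totals[rec[key]][1] += 1
    return {group: s / n for group, (s, n) in totals.items()}


# Hypothetical trial: one patient assigned to the new therapy never took it.
records = [
    {"assigned": "new", "received": "new", "success": 1},
    {"assigned": "new", "received": "new", "success": 1},
    {"assigned": "new", "received": "new", "success": 0},
    {"assigned": "new", "received": "placebo", "success": 0},
    {"assigned": "placebo", "received": "placebo", "success": 1},
    {"assigned": "placebo", "received": "placebo", "success": 0},
    {"assigned": "placebo", "received": "placebo", "success": 0},
    {"assigned": "placebo", "received": "placebo", "success": 0},
]

itt = success_rate_by(records, "assigned")  # new: 2/4, placebo: 1/4
tr = success_rate_by(records, "received")   # new: 2/3, placebo: 1/5
print(itt, tr)
```

Grouping by assignment keeps the randomized groups intact. Grouping by treatment received moves the non-adherent patient into the placebo group, so the groups being compared are no longer the ones created by randomization.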

Most statisticians favor the ITT principle because it yields the best properties for the test of the null hypothesis of no treatment difference. "If randomized, then analyzed" is the view widely held among clinical trial statisticians and considered a critical component of the ITT Principle to avoid biases due to post-randomization exclusions. ITT also is favored by the federal agencies because a clinical trial is a test of treatment policy, not a test of treatment received. After a meeting to discuss clinical trials methodology, which included US FDA representatives, the International Conference on Harmonization (ICH) published a document entitled "Statistical Principles for Clinical Trials (E9)" that discusses the ITT Principle under various circumstances.

According to the E9 document there are a limited number of circumstances in which randomized patients can be excluded from the full analysis set. Patients who failed to satisfy an entry criterion may be excluded from the full analysis set only under the following circumstances:

  1. The entry criterion was measured prior to randomization
  2. The detection of the relevant eligibility violations can be objectively determined
  3. All patients underwent similar scrutiny for eligibility violations
  4. All patients with detected violations of the eligibility criterion are excluded
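Read together, the four circumstances form a single all-or-nothing condition. The sketch below encodes that reading; the class and field names are hypothetical labels for the items listed above, not terminology from the E9 document.

```python
from dataclasses import dataclass


@dataclass
class EligibilityViolationReview:
    """Hypothetical record of how an eligibility violation was handled."""
    measured_before_randomization: bool
    objectively_determined: bool
    uniform_scrutiny_of_all_patients: bool
    all_detected_violators_excluded: bool


def may_exclude_from_full_analysis_set(review):
    """True only if every one of the four circumstances holds."""
    return all((
        review.measured_before_randomization,
        review.objectively_determined,
        review.uniform_scrutiny_of_all_patients,
        review.all_detected_violators_excluded,
    ))


# Violation found by a chart review applied to only one treatment arm:
# scrutiny was not uniform, so exclusion is not justified.
one_arm_review = EligibilityViolationReview(True, True, False, True)
print(may_exclude_from_full_analysis_set(one_arm_review))  # False
```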

Although the ITT principle generally is preferred, it can be misleading in some circumstances. For example, consider a situation in which a new therapy is compared to placebo. Suppose that a patient experiencing treatment failure is provided emergency medications for safety purposes. If the placebo group has a higher failure rate, more of its patients receive the emergency medications, so the placebo could actually appear to be more beneficial than the new therapy in an ITT analysis (even though this may seem to be a design flaw of the trial). In such a situation, the statistical analysis would be better served with time to treatment failure as the primary endpoint. This analysis would still include all patients, but using time to failure as the primary endpoint eliminates the problem of misleading results from the ITT analysis.

Many factors can contribute to a patient's failure to complete the intended therapy, including severe adverse reactions, disease progression, patient or physician preference for an alternative treatment, and a change of mind. In nearly all of these circumstances, failure to complete the assigned therapy is partially a trial outcome. Patients cannot be eliminated from analysis for such reasons without introducing bias.

10.3 - Summary

In this lesson, among other things, we learned how to:

  • Apply the terms evaluable and inevaluable to patients in a clinical trial appropriately.
  • Differentiate between a pragmatic approach and an explanatory approach and the conditions for which each is appropriate.
  • State the conditions under which data imputation is reasonable.
  • State the advantage of multiple imputation over simple data imputation.
  • Recognize the conditions under which the International Conference on Harmonization would allow a patient’s data to be excluded from statistical analysis.
