Lesson 5: Objectives and Endpoints

Overview

The objectives of a trial must be stated in specific terms. Achieving the objectives should not depend on observing a particular outcome of the trial, e.g., finding a difference in mean weight loss of exactly 2 kg, but on obtaining a valid result. For example, a randomized trial of 4 diets had as its objective, “To assess adherence rates and the effectiveness of 4 popular diets for weight loss and cardiac risk factor reduction” (Dansinger et al. 2005).

The endpoints (or outcomes), determined for each study participant, are the quantitative measurements required by the objectives. In the Dansinger weight loss study, the primary endpoint was identified to be mean absolute change from baseline weight at 1 year. In a cancer chemotherapy trial, the clinical objective is usually improved survival. Survival time is recorded for each patient; the primary outcome reported may be median survival time or it could be five-year survival.

Clinical trials typically have a primary objective or endpoint. Additional objectives and endpoints are secondary. The sample size calculation is based on the primary endpoint. Analysis involving a secondary objective has statistical power that is calculated based on the sample size for the primary objective.
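
As a rough illustration of the last point, the sketch below sizes a two-arm trial on a continuous primary endpoint with the usual normal-approximation formula, then computes the power that the same sample size gives for a secondary endpoint. The effect sizes, standard deviations, and function names are hypothetical and chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def n_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample comparison of means
    (normal approximation, equal allocation, common SD)."""
    z_a = Z.inv_cdf(1 - alpha / 2)
    z_b = Z.inv_cdf(power)
    return ceil(2 * ((z_a + z_b) * sd / delta) ** 2)

def power_for_n(n, delta, sd, alpha=0.05):
    """Approximate power with n per arm (ignores the negligible opposite tail)."""
    z_a = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(abs(delta) / (sd * sqrt(2 / n)) - z_a)

# Hypothetical primary endpoint: 2 kg difference in weight change, SD 5 kg.
n = n_per_arm(delta=2.0, sd=5.0)
# Hypothetical secondary endpoint: 3 mmHg difference in blood pressure, SD 10 mmHg.
secondary_power = power_for_n(n, delta=3.0, sd=10.0)
print(n, round(secondary_power, 2))
```

With these assumed inputs the secondary analysis has noticeably less than the 80% power targeted for the primary endpoint, which is why secondary findings are usually interpreted more cautiously.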

"Hard" endpoints are well-defined in the study protocol, definitive with respect to the disease process, and require no subjectivity. "Soft" endpoints are those that do not relate strongly to the disease process or that require subjective assessments by investigators and/or patients. Some endpoints fall between these two classifications, for example, the grading of x-rays by radiologists and the grading of cell and tissue lesions/tumors by pathologists. These involve some degree of subjectivity, but they are valid and reliable endpoints in most settings.

This lesson will help to differentiate between these types of objectives and endpoints. Ready, let's get started!

Objectives

Upon completion of this lesson, you should be able to:

Identify outcomes that are continuous, binary, event times, counts, ordered or unordered categories and repeated measurements.
State the merits and problems of using a surrogate outcome.
Recognize types of censoring that can occur in studies of time-to-event outcomes.
State the components of a typical dose-finding design.

5.1 - Endpoints

The endpoints used in a clinical trial must correspond to the scientific objectives of the study and the methods of outcome assessment should be accurate (free of bias).

A wide variety of endpoints are used in clinical trials, as displayed below.

Continuous measurements: blood pressures, weight, blood chemistry variables
Event times: time to recurrence of cancer, survival time
Counts: frequency of occurrence of migraine headaches, number of uses of rescue meds for asthma
Binary endpoints: no recurrence/recurrence, major cardiac event yes or no
Ordered categories: absent, mild, moderate, severe pain; NYHA status
Unordered categories: categories of adverse experiences: GI, cardiac, etc.

Some endpoints are assessed many times during the study, leading to repeated measurements.

5.2 - Special Considerations for Event Times

Event Times

Event times often are useful endpoints in clinical trials. Examples include survival time from onset of diagnosis, time until progression from one stage of disease to another, and time from surgery until hospital discharge. In each case, time is measured from study entry until the event occurs. With an endpoint that is based on an event time, there always is the chance of censoring. An event time is censored if there is some amount of follow-up on a subject, but the event is not observed because of loss to follow-up, death from a cause other than the trial endpoint, study termination, or other reasons unrelated to the endpoint of interest. This is known as right censoring and occurs frequently in studies of survival.

Right-censoring example

Consider the table above, which displays time until infection for Patients 1-6. In some cases, the event did not occur: Patient 1 (from top) was followed for a year and was censored at the end of the study. The second patient experienced an infection at approximately 325 days. Patients 3 and 6 dropped out of the study and were censored when this occurred.

Left censoring occurs when the initiation time for the subject, such as time of diagnosis, is unknown. Interval censoring occurs when the subject is not followed for a period of time during the trial and it is unknown if the event occurred during that period.

Right Censoring Types

There are three types of right censoring that are described in the statistical literature.

Type I censoring occurs when all subjects are scheduled to begin the study at the same time and end the study at the same time. This type of censoring is common in laboratory animal experiments, but unlikely in human trials.

Type II censoring occurs when all subjects begin the study at the same time and the study is terminated when a predetermined proportion of subjects have experienced the event.

Type III censoring occurs when the censoring is random, which is the case in clinical trials because of staggered entry (not every patient enters the study on the first day) and unequal follow-up on subjects.
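
The three schemes differ only in what determines the censoring time. A minimal sketch, assuming hypothetical latent event times (in days) and hypothetical function names, shows how each scheme turns a latent event time into an observed (time, event indicator) pair:

```python
def type_i(event_times, study_length):
    """Type I: common start, fixed study end; censor at study_length."""
    return [(min(t, study_length), t <= study_length) for t in event_times]

def type_ii(event_times, n_events):
    """Type II: common start; stop once n_events events have been observed."""
    stop = sorted(event_times)[n_events - 1]  # time of the n_events-th event
    return [(min(t, stop), t <= stop) for t in event_times]

def random_censoring(event_times, entry_times, study_end):
    """Random (Type III): staggered entry gives each subject a different
    maximum follow-up, study_end minus entry time."""
    observed = []
    for t, entry in zip(event_times, entry_times):
        follow_up = study_end - entry
        observed.append((min(t, follow_up), t <= follow_up))
    return observed

latent = [120, 400, 250, 90, 610]  # hypothetical days from entry to event
print(type_i(latent, study_length=365))
print(type_ii(latent, n_events=3))
print(random_censoring(latent, entry_times=[0, 30, 200, 300, 10], study_end=365))
```

In each pair the second element is True for an observed event and False for a censored time. Under staggered entry, subjects who enroll late have short follow-up and are more likely to be censored.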

Statistical methods appropriate for event time data, survival analyses, do not discard the right-censored observations. Instead, the methods account for the knowledge that the event did not occur in a subject up to the censoring time. Survival methods include life table analysis, Kaplan-Meier survival curves, logrank and Wilcoxon tests, and proportional hazards regression (more discussion on these in a later lesson).

In order to conduct event-time analyses, two measurements must be recorded, namely, the follow-up time for a subject and an indicator variable as to whether this is an event time or a censoring time. These statistical methods assume that the censoring mechanisms and the event are independent. If this is not the case, e.g., patients have a tendency to be censored prior to the occurrence of the event, the event rate will be underestimated.
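
A minimal sketch of how these two recorded quantities feed a survival analysis, using the Kaplan-Meier product-limit estimator. The follow-up times below are hypothetical (they are not the data from the right-censoring example), and the function name is an assumption for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each subject
    events : True if the time is an event time, False if it is censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # events and censored subjects both leave the risk set
    return curve

# Hypothetical follow-up in days for six subjects; False marks a censored time.
times = [365, 325, 150, 210, 300, 90]
events = [False, True, False, True, True, False]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored subjects are not discarded: they count in the risk set up to their censoring time, so they shrink the denominator for later events without contributing an event themselves.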

When the event of interest is death, it is common to examine two different endpoints, namely, death from all causes and death primarily due to the disease.

At first glance, death primarily due to the disease appears to be the more appropriate of the two. It is, however, susceptible to bias because the assumption of independent causes of death may not be valid. For example, subjects with a life-threatening cancer are prone to death due to myocardial infarction. It can also be very difficult to determine the exact cause of death.

5.3 - Surrogate Endpoints

A surrogate endpoint is one that is measured in place of the biologically definitive or clinically meaningful endpoint. A surrogate endpoint usually tracks the progress or extent of the disease.

Investigators choose a surrogate endpoint when the definitive endpoint is inaccessible due to cost, time, or difficulty of measurement. The problem with a surrogate endpoint in a clinical trial is determining whether it is valid, i.e., whether it is strongly associated with the definitive outcome.

Piantadosi (2005) gives the following characteristics of a useful surrogate endpoint:

It can be measured simply and without invasive procedures
It is related to the causal pathway for the definitive endpoint
It yields the same statistical inference as that for the definitive endpoint
It should be responsive to the effects of treatments

Disease >>> Surrogate Endpoint >>> Definitive Endpoint

The disease affects the surrogate endpoint, which in turn affects the definitive endpoints.

Examples of surrogate endpoints include CD4 counts in AIDS patients, tumor size reduction in cancer patients, blood pressure in cardiovascular disease, and intraocular pressure in glaucoma patients. The response variables in translational research are surrogate endpoints.

Surrogate endpoints can potentially shorten and increase the efficiency of clinical trials. If, however, the surrogate is imprecisely associated with definitive endpoints, the use of the surrogate can lead to misleading results.

5.4 - Considerations for Dose Finding Studies

The terms describing several types of early clinical studies are given below.

Treatment Mechanism: Early developmental trial that investigates mechanism of treatment effect, e.g., a pharmacokinetics study of absorption and elimination of the drug from the human body

Phase I: Imprecise term for dose-ranging studies

Dose-escalation: Design or component of a design that specifies methods for increases in dose for subsequent subjects

Dose-ranging: Design that tests some or all of a prespecified set of doses (fixed design points)

Dose-finding: Design that titrates dose to a prespecified optimum based on biological or clinical considerations

Definitions from Piantadosi (2005)

Dose-finding (DF) trials are Phase I studies with the objective of determining the optimal biological dose (OBD) of a drug. In order to determine the dose with the highest potential for efficacy in the patient population that still meets safety criteria, dose-finding studies are typically conducted by administering sequentially rising doses to successive groups of individuals. Such studies may be conducted in healthy volunteers or in patients with the disease.

A question the investigator must answer in designing a dose-finding study is how to characterize an optimum dose. Should the optimum dose be selected on the basis of the highest therapeutic index (the maximal separation between risk and benefit)? Or is the optimal dose the level that maximizes therapeutic benefit while maintaining risk below a predetermined threshold? What measures will denote risk and benefit?

An optimal dose can be selected on the basis of efficacy alone, such as when a minimum effective dose (MED) is chosen for a pain-relieving medication and defined as the dose which eliminates mild-to-moderate pain in 80% of trial participants. In another case, the optimal dose might be selected as the highest dose that is associated with serious side effects in no more than 1 of 20 patients. This would be a maximum nontoxic dose (MND). In cancer therapeutics, the optimal dose for a cytotoxic drug designed to shrink tumors could be defined as the level that yields serious but reversible toxicity in no more than 30% of the patients. This is a maximum tolerated dose (MTD). Care in defining the conditions for optimality is critical to a dose-finding study.

Most DF trials are sequential studies such that the number of subjects is itself an outcome of the trial. Convincing evidence characterizing the relationship between dose and safety can be obtained after studying a small set of patients. Hence, sample size is not a major concern in DF trials.

An idealized DF study would be similar to an animal bioassay design, with K fixed doses at increasing levels, d₁, d₂, ... , d_K. The hypothesized optimal dose would lie between d₁ and d_K. The n participants would be randomized to each of the K dose groups and the binary response of toxicity would be noted for each participant. A mathematical model could then be fit to the proportional responses over the doses such that the optimal dose could be determined.

Would you agree to participate in such a study? Think carefully ...

Most likely, your answer is no because you would not want to risk being assigned to the highest dose level of this unproven drug as your first treatment. There is a principle here: it is unethical to treat humans at high doses of a drug without any prior knowledge of their responses at lower levels. Furthermore, ethics compel a design that minimizes the number of patients treated with both low ineffective doses and high toxic doses.

Thus, along with defining optimality, a DF study design usually includes a method for determining the starting dose for the patient, specification of dose increments and cohort sizes, definition of dose-limiting toxicities as well as the decision rules for escalation and de-escalation of the dose.

Continual Reassessment Method

The continual reassessment method (CRM) fits a mathematical model to the data observed during the study and uses it to estimate an optimal dose via extrapolation or interpolation. The next cohort of patients is assigned to the estimated optimal dose. A study using CRM would not have an a priori defined set of doses; thus it is a dose-finding study. The CRM itself can be thought of as an algorithm for updating the best guess regarding the optimal dose. Bayesian approaches have also been incorporated into the CRM, and the method is applicable to many types of responses.
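
The updating idea can be sketched in a few lines. This is a simplified, CRM-style illustration and not a validated design: the one-parameter dose-toxicity curve p(d) = 1 - exp(-beta * d), the flat prior over a grid of beta values, the 30% target toxicity rate, and the function name crm_next_dose are all assumptions chosen for illustration.

```python
from math import exp, log

def crm_next_dose(doses, toxicities, target=0.30, grid=None):
    """One CRM-style update under the assumed model p(d) = 1 - exp(-beta * d),
    with a flat prior on a grid of beta values."""
    if grid is None:
        grid = [k / 1000 for k in range(1, 2001)]  # beta in (0, 2]
    weights = []
    for beta in grid:
        likelihood = 1.0
        for d, tox in zip(doses, toxicities):
            p = 1 - exp(-beta * d)
            likelihood *= p if tox else (1 - p)
        weights.append(likelihood)
    total = sum(weights)
    # Posterior mean of beta under the flat grid prior.
    beta_hat = sum(b * w for b, w in zip(grid, weights)) / total
    # Invert the curve: dose whose modeled toxicity probability equals the target.
    return -log(1 - target) / beta_hat

# Hypothetical data: dose given to each patient and whether toxicity occurred.
doses = [1, 1, 1, 2, 2, 2]
toxicities = [0, 0, 0, 0, 1, 0]
print(round(crm_next_dose(doses, toxicities), 2))
```

Each new cohort's outcomes are appended to the data and the function is called again, so the recommended dose moves down after toxicities and up after well-tolerated cohorts.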

In contrast to a dose-finding study, a dose-ranging study uses pre-specified design points.

Fibonacci Dose-Ranging Designs

Fibonacci, a thirteenth-century Italian mathematician, popularized the number sequence 1, 1, 2, 3, 5, 8, 13, 21, 34 ... (a number in the sequence is the sum of the two previous numbers).

A fixed dosing scheme can be based on the Fibonacci sequence. For example, the first cohort of n participants is assigned dose D, the initial dose. If they tolerate dose D well, the next cohort of n participants is assigned dose 2D. If all goes well with the second cohort, then a third cohort is assigned dose 3D, the fourth cohort is assigned dose 5D, etc. The process is discontinued when one of the cohorts exhibits toxicity. Numerous modifications to the Fibonacci scheme have been proposed, such as allowing for de-escalation as well as escalation.
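
The unmodified scheme can be sketched as follows. The function names, the 10 mg starting dose, and the toxicity rule passed to escalate are hypothetical and serve only to make the example runnable.

```python
def fibonacci_doses(initial_dose, n_levels):
    """Dose levels D, 2D, 3D, 5D, 8D, ... as multiples of the initial dose."""
    a, b = 1, 2
    levels = []
    for _ in range(n_levels):
        levels.append(a * initial_dose)
        a, b = b, a + b
    return levels

def escalate(levels, tolerated):
    """Walk up the levels and stop at the first cohort that shows toxicity.

    tolerated: hypothetical callable returning True if a cohort tolerated the dose.
    Returns the doses actually administered.
    """
    given = []
    for dose in levels:
        given.append(dose)
        if not tolerated(dose):
            break
    return given

levels = fibonacci_doses(initial_dose=10, n_levels=6)  # hypothetical 10 mg start
print(levels)
# Hypothetical rule: cohorts tolerate any dose below 50 mg.
print(escalate(levels, tolerated=lambda dose: dose < 50))
```

Because the design points are fixed in advance, the scheme is dose-ranging: the data decide only where along the predefined ladder the escalation stops.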

5.5 - Summary

In this lesson, among other things, we learned how to:

identify outcomes that are continuous, binary, event times, counts, ordered or unordered categories, and repeated measurements.
state the merits and problems of using a surrogate outcome.
recognize types of censoring that can occur in studies of time-to-event outcomes.
state the components of a typical dose-finding design.

Look for any homework assignments listed for this lesson in the Canvas course site...
