In many clinical trials involving serious diseases, such as cancer and AIDS, a primary objective is to evaluate the survival experience of the cohort. In clinical trials not involving serious diseases, survival may not be an outcome, but other time-to-event outcomes may be important. Examples include time to hospital discharge, time to disease relapse, time to getting another migraine, time to progression of disease, etc.
The Kaplan-Meier survival curve is a nonparametric technique for estimating the probability of survival at any point in time, even in the presence of censoring (e.g., the study is completed before the patient experiences the event). This statistical approach is nonparametric because it does not assume any particular distribution for the data, such as lognormal, exponential, or Weibull. It is a "robust" procedure because it is not adversely affected by one or more unusual data points.
To construct the Kaplan-Meier survival curve, the distinct observed failure times are ordered from smallest to largest. In a sample of n patients, denote these failure times as \(t_1, t_2, \dots , t_K\). For convenience, let \(t_0 = 0\) denote the start time and let \(t_{K+1} = \infty\).
At the \(k^{\text{th}}\) failure time, \(t_k\), record the number of failures, \(d_k\), and the number of patients who were at risk for failure immediately prior to \(t_k\), denoted \(n_k\). Notice that patients who are lost to follow-up (censored) prior to time \(t_k\) are not included in \(n_k\).
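As a minimal sketch of this bookkeeping, the following Python derives \(t_k\), \(d_k\), and \(n_k\) from raw follow-up data. The function name `risk_table`, the 0/1 event coding, and the small data set are hypothetical and chosen only for illustration. The sketch also assumes the common convention that a patient censored exactly at \(t_k\) is still counted as at risk at \(t_k\).

```python
from collections import Counter

def risk_table(times, events):
    """Return (t_k, d_k, n_k) for each distinct failure time.

    times  : observed follow-up time for each patient
    events : 1 if the patient failed at that time, 0 if censored
    """
    # d_k: number of failures observed at each distinct failure time
    failures = Counter(t for t, e in zip(times, events) if e)
    table = []
    for t_k in sorted(failures):
        # n_k: patients still under observation immediately prior to t_k
        # (assumption: a patient censored exactly at t_k counts as at risk)
        n_k = sum(1 for t in times if t >= t_k)
        table.append((t_k, failures[t_k], n_k))
    return table

# Hypothetical data: seven patients, three of them censored
times = [5, 8, 8, 12, 15, 15, 20]
events = [1, 0, 1, 1, 0, 1, 0]
for t_k, d_k, n_k in risk_table(times, events):
    print(t_k, d_k, n_k)
```

In this toy data the patient censored at time 8 is counted in \(n_k\) at \(t_k = 8\) but is excluded from \(n_k\) at every later failure time.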
The algebraic formula for the Kaplan-Meier survival probability at time t is:
\(\hat{S}(t)=1, \quad t_0 \le t < t_1\)
\(\hat{S}(t)= \prod_{k'=1}^{k}\left( 1-\frac{d_{k'}}{n_{k'}} \right), \quad t_k \le t < t_{k+1}, \quad k=1, 2, \dots , K \)
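The calculation of \(\hat{S}(t)\) utilizes conditional probability: each factor \(1 - d_k/n_k\) estimates the probability of surviving beyond \(t_k\), given that the person has survived up to \(t_k\). Multiplying these factors yields the estimate of \(S(t)\), the probability of surviving beyond time t, which is what the Kaplan-Meier curve depicts.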
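Written recursively, a form that may make the conditional-probability structure easier to see, the product formula is equivalent to:

\[\hat{S}(t_k) = \hat{S}(t_{k-1}) \left( 1-\frac{d_k}{n_k} \right), \qquad \hat{S}(t_0) = 1\]

Here \(d_k/n_k\) is the estimated conditional probability of failing at \(t_k\) among the \(n_k\) patients still at risk.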
An example with an initial sample of n = 100 patients is as follows:
k | \(t_k\) (days) | \(d_k\) | \(n_k\) | \(\hat{S}(t_k)\) | Interpretation |
1 | 127 | 1 | 98 | 0.99 = (1 - 1/98) | probability of surviving beyond day 127 |
2 | 154 | 2 | 91 | 0.97 = (1 - 1/98)(1 - 2/91) | probability of surviving beyond day 154 |
3 | 195 | 1 | 84 | 0.96 = (1 - 1/98)(1 - 2/91)(1 - 1/84) | probability of surviving beyond day 195 |
4 | 221 | 3 | 75 | 0.92 = (1 - 1/98)(1 - 2/91)(1 - 1/84)(1 - 3/75) | probability of surviving beyond day 221 |
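The table entries can be checked with a short running product. The sketch below uses only the \(d_k\) and \(n_k\) values from the table; the variable names are illustrative.

```python
# Values taken from the table: failure days, failures d_k, and numbers at risk n_k
days = [127, 154, 195, 221]
d = [1, 2, 1, 3]
n = [98, 91, 84, 75]

s = 1.0          # S_hat is 1 before the first failure time
estimates = []
for day, d_k, n_k in zip(days, d, n):
    s *= 1 - d_k / n_k   # multiply by the conditional survival factor at t_k
    estimates.append(s)
    print(f"day {day}: S_hat = {s:.4f}")
```

Rounded to two decimal places, the four estimates are 0.99, 0.97, 0.96, and 0.92, matching the table.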
Note that the probability estimate does not change until a failure event occurs. Thus, the Kaplan-Meier survival curve gives the appearance of a step function when graphed. Also, censored values do not affect the numerator, \(d_k\), but do affect the denominator, \(n_k\), at subsequent failure times.
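To illustrate the step-function behavior, the following sketch looks up the estimate at an arbitrary day using the values from the table above. The helper name `survival_at` is hypothetical, and the sketch assumes the drop takes effect at the failure time \(t_k\) itself, so the estimate is constant from \(t_k\) up to (but not including) the next failure time.

```python
from bisect import bisect_right
from itertools import accumulate
from operator import mul

# Values taken from the table above
failure_times = [127, 154, 195, 221]
d = [1, 2, 1, 3]
n = [98, 91, 84, 75]

# Cumulative products of (1 - d_k / n_k): the estimate at each failure time
s_hat = list(accumulate((1 - dk / nk for dk, nk in zip(d, n)), mul))

def survival_at(t):
    """Kaplan-Meier estimate at day t, treated as a step function."""
    k = bisect_right(failure_times, t)  # number of failure times <= t
    return 1.0 if k == 0 else s_hat[k - 1]

for t in (100, 127, 150, 200):
    print(t, round(survival_at(t), 2))
```

Days 127 and 150 return the same value because no failure occurs between them, while day 100 returns 1 because it precedes the first failure.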
A graphical display of the Kaplan-Meier survival curve is as follows:
Each step down represents the occurrence of an event.