6b.7 - Sample Size and Power
For a continuous outcome that is approximately normally distributed in an equivalence trial, the number of patients needed in the active control arm, \(n_A\), where \(AR = \dfrac{n_E}{n_A}\), to achieve \(100 \left(1 - \beta \right)\%\) statistical power with an \(\alpha\)-level significance test is approximated by:
\( n_A = \left( \frac{AR+1}{AR}\right) \left(t_{n_{1}+n_{2}-2, 1-\alpha}+ t_{n_{1}+n_{2}-2, 1-\beta}\right)^2 \sigma^2 / \left(\Psi - |\Delta|\right)^2 \)
Notice the difference in the t percentiles between this formula and that for a superiority comparison, described earlier. The difference arises because two one-sided tests are performed.
Most investigators assume that the true difference in population means, \(\Delta = \mu_E - \mu_A\), is zero in this sample size formula. This is an optimistic assumption and may not be realistic: any nonzero \(\Delta\) shrinks the denominator \(\left(\Psi - |\Delta|\right)^2\) and therefore increases the required sample size.
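The effect of assuming \(\Delta = 0\) can be illustrated with a minimal sketch of the formula above. Note the assumptions: the function name and the example values (\(\sigma = 10\), \(\Psi = 5\), \(\alpha = 0.05\), \(\beta = 0.10\)) are hypothetical, and standard normal percentiles are substituted for the t percentiles, which avoids iterating on the degrees of freedom but gives a slightly smaller answer than the t-based formula.

```python
import math
from statistics import NormalDist


def n_control_continuous(sigma, psi, delta=0.0, alpha=0.05, beta=0.10, ar=1.0):
    """Approximate n_A for a continuous outcome in an equivalence trial.

    Assumption: z percentiles stand in for the t percentiles of the formula.
    """
    margin = psi - abs(delta)
    if margin <= 0:
        raise ValueError("|delta| must be smaller than psi")
    z = NormalDist().inv_cdf  # standard normal percentile function
    n_a = ((ar + 1) / ar) * (z(1 - alpha) + z(1 - beta)) ** 2 * sigma ** 2 / margin ** 2
    return math.ceil(n_a)  # round up to a whole patient


# Hypothetical inputs: sigma = 10, psi = 5, equal allocation (AR = 1).
n_null = n_control_continuous(sigma=10, psi=5, delta=0)   # optimistic: delta = 0
n_shift = n_control_continuous(sigma=10, psi=5, delta=2)  # true difference of 2
print(n_null, n_shift)
```

With these inputs, allowing for a true difference of 2 units nearly triples the control-arm size relative to the \(\Delta = 0\) assumption, which is why that assumption deserves scrutiny.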
For a binary outcome, the zone of equivalence for the difference in population proportions between the experimental therapy and the active control, \(p_E - p_A\), is defined by the interval \((-\Psi, +\Psi)\). The number of patients needed in the active control arm, \(n_A\), where \(AR = \dfrac{n_E}{n_A}\), to achieve \(100 \left(1 - \beta \right)\%\) statistical power with an \(\alpha\)-level significance test is approximated by:
\( n_A = \left( \frac{AR+1}{AR}\right) \left(z_{1-\alpha}+ z_{1-\beta}\right)^2 \bar{p}(1-\bar{p}) / \left(\Psi - |p_E-p_A|\right)^2 \)
where
\( \bar{p}= \left( AR \cdot p_E+p_A\right) / (AR+1) \)
How does this formula compare to FFDRG p. 189? For the value of p, our text uses the control group value, assuming that \(p_E - p_A = 0\).
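As a rough sketch of the binary-outcome formula, the following computes \(\bar{p}\) and \(n_A\). The function name and the example inputs (\(p_E = p_A = 0.30\), \(\Psi = 0.10\)) are hypothetical and chosen only for illustration. When \(p_E = p_A\), \(\bar{p}\) reduces to the control group value, whatever the allocation ratio.

```python
import math
from statistics import NormalDist


def n_control_binary(p_e, p_a, psi, alpha=0.05, beta=0.10, ar=1.0):
    """Approximate n_A for a binary outcome in an equivalence trial."""
    margin = psi - abs(p_e - p_a)
    if margin <= 0:
        raise ValueError("|p_E - p_A| must be smaller than psi")
    z = NormalDist().inv_cdf  # standard normal percentile function
    p_bar = (ar * p_e + p_a) / (ar + 1)  # allocation-weighted average proportion
    n_a = ((ar + 1) / ar) * (z(1 - alpha) + z(1 - beta)) ** 2 * p_bar * (1 - p_bar) / margin ** 2
    return math.ceil(n_a)


# Hypothetical inputs: both arms at 30%, zone of equivalence +/- 10 percentage points.
n_equal = n_control_binary(p_e=0.30, p_a=0.30, psi=0.10)
# Same zone, but the experimental arm is truly 3 points worse.
n_worse = n_control_binary(p_e=0.33, p_a=0.30, psi=0.10)
print(n_equal, n_worse)
```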
For a time-to-event outcome, the zone of equivalence for the hazard ratio between the experimental therapy and the active control, \(\Lambda\), is defined by the interval \(\left(\dfrac{1}{\Psi}, +\Psi\right)\), where \(\Psi > 1\). The number of patients who need to experience the event to achieve \(100 \left(1 - \beta \right)\%\) statistical power with an \(\alpha\)-level significance test is approximated by:
\( E = \left( \frac{(AR+1)^2}{AR}\right) \left(z_{1-\alpha}+ z_{1-\beta}\right)^2 / \left( \text{log}_e \left(\Psi / \Lambda \right)\right)^2 \)
If \(p_E\) and \(p_A\) represent the anticipated failure rates in the two treatment groups, then the sample sizes can be determined from \(n_A = \dfrac{E}{\left(AR\times p_E + p_A\right)}\) and \(n_E = AR \times n_A\).
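The two steps, first the required number of events and then the arm sizes, can be sketched as follows. The function names and the example inputs (\(\Psi = 1.25\), \(\Lambda = 1\), failure rates of 0.40 in both arms) are hypothetical; the arithmetic follows the formulas above directly.

```python
import math
from statistics import NormalDist


def equivalence_events(psi, hazard_ratio=1.0, alpha=0.05, beta=0.10, ar=1.0):
    """Approximate number of events E for a time-to-event equivalence trial."""
    if psi <= 1 or hazard_ratio >= psi:
        raise ValueError("need psi > 1 and hazard_ratio < psi")
    z = NormalDist().inv_cdf  # standard normal percentile function
    log_margin = math.log(psi / hazard_ratio)  # natural log of psi / Lambda
    e = ((ar + 1) ** 2 / ar) * (z(1 - alpha) + z(1 - beta)) ** 2 / log_margin ** 2
    return math.ceil(e)


def arm_sizes(events, p_e, p_a, ar=1.0):
    """Convert required events into arm sizes using anticipated failure rates."""
    n_a = math.ceil(events / (ar * p_e + p_a))
    n_e = math.ceil(ar * n_a)
    return n_a, n_e


# Hypothetical inputs: psi = 1.25, true hazard ratio 1, 40% failures in each arm.
events = equivalence_events(psi=1.25, hazard_ratio=1.0)
n_a, n_e = arm_sizes(events, p_e=0.40, p_a=0.40)
print(events, n_a, n_e)
```

Because \(E\) counts events rather than patients, lower anticipated failure rates translate into larger arms for the same \(E\).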
If a hazard function is assumed to be constant during the follow-up period [0, T], then it can be expressed as \(\lambda(t) = \lambda = \dfrac{-\text{log}_e(1 - p)}{T}\). In such a situation, the hazard ratio for comparing two groups is \(\Lambda = \dfrac{\text{log}_e\left(1 - p_E\right)}{\text{log}_e\left(1 - p_A\right)}\). The same formula can be applied, with different values of \(p_E\) and \(p_A\), to determine \(\Psi\).
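A minimal sketch of these conversions under the constant-hazard assumption is given below. The function names and the numeric inputs are hypothetical. The second call shows one way to read the last sentence above: pairing the control failure rate with the largest experimental failure rate still regarded as equivalent yields a candidate \(\Psi\).

```python
import math


def constant_hazard(p, follow_up):
    """Hazard rate implied by failure probability p over [0, follow_up]."""
    return -math.log(1 - p) / follow_up


def hazard_ratio_from_rates(p_e, p_a):
    """Hazard ratio (experimental vs. control) under constant hazards."""
    return math.log(1 - p_e) / math.log(1 - p_a)


# Hypothetical inputs: 40% failures anticipated in both arms over T = 5 years.
lam_a = constant_hazard(0.40, follow_up=5.0)
anticipated_hr = hazard_ratio_from_rates(p_e=0.40, p_a=0.40)

# Hypothetical margin: an experimental failure rate of 50% is the worst
# outcome still considered equivalent to a control rate of 40%.
psi = hazard_ratio_from_rates(p_e=0.50, p_a=0.40)
print(round(lam_a, 4), anticipated_hr, round(psi, 4))
```

The follow-up time \(T\) cancels in the ratio, so only the two failure probabilities are needed to obtain \(\Lambda\) or \(\Psi\).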
For a continuous outcome that is approximately normally distributed in a non-inferiority trial, the number of subjects needed in the active control arm, \(n_A\), where \(AR = \dfrac{n_E}{n_A}\), to achieve \(100 \left(1 - \beta \right)\%\) statistical power with an \(\alpha\)-level significance test is approximated by:
\( n_A = \left( \frac{AR+1}{AR}\right) \left(t_{n_{1}+n_{2}-2, 1-\alpha}+ t_{n_{1}+n_{2}-2, 1-\beta}\right)^2 \sigma^2 / \left(\Psi - |\Delta|\right)^2 \)
Notice that the sample size formulae for non-inferiority trials are exactly the same as the sample size formulae for equivalence trials. This is because of the one-sided testing for both types of designs (even though an equivalence trial involves two one-sided tests). Also notice that the choice of \(z_{1-\alpha}\) in the formulas above assumes a one-sided test or two one-sided tests, whereas regulatory agencies require, and our FFDRG text uses, the z value that would have been used for a two-sided hypothesis test, \(z_{1-\alpha/2}\). In homework, be sure to state any assumptions and the approach you are taking.
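To see how much the choice of z value matters, the sketch below compares the multiplier \(\left(z + z_{1-\beta}\right)^2\) under the one-sided percentile \(z_{1-\alpha}\) and the two-sided percentile \(z_{1-\alpha/2}\). The values \(\alpha = 0.05\) and \(\beta = 0.10\) are hypothetical. Every other term in the formulas above is unchanged, so the ratio of the two multipliers is also the ratio of the resulting sample sizes.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # standard normal percentile function

alpha, beta = 0.05, 0.10  # hypothetical design values

one_sided = (z(1 - alpha) + z(1 - beta)) ** 2      # uses z_{1 - alpha}
two_sided = (z(1 - alpha / 2) + z(1 - beta)) ** 2  # uses z_{1 - alpha/2}

inflation = two_sided / one_sided
print(round(one_sided, 3), round(two_sided, 3), round(inflation, 3))
```

For these values the two-sided convention increases the required sample size by a little over 20 percent, which is why the approach taken should be stated explicitly.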