Example
Suppose the rate of disease in an unexposed population is 10 per 100 person-years. You hypothesize that an exposure has a relative risk of 2.0. How many persons must you enroll, assuming half are exposed and half are unexposed, to detect this increased risk with an alpha of 0.05 and power of 90%?
Formula
We are interested in testing the following hypotheses:
\(\begin{array}{l}
\mathrm{H}_{0}\colon \pi_{1}=\pi_{2} \\
\mathrm{H}_{1}\colon \pi_{1}-\pi_{2}=\delta
\end{array}\)
But it is usually more convenient to consider the ratio (i.e. relative risk = λ), so we can consider this hypothesis:
\(\begin{array}{l}
\mathrm{H}_{0}: \pi_{1}=\pi_{2} \\
\mathrm{H}_{1}: \pi_{1} / \pi_{2}=\lambda
\end{array}\)
The formulas needed to calculate the total sample size are:
\(\displaystyle{n=\frac{r+1}{r(\lambda-1)^{2} \pi^{2}}\left[z_{\alpha} \sqrt{(r+1) p_{c}\left(1-p_{c}\right)}+z_{\beta} \sqrt{\lambda \pi(1-\lambda \pi)+r \pi(1-\pi)}\right]^{2}}\),
and
\(\displaystyle{p_{c}=\frac{\pi(r \lambda+1)}{r+1}}\)
where
\(\pi=\pi_{2}\) is the proportion in the reference group
\(r=n_{1} / n_{2}\) is the ratio of the sample sizes in the two groups
\(p_{c}\) is the common proportion over the two groups
When r = 1 (equal-sized groups), the formula for \(p_{c}\) reduces to:
\(p_{c}=\frac{\pi(\lambda+1)}{2}=\frac{\pi_{1}+\pi_{2}}{2}\)
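As a quick worked step for the example (reference proportion \(\pi=0.10\), relative risk \(\lambda=2.0\), equal groups so \(r=1\)), the common proportion is simply the average of the two group proportions:
\(\displaystyle{p_{c}=\frac{0.10\,(2.0+1)}{2}=\frac{0.20+0.10}{2}=0.15}\)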
For our example, n = 448, that is, 224 in each group.
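The sample-size formula can be evaluated directly. The following is a minimal sketch, not a definitive implementation: the function name `cohort_total_n` and its argument names are hypothetical, and it assumes a one-sided test with \(z_{\alpha}\) and \(z_{\beta}\) taken from the standard normal distribution. Evaluated this way, the example inputs give a total of roughly 433 rather than the 448 quoted from the table. Published tables may round up within each group or use a slightly different variance term, so treat the sketch as a check on the order of magnitude rather than a replacement for the table.

```python
from math import sqrt
from statistics import NormalDist


def cohort_total_n(pi, lam, r=1.0, alpha=0.05, power=0.90):
    """Total sample size from the formula as printed (one-sided test assumed).

    pi    : proportion with the outcome in the reference group
    lam   : relative risk to be detected
    r     : ratio of group sizes n1 / n2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)       # normal quantile for the power
    p_c = pi * (r * lam + 1) / (r + 1)         # common proportion
    bracket = (
        z_alpha * sqrt((r + 1) * p_c * (1 - p_c))
        + z_beta * sqrt(lam * pi * (1 - lam * pi) + r * pi * (1 - pi))
    )
    return (r + 1) / (r * (lam - 1) ** 2 * pi ** 2) * bracket ** 2


# Example inputs: 10 per 100 in the unexposed, relative risk 2.0, equal groups
print(round(cohort_total_n(pi=0.10, lam=2.0), 1))
```

In practice the result would be rounded up to a whole number of subjects per group.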
The table below can also be used to estimate the sample size:
These tables give requirements for a one-sided test directly. For two-sided tests, use the table corresponding to half the required significance level. Note that \(\pi\) is the proportion for the reference group (the denominator) and \(\lambda\) is the relative risk to be tested.

(a) 5% significance, 90% power. Columns give \(\pi\); rows give \(\lambda\); entries are total sample sizes.

\(\lambda\) | 0.001 | 0.005 | 0.010 | 0.050 | 0.100 | 0.150 | 0.200 | 0.500 | 0.900 |
---|---|---|---|---|---|---|---|---|---|
0.10 | 23 244 | 4 636 | 2 310 | 488 | 216 | 138 | 100 | 30 | 8 |
0.20 | 32 090 | 6 398 | 3 188 | 618 | 298 | 190 | 136 | 40 | 10 |
0.30 | 45 406 | 9 052 | 4 508 | 874 | 418 | 268 | 192 | 56 | 14 |
0.40 | 66 554 | 13 268 | 6 606 | 1 278 | 612 | 390 | 278 | 78 | 18 |
0.50 | 102 678 | 20 466 | 10 190 | 1 968 | 940 | 598 | 426 | 118 | 26 |
0.60 | 171 126 | 34 104 | 16 976 | 3 274 | 1 562 | 990 | 706 | 192 | 38 |
0.70 | 323 228 | 64 410 | 32 058 | 6 176 | 2 940 | 1 862 | 1 322 | 352 | 62 |
0.80 | 770 020 | 153 422 | 76 348 | 14 688 | 6 980 | 4 412 | 3 128 | 814 | 126 |
0.90 | 3 251 102 | 647 690 | 322 264 | 61 924 | 29 380 | 18 534 | 13 110 | 3 336 | 450 |
1.10 | 3 593 120 | 715 666 | 355 984 | 68 240 | 32 272 | 20 282 | 14 288 | 3 496 | 292 |
1.20 | 941 030 | 187 410 | 93 208 | 17 846 | 8 426 | 5 286 | 3 716 | 890 | |
1.30 | 437 234 | 87 068 | 43 298 | 8 280 | 3 904 | 2 444 | 1 714 | 402 | |
1.40 | 256 630 | 51 098 | 25 406 | 4 854 | 2 284 | 1 428 | 1 000 | 228 | |
1.50 | 171 082 | 34 062 | 16 934 | 3 232 | 1 518 | 948 | 662 | 148 | |
1.60 | 123 556 | 24 596 | 12 226 | 2 330 | 1 094 | 680 | 474 | 104 | |
1.80 | 74 842 | 14 896 | 7 402 | 1 408 | 658 | 408 | 284 | 58 | |
2.00 | 51 318 | 10 212 | 5 074 | 962 | 448 | 278 | 192 | | |
3.00 | 17 102 | 3 400 | 1 688 | 316 | 146 | 88 | 60 | | |
4.00 | 9 498 | 1 886 | 934 | 174 | 78 | 46 | 30 | | |
5.00 | 6 419 | 1 272 | 630 | 116 | 52 | 30 | | | |
10.00 | 2 318 | 458 | 226 | 40 | | | | | |
20.00 | 992 | 194 | 94 | | | | | | |
(Tables from Woodward, M. Epidemiology: Study Design and Data Analysis. Boca Raton: Chapman and Hall, 2013)
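The rule for two-sided tests stated with the tables (use the table for half the required significance level) corresponds to a larger normal critical value. A minimal sketch, assuming standard normal quantiles; the helper name `critical_z` is hypothetical:

```python
from statistics import NormalDist


def critical_z(alpha, two_sided=False):
    """Normal critical value for a one-sided or two-sided test at level alpha."""
    tail = alpha / 2 if two_sided else alpha  # two-sided splits alpha over both tails
    return NormalDist().inv_cdf(1 - tail)


print(round(critical_z(0.05), 3))                  # one-sided 5%
print(round(critical_z(0.05, two_sided=True), 3))  # two-sided 5%, same as one-sided 2.5%
```

A two-sided 5% test therefore needs the one-sided 2.5% table, which is not the table (a) reproduced here.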
Stop and Think!
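- What happens to the required sample size as the incidence rate \((\pi)\) increases?
- What happens to the required sample size as the relative risk \((\lambda)\) decreases?
- How would you use this table to determine sample size for 'protective' effects (i.e., nutritional components or medical procedures which prevent a negative outcome), as opposed to an increased risk?
- What is the minimal detectable relative risk if you had funds for 1000 subjects?
Answers:
- As \(\pi\) increases, n decreases.
- n is largest where \(\lambda\) is closest to 1, so n increases as a relative risk above 1 decreases toward 1.
- Protective effects would be those with \(\lambda \lt 1\), so use the rows of the table with \(\lambda\) below 1.
- With a background rate of 10/100 and 1000 subjects, a relative risk of about 1.65 could be detected (one-sided 5% significance, 90% power).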
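The last answer can be checked roughly by interpolating in the \(\pi=0.100\) column of table (a). The sketch below copies that column for \(\lambda>1\) and interpolates linearly between the two rows that bracket the budget. Linear interpolation and the names `N_BY_LAMBDA` and `min_detectable_rr` are assumptions made for illustration; the result is only as fine as the table's spacing.

```python
# pi = 0.100 column of table (a), rows with lambda > 1: {lambda: total n}
N_BY_LAMBDA = {
    1.10: 32272, 1.20: 8426, 1.30: 3904, 1.40: 2284, 1.50: 1518,
    1.60: 1094, 1.80: 658, 2.00: 448, 3.00: 146, 4.00: 78, 5.00: 52,
}


def min_detectable_rr(budget, table=N_BY_LAMBDA):
    """Interpolate the smallest lambda whose tabulated total n fits the budget."""
    rows = sorted(table.items())  # ascending lambda, so n is descending
    for (lam_lo, n_lo), (lam_hi, n_hi) in zip(rows, rows[1:]):
        if n_lo >= budget >= n_hi:
            frac = (n_lo - budget) / (n_lo - n_hi)
            return lam_lo + frac * (lam_hi - lam_lo)
    raise ValueError("budget is outside the tabulated range")


print(round(min_detectable_rr(1000), 2))
```

With 1000 subjects the budget falls between the rows for \(\lambda=1.60\) (1 094) and \(\lambda=1.80\) (658), which is consistent with the answer of about 1.65.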