Example 9-3
Suppose your study design is an unmatched case-control study with equal numbers of cases and controls.
If 30% of the population is exposed to a risk factor, how many study subjects are needed to detect a hypothesized odds ratio of 2.0? Assume 90% power and \(\alpha=0.05\).
Here are the hypotheses being tested:
Null hypothesis
\(H_0\colon \text{incidence}_{1}^* \le \text{incidence}_{2}^*\)
Alternative hypothesis
\(H_A\colon \text{incidence}_{1}^* / \text{incidence}_{2}^*=\lambda^*\)
where:
\(\lambda^*\gt0\)
\(\text{incidence}_1^*=p(\text{Exposed|Case})\)
\(\text{incidence}_2^*=p(\text{Exposed|Control})\)
The resulting sample size formula is:
\(n=\dfrac{(r+1)(1+(\lambda -1)P)^{2}}{rP^{2}(P-1)^{2}(\lambda -1)^{2}}\left [ z_{\alpha}\sqrt{(r+1)p_{c}^{*}(1-p_{c}^{*})} + z_{\beta}\sqrt{\frac{\lambda P(1-P)}{\left [ 1+(\lambda-1)P \right ]^{2}}+rP(1-P)} \right ]^{2}\)
where:
\(p_{c}^{*}=\dfrac{P}{r+1}\left ( \dfrac{r\lambda}{1+(\lambda -1)P}+1 \right )\)
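As a rough check, the formula can be evaluated numerically for this example. The Python sketch below is illustrative only: the function name `total_sample_size` and its parameters are not from the source, and it assumes that the denominator of the leading fraction is \(rP^{2}(P-1)^{2}(\lambda-1)^{2}\), that \(r\) is the ratio of cases to controls, and that \(z_{\alpha}\) is one-sided by default (the convention used by Table B.10).

```python
from math import ceil, sqrt
from statistics import NormalDist


def total_sample_size(P, lam, r=1.0, alpha=0.05, power=0.90, one_sided=True):
    """Total n (cases + controls) for an unmatched case-control study.

    P    : prevalence of exposure in the population
    lam  : hypothesized odds ratio (lambda)
    r    : ratio of cases to controls (1.0 means equal group sizes)
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha) if one_sided else z(1 - alpha / 2)
    z_beta = z(power)

    denom = 1 + (lam - 1) * P                   # 1 + (lambda - 1) P
    p_c = P / (r + 1) * (r * lam / denom + 1)   # pooled exposure proportion p_c*

    # Leading fraction; assumed denominator r P^2 (P - 1)^2 (lambda - 1)^2.
    lead = (r + 1) * denom**2 / (r * P**2 * (P - 1) ** 2 * (lam - 1) ** 2)
    null_term = z_alpha * sqrt((r + 1) * p_c * (1 - p_c))
    alt_term = z_beta * sqrt(lam * P * (1 - P) / denom**2 + r * P * (1 - P))
    return lead * (null_term + alt_term) ** 2


n = total_sample_size(P=0.30, lam=2.0)
print(ceil(n))  # 306
```

Rounded up, this agrees with the 306 in the \(\lambda=2.00\), \(P=0.300\) cell of Table B.10(a). Passing `one_sided=False` uses \(z_{\alpha/2}\) instead and gives about 376 for the same inputs.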
Table B.10. Total sample size requirements (for the two groups combined) for unmatched case-control studies with equal numbers of cases and controls
These tables give requirements for a one-sided test directly. For two-sided tests, use the table corresponding to half the required significance level. Note that \(P\) is the prevalence of the risk factor in the entire population and \(\lambda\) is the appropriate relative risk to be tested.
(a) 5% significance, 90% power. Columns are values of \(P\); rows are values of \(\lambda\).

\(\lambda\) | 0.010 | 0.050 | 0.100 | 0.200 | 0.300 | 0.400 | 0.500 | 0.700 | 0.900 |
---|---|---|---|---|---|---|---|---|---|
0.10 | 2 318 | 456 | 224 | 108 | 70 | 50 | 40 | 30 | 38 |
0.20 | 3 206 | 638 | 316 | 158 | 104 | 80 | 66 | 56 | 88 |
0.30 | 4 546 | 912 | 458 | 232 | 160 | 124 | 106 | 98 | 176 |
0.40 | 6 676 | 1 348 | 684 | 356 | 248 | 200 | 176 | 172 | 330 |
0.50 | 10 318 | 2 098 | 1 074 | 566 | 404 | 332 | 296 | 306 | 616 |
0.60 | 17 220 | 3 522 | 1 816 | 974 | 706 | 588 | 536 | 576 | 1 206 |
0.70 | 32 570 | 6 698 | 3 476 | 1 890 | 1 390 | 1 174 | 1 088 | 1 206 | 2 612 |
0.80 | 77 686 | 16 052 | 8 382 | 4 614 | 3 438 | 2 944 | 2 764 | 3 146 | 7 012 |
0.90 | 328 374 | 68 156 | 35 786 | 19 922 | 15 020 | 13 006 | 12 354 | 14 400 | 32 892 |
1.10 | 363 666 | 76 090 | 40 352 | 22 918 | 17 630 | 15 574 | 15 096 | 18 316 | 43 550 |
1.20 | 95 332 | 20 020 | 10 664 | 6 112 | 4 744 | 4 228 | 4 134 | 5 102 | 12 340 |
1.30 | 44 334 | 9 342 | 4 998 | 2 888 | 2 260 | 2 032 | 2 002 | 2 510 | 6 166 |
1.40 | 26 044 | 5 506 | 2 958 | 1 722 | 1 358 | 1 230 | 1 222 | 1 554 | 3 870 |
1.50 | 17 376 | 3 684 | 1 986 | 1 166 | 926 | 846 | 846 | 1 090 | 2 748 |
1.60 | 12 558 | 2 672 | 1 446 | 854 | 684 | 628 | 632 | 826 | 2 106 |
1.80 | 7 618 | 1 630 | 888 | 532 | 432 | 400 | 408 | 546 | 1 420 |
2.00 | 5 230 | 1 124 | 616 | 374 | 306 | 288 | 296 | 404 | 1 074 |
3.00 | 1 754 | 386 | 218 | 138 | 120 | 118 | 126 | 184 | 522 |
4.00 | 978 | 220 | 126 | 84 | 74 | 76 | 84 | 130 | 380 |
5.00 | 664 | 150 | 88 | 60 | 56 | 58 | 66 | 104 | 316 |
10.00 | 244 | 60 | 38 | 30 | 30 | 34 | 40 | 70 | 224 |
20.00 | 108 | 30 | 20 | 18 | 20 | 24 | 30 | 56 | 190 |
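Try it!
Using the table, how does the required sample size change as:
- Prevalence of the risk factor (\(P\)) increases?
- Odds ratio (\(\lambda\)) decreases?
Answers:
- For many values of \(\lambda\), \(P=0.5\) has the smallest sample size requirement; requirements are much larger when exposure is very rare or very common.
- The largest sample sizes are required when the odds ratio is closest to 1; \(\lambda=1.1\) requires a greater \(n\) than \(\lambda=0.9\).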
We have considered three typical epidemiologic research designs. You might also ask these questions:
Should the number of controls match the number of cases? Should multiple controls be used for each case?
Observe the power curve below:
Power increases, but at a decreasing rate, as the ratio of controls to cases increases. Little additional power is gained at ratios higher than four controls per case, so there is little benefit to enrolling controls beyond that ratio.
from Woodward, M. Epidemiology Study Design and Analysis. Boca Raton: Chapman and Hall, 1999, p.265
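The diminishing returns can be illustrated numerically. The sketch below is not Woodward's calculation: it uses a standard one-sided normal approximation for comparing two proportions, takes the exposure prevalence among controls to be \(P\), and uses hypothetical inputs (100 cases, \(P=0.30\), \(\lambda=2.0\)). The function name `approx_power` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_cases, n_controls, P=0.30, lam=2.0, alpha=0.05):
    """One-sided normal-approximation power (assumes lam > 1)."""
    nd = NormalDist()
    p_case = lam * P / (1 + (lam - 1) * P)  # exposure prevalence among cases
    p_ctrl = P                              # controls approximated by the population
    # Pooled proportion and standard error under the null hypothesis.
    p_bar = (n_cases * p_case + n_controls * p_ctrl) / (n_cases + n_controls)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n_cases + 1 / n_controls))
    # Standard error under the alternative hypothesis.
    se_alt = sqrt(p_case * (1 - p_case) / n_cases
                  + p_ctrl * (1 - p_ctrl) / n_controls)
    z_alpha = nd.inv_cdf(1 - alpha)
    return nd.cdf((p_case - p_ctrl - z_alpha * se_null) / se_alt)


n_cases = 100  # hypothetical fixed number of available cases
ratios = range(1, 7)
powers = [approx_power(n_cases, k * n_cases) for k in ratios]
for k, pw in zip(ratios, powers):
    print(f"{k} control(s) per case: power = {pw:.3f}")
```

Under these assumed inputs, each additional control per case adds less power than the one before it, and the gain from a fourth to a fifth control per case is small.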
Under what circumstances would it be recommended to enroll a large number of controls compared to cases?
Perhaps the small gain in power is worthwhile if the cost of a Type II error is large and the expense of obtaining controls is minimal, such as when controls with covariate information can be selected from a computerized database. If you must physically locate and recruit the controls, set up clinic appointments, run diagnostic tests, and enter data, the effort of pursuing a large number of controls quickly offsets any gain. In that situation, a one-to-one or two-to-one ratio is more practical. The bottom line is that there is little additional power beyond a four-to-one ratio.
What if there is a Limited Number of Total Subjects for Case-Control Studies?
Sometimes the total number of subjects is limited (e.g., you have limited funds and the cost associated with each case is equal to the cost associated with a control). This graph illustrates power as related to the ratio of the controls to cases.
from Woodward, M. Epidemiology Study Design and Analysis. Boca Raton: Chapman and Hall, 1999, p.358
Power is greatest with a one-to-one ratio of controls to cases. If the total number of people who can be enrolled is limited, enroll cases and controls in equal numbers.
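A rough way to see why, as a sketch rather than a derivation from the source: assume the variance of the estimated difference in exposure proportions is approximately proportional to \(1/n_1+1/n_2\), where \(n_1\) is the number of cases, \(n_2=kn_1\) is the number of controls, and the total \(N=n_1+n_2\) is fixed. The symbols \(n_1\), \(n_2\), \(k\), and \(N\) are introduced here for illustration.

\[
n_1=\frac{N}{1+k},\qquad n_2=\frac{kN}{1+k},\qquad \frac{1}{n_1}+\frac{1}{n_2}=\frac{(1+k)^{2}}{kN}=\frac{1}{N}\left(k+2+\frac{1}{k}\right)
\]

The quantity \(k+2+1/k\) is smallest at \(k=1\), where it equals 4. At \(k=2\) it is 4.5 and at \(k=4\) it is 6.25, so for a fixed total, moving away from equal group sizes inflates the variance and lowers power.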
What about Matched Case-Control Studies?
In matched case-control study designs, useful data come only from the discordant pairs of subjects; the concordant pairs contribute no information. Matching of cases and controls on a confounding factor (e.g., age, sex) may increase the efficiency of a case-control study, especially when the moderator's minimal number of controls are rejected.
The sample size for matched study designs may be greater or less than the sample size required for similar unmatched designs because only the pairs discordant on exposure are included in the analysis. The proportion of discordant pairs must be estimated to derive sample size and power. The power of matched case/control study design for a given sample size may be larger or smaller than the power for an unmatched design.
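To illustrate why only discordant pairs matter, the sketch below uses the standard analysis for 1:1 matched pairs, which is not spelled out in the text above: the matched-pair odds ratio is the ratio of the two discordant counts, and McNemar's chi-square statistic depends on those counts alone. The function name and all pair counts are hypothetical.

```python
def matched_pair_summary(both_exposed, case_only, control_only, neither):
    """Odds ratio and McNemar chi-square for 1:1 matched pairs.

    Only the discordant counts (case_only, control_only) enter either
    quantity; the concordant counts are accepted but never used.
    """
    odds_ratio = case_only / control_only
    chi_square = (case_only - control_only) ** 2 / (case_only + control_only)
    return odds_ratio, chi_square


# Hypothetical counts of pairs, for illustration only.
few_concordant = matched_pair_summary(
    both_exposed=20, case_only=30, control_only=15, neither=35)
many_concordant = matched_pair_summary(
    both_exposed=200, case_only=30, control_only=15, neither=5)
print(few_concordant)   # (2.0, 5.0)
print(many_concordant)  # (2.0, 5.0)
```

Changing the number of concordant pairs leaves both results unchanged, which is why the proportion of discordant pairs has to be estimated before the sample size or power of a matched design can be worked out.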
Formula for sample size calculation for matched case-control study:
\(n=\dfrac{(r+1)(1+(\lambda -1)P)^{2}}{rP^{2}(P-1)^{2}(\lambda -1)^{2}}\left [ z_{\alpha}\sqrt{(r+1)p_{c}^{*}(1-p_{c}^{*})} + z_{\beta}\sqrt{\frac{\lambda P(1-P)}{\left [ 1+(\lambda-1)P \right ]^{2}}+rP(1-P)} \right ]^{2}\)
Where:
\(p_{c}^{*}=\dfrac{P}{r+1}\left ( \dfrac{r\lambda}{1+(\lambda -1)P}+1 \right )\)
\(P\) = prevalence of exposure in the population
\(\lambda\) = estimated relative risk
\(r\) = ratio of cases to controls