Example
The baseline prevalence of smoking in a particular community is 30%. A clean indoor air policy goes into effect. What is the sample size required to detect a decrease in smoking prevalence of at least 2 percentage points, with an alpha of 0.05 and a power of 90%?
Formula
We are interested in testing the following hypotheses:
\(\begin{array}{l}
\mathrm{H}_{0}\colon \pi=\pi_{0} \\
\mathrm{H}_{1}\colon \pi=\pi_{1}=\pi_{0}+d
\end{array}\)
Here \(\pi\) is the true proportion, \(\pi_0\) is the specified value of the proportion we wish to test (30% in our example), and \(\pi_1\) is the alternative value, which differs from \(\pi_0\) by an amount \(d\) (\(d\) = 2% in our example).
The formula needed to calculate the sample size is:
\(\displaystyle{n=\frac{1}{d^{2}}\left[z_{\alpha} \sqrt{\pi_{0}\left(1-\pi_{0}\right)}+z_{\beta} \sqrt{\pi_{1}\left(1-\pi_{1}\right)}\right]^{2}}\)
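Where
- \(\pi_0\) = null hypothesized proportion
- \(\pi_1\) = alternative proportion, \(\pi_0 + d\)
- \(d\) = change in proportion to be detected
- \(z_{\alpha}\) = standard normal value that cuts off an upper tail of probability \(\alpha\) (the significance level)
- \(z_{\beta}\) = standard normal value that cuts off an upper tail of probability \(\beta\), where the power is \(1-\beta\)
Note that we replace \(z_{\alpha}\) by \(z_{\alpha / 2}\) for a two-sided test.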
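As a rough illustration, the formula can be evaluated directly for the smoking example. The sketch below is not from the source: the function name `sample_size_one_proportion` and its parameters are hypothetical, the z values come from Python's standard-library `statistics.NormalDist`, and a decrease is represented by a negative `d` (an assumption, since the formula is stated with \(\pi_1 = \pi_0 + d\)).

```python
from math import sqrt
from statistics import NormalDist


def sample_size_one_proportion(pi0, d, alpha=0.05, power=0.90, two_sided=False):
    """Evaluate n = (1/d^2) * [z_alpha*sqrt(pi0(1-pi0)) + z_beta*sqrt(pi1(1-pi1))]^2.

    pi0   : null hypothesized proportion
    d     : difference to detect (pi1 = pi0 + d); negative for a decrease (assumption)
    alpha : significance level; alpha/2 is used when two_sided is True
    power : 1 - beta
    """
    pi1 = pi0 + d
    if d == 0 or not (0 < pi0 < 1) or not (0 < pi1 < 1):
        raise ValueError("pi0 and pi0 + d must lie strictly between 0 and 1, and d must be nonzero")

    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)  # upper-tail critical value
    z_beta = NormalDist().inv_cdf(power)      # value with upper tail beta = 1 - power

    bracket = z_alpha * sqrt(pi0 * (1 - pi0)) + z_beta * sqrt(pi1 * (1 - pi1))
    return bracket ** 2 / d ** 2


# Baseline prevalence 30%, 2 percentage points, one-sided alpha = 0.05, power = 90%
n_increase = sample_size_one_proportion(0.30, 0.02)   # pi1 = 0.32
n_decrease = sample_size_one_proportion(0.30, -0.02)  # pi1 = 0.28
print(round(n_increase), round(n_decrease))
```

With these inputs the result is roughly 4,567 subjects for \(\pi_1 = 0.32\) and roughly 4,417 for \(\pi_1 = 0.28\). The two differ because the variance term \(\pi_1(1-\pi_1)\) depends on the direction of the change. Whether to round to the nearest integer or round up is a convention the formula itself does not fix.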
The z terms can be found from a standard normal distribution table, and common values are shown below:

Significance level | One-sided \(z_{\alpha}\) | Two-sided \(z_{\alpha/2}\) | Power | \(z_{\beta}\) |
---|---|---|---|---|
5% | 1.6449 | 1.9600 | 90% | 1.2816 |
1% | 2.3263 | 2.5758 | 95% | 1.6449 |
0.1% | 3.0902 | 3.2905 | | |

(Chapter 8.5, p 305, Woodward book)
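If a normal table is not at hand, the same z values can be recovered from the inverse standard normal CDF. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_upper` and the dictionary names are made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def z_upper(tail_prob):
    """Return the z value that leaves `tail_prob` in the upper tail."""
    return std_normal.inv_cdf(1 - tail_prob)


levels = (0.05, 0.01, 0.001)

# One-sided tests put all of alpha in one tail; two-sided tests split it in half.
one_sided = {a: round(z_upper(a), 4) for a in levels}
two_sided = {a: round(z_upper(a / 2), 4) for a in levels}

# For power 1 - beta, z_beta is the quantile at probability equal to the power.
z_for_power = {p: round(std_normal.inv_cdf(p), 4) for p in (0.90, 0.95)}

print(one_sided)
print(two_sided)
print(z_for_power)
```

Note that the 5% one-sided value and the 95% power value coincide, since both leave 5% in the upper tail.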
The table below can also be used to estimate the necessary sample size:
These tables give requirements for a one-sided test directly. For two-sided tests, use the table corresponding to half the required significance level. Note that \(\pi_{0}\) is the hypothesized proportion (under \(H_{0}\)) and \(d\) is the difference to be tested.
(a) 5% significance, 90% power. Columns give \(\pi_{0}\); rows give \(d\).

\(d\) | 0.01 | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | 0.95 |
---|---|---|---|---|---|---|---|---|---|---|---|
0.01 | 1 178 | 8 001 | 13 923 | 18 130 | 20 625 | 21 406 | 20 475 | 17 830 | 13 473 | 7 400 | 3 717 |
0.02 | 366 | 2 070 | 3 534 | 4 567 | 5 172 | 5 349 | 5 097 | 4 417 | 3 308 | 1 769 | 833 |
0.03 | 192 | 950 | 1 593 | 2 045 | 2 305 | 2 376 | 2 255 | 1 944 | 1 443 | 748 | 322 |
0.04 | 123 | 551 | 908 | 1 158 | 1 300 | 1 335 | 1 262 | 1 083 | 795 | 398 | 148 |
0.05 | 88 | 362 | 589 | 746 | 834 | 853 | 804 | 686 | 498 | 239 | |
0.06 | 67 | 258 | 414 | 521 | 580 | 591 | 555 | 471 | 338 | 155 | |
0.07 | 54 | 194 | 308 | 385 | 427 | 434 | 405 | 342 | 242 | 104 | |
0.08 | 44 | 152 | 238 | 296 | 327 | 331 | 308 | 258 | 181 | 71 | |
0.09 | 38 | 123 | 190 | 235 | 259 | 261 | 242 | 201 | 139 | 48 | |
0.10 | 32 | 102 | 156 | 191 | 210 | 211 | 195 | 161 | 109 | ||
0.15 | 18 | 49 | 72 | 87 | 93 | 92 | 83 | 66 | 40 | ||
0.20 | 12 | 30 | 42 | 49 | 52 | 50 | 44 | 33 | |||
0.25 | 9 | 20 | 27 | 31 | 33 | 31 | 26 | 18 | |||
0.30 | 7 | 14 | 19 | 22 | 22 | 20 | 16 | ||||
0.35 | 5 | 11 | 14 | 16 | 16 | 14 | 10 | ||||
0.40 | 4 | 9 | 11 | 12 | 11 | 10 | |||||
0.45 | 4 | 7 | 8 | 9 | 8 | 6 | |||||
0.50 | 3 | 6 | 7 | 7 | 6 |
(Tables from Woodward, M. Epidemiology Study Design and Analysis. Boca Raton: Chapman and Hall, 2013)
Stop and Think!
Looking at the table values, consider the following questions:
- What happens to the necessary sample size as the baseline prevalence (\(\pi_0\)) increases? Does it increase or decrease?
- What happens to the sample size as the effect size decreases?
- What is the minimal detectable difference if you had funds for 1,500 subjects?
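- The sample size grows as the baseline prevalence approaches 0.5, where it is at or near its largest, and shrinks again beyond that.
- The smaller the effect size, the larger the required sample size.
- About a 3.6 percentage point decrease in prevalence (interpolating between the \(d\) = 0.03 and \(d\) = 0.04 rows for \(\pi_0\) = 0.30).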
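The last question can also be approached numerically by inverting the sample size formula rather than reading the table. The sketch below is an illustration only: `required_n` and `minimal_detectable_difference` are hypothetical names, the search assumes an increase (\(\pi_1 = \pi_0 + d\)) as the table does, and it assumes the required \(n\) falls steadily as \(d\) grows over the searched range, which is what makes bisection applicable.

```python
from math import sqrt
from statistics import NormalDist


def required_n(pi0, d, alpha=0.05, power=0.90):
    """One-sided sample size from the formula above, with pi1 = pi0 + d."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    pi1 = pi0 + d
    bracket = z_alpha * sqrt(pi0 * (1 - pi0)) + z_beta * sqrt(pi1 * (1 - pi1))
    return bracket ** 2 / d ** 2


def minimal_detectable_difference(pi0, n_available, alpha=0.05, power=0.90, tol=1e-7):
    """Bisect for the smallest d whose required n does not exceed n_available."""
    lo, hi = 1e-6, 1 - pi0 - 1e-6  # keep pi1 strictly inside (0, 1)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if required_n(pi0, mid, alpha, power) > n_available:
            lo = mid  # difference too small to detect with this budget
        else:
            hi = mid  # detectable; try a smaller difference
    return hi


d_min = minimal_detectable_difference(0.30, 1500)
print(f"{d_min:.4f}")
```

For \(\pi_0\) = 0.30 and 1,500 subjects this gives a difference of roughly 0.035. A value interpolated linearly between table rows can differ slightly from this, because \(n\) falls roughly as \(1/d^2\) rather than linearly in \(d\).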