9.3 - Example 9-1: Population-based cohort or cross-sectional studies

Example 9-1

Suppose you are interested in the question: "Does one group have a prevalence that differs from that of other groups?" For example:

Baseline prevalence of smoking in a particular community is 30%. A clean indoor air policy goes into effect. What is the sample size required to detect a decrease in smoking prevalence of at least 2 percentage points? \(\alpha=0.05\); 90% power.

We are interested in testing the following hypothesis:

Null hypothesis:

\(H_0\colon \text{prevalence}_{(Before)}\le \text{prevalence}_{(After)}\)

Alternative hypothesis:

\(H_A\colon \text{prevalence}_{(Before)}- \text{prevalence}_{(After)}=\delta\)

Where \(\delta \gt 0\)

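The resulting formula for the sample size for testing a difference in prevalence using a one-sided test is as follows:

\(n=\dfrac{1}{d^{2}}\left [ z_{\alpha }\sqrt{\pi_{0}(1-\pi_{0})}+z_{\beta }\sqrt{\pi_{1}(1-\pi_{1})} \right ]^{2}\)

Here \(\pi_{0}\) is the prevalence under \(H_0\), \(\pi_{1}\) is the prevalence under \(H_A\), \(d\) is the difference between them (the \(\delta\) above), and \(z_{\alpha }\) and \(z_{\beta }\) are the upper-tail standard normal critical values for the significance level \(\alpha\) and for \(\beta = 1 - \text{power}\). Replace \(z_{\alpha }\) by \(z_{\alpha/2 }\) for a two-sided test.

For this example, n can be calculated by substituting \(\pi_{0}=0.30\), \(d=0.02\), \(z_{\alpha }=z_{0.05}=1.645\), and \(z_{\beta }=z_{0.10}=1.282\).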
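To check the formula numerically, here is a minimal Python sketch. The function name `sample_size_one_proportion`, the use of `statistics.NormalDist` for the critical values, and rounding up to the next whole subject are our assumptions, not part of Woodward's text. For the smoking example, a decrease to \(\pi_{1}=0.28\) gives about 4,417 subjects. Taking \(\pi_{1}=\pi_{0}+d=0.32\) instead gives 4,567, which is the value tabulated in Table B.8 for \(\pi_{0}=0.30\) and \(d=0.02\), so the table appears to assume an increase from \(\pi_{0}\).

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_one_proportion(pi0, pi1, alpha=0.05, power=0.90, two_sided=False):
    """Sample size for testing a single proportion, using the formula above.

    pi0 is the proportion under H0 and pi1 the proportion under HA.
    Rounding up to a whole subject is an assumption of this sketch.
    """
    tail = alpha / 2 if two_sided else alpha      # z_{alpha/2} for a two-sided test
    z_alpha = NormalDist().inv_cdf(1 - tail)      # upper-tail critical value for alpha
    z_beta = NormalDist().inv_cdf(power)          # upper-tail critical value for beta = 1 - power
    d = abs(pi1 - pi0)                            # difference to be detected
    n = (z_alpha * sqrt(pi0 * (1 - pi0)) + z_beta * sqrt(pi1 * (1 - pi1))) ** 2 / d ** 2
    return ceil(n)


# Smoking example: baseline 30%, alpha = 0.05 (one-sided), 90% power.
n_decrease = sample_size_one_proportion(0.30, 0.28)  # prevalence falls by 2 points
n_increase = sample_size_one_proportion(0.30, 0.32)  # prevalence rises by 2 points
print(n_decrease, n_increase)
```

The two results differ because the second square-root term depends on \(\pi_{1}\), which lies closer to 0.5 (and so has a larger variance) when the change is an increase from 0.30.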

Take a moment to look at the table below for sample size requirements for testing the value of a single proportion with a one-sided test. Prevalence can be found along the top of the table and the percentage point difference vertically on the left. How many individuals do we need to include in our study in order to meet the above criteria?

(Tables from Woodward, M. Epidemiology: Study Design and Data Analysis. Boca Raton: Chapman and Hall, 1999.)

 

Table B.8. Sample size requirements for testing the value of a single proportion

These tables give requirements for a one-sided test directly. For two-sided tests, use the table corresponding to half the required significance level. Note that \(\pi_{0}\) is the hypothesized proportion (under \(H_{0}\)) and \(d\) is the difference to be tested.

(a) 5% significance, 90% power

\(\pi_{0}\) (columns)

\(d\)     0.01    0.10    0.20    0.30    0.40    0.50    0.60    0.70    0.80    0.90    0.95
0.01     1,178   8,001  13,923  18,130  20,625  21,406  20,475  17,830  13,473   7,400   3,717
0.02       366   2,070   3,534   4,567   5,172   5,349   5,097   4,417   3,308   1,769     833
0.03       192     950   1,593   2,045   2,305   2,376   2,255   1,944   1,443     748     322
0.04       123     551     908   1,158   1,300   1,335   1,262   1,083     795     398     148
0.05        88     362     589     746     834     853     804     686     498     239
0.06        67     258     414     521     580     591     555     471     338     155
0.07        54     194     308     385     427     434     405     342     242     104
0.08        44     152     238     296     327     331     308     258     181      71
0.09        38     123     190     235     259     261     242     201     139      48
0.10        32     102     156     191     210     211     195     161     109
0.15        18      49      72      87      93      92      83      66      40
0.20        12      30      42      49      52      50      44      33
0.25         9      20      27      31      33      31      26      18
0.30         7      14      19      22      22      20      16
0.35         5      11      14      16      16      14      10
0.40         4       9      11      12      11      10
0.45         4       7       8       9       8       6
0.50         3       6       7       7       6

 


Try It!

Looking at the table values:
  1. What happens to the necessary sample size as prevalence (\(\pi_{0}\)) increases? Does the sample size increase or decrease?
  2. What happens to the sample size as effect size decreases?
  3. What is the minimal detectable difference if you had funds for 1,500 subjects?

Answers:
  1. The sample size increases as prevalence rises toward 0.5 and decreases beyond it; the largest sample sizes occur with baseline prevalence at or near 0.5.
  2. The smaller the effect size, the larger the sample size.
  3. About a 3.6 percentage point decrease in prevalence (interpolating between \(d=0.03\) and \(d=0.04\) in the \(\pi_{0}=0.30\) column).
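The third question can also be approached by inverting the sample size formula rather than reading the table. The sketch below is one way to do that, assuming the required sample size falls steadily as the difference grows, so that a simple bisection search applies. The function names `required_n` and `minimal_detectable_difference` and the `direction` argument are hypothetical, chosen for illustration. With \(\pi_{0}=0.30\) and 1,500 subjects it gives roughly 3.5 percentage points for an increase and 3.4 for a decrease, slightly below the 3.6 obtained by straight-line interpolation in the table, because the sample size is not linear in \(d\).

```python
from math import sqrt
from statistics import NormalDist


def required_n(pi0, d, alpha=0.05, power=0.90):
    """Unrounded one-sided sample size for detecting a shift of d from pi0.

    A negative d represents a decrease in prevalence.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    pi1 = pi0 + d
    return (z_alpha * sqrt(pi0 * (1 - pi0)) + z_beta * sqrt(pi1 * (1 - pi1))) ** 2 / d ** 2


def minimal_detectable_difference(pi0, n_available, direction=+1, tol=1e-7):
    """Bisection search for the smallest |d| whose required n fits the budget.

    Assumes required_n decreases as |d| grows (an assumption of this sketch).
    direction=+1 searches increases from pi0, direction=-1 searches decreases.
    """
    lo = 1e-6                                              # tiny shift: needs a huge sample
    hi = (1 - pi0 if direction > 0 else pi0) - 1e-6        # keep pi1 strictly inside (0, 1)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if required_n(pi0, direction * mid) > n_available:
            lo = mid                                       # shift too small for this budget
        else:
            hi = mid                                       # shift is detectable, try smaller
    return hi


d_up = minimal_detectable_difference(0.30, 1500, direction=+1)
d_down = minimal_detectable_difference(0.30, 1500, direction=-1)
print(round(d_up, 4), round(d_down, 4))
```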