# 9.1.1 - Confidence Intervals

Given that \(np \ge 10\) and \(n(1-p) \ge 10\) for both groups, in other words at least 10 "successes" and at least 10 "failures" in each group, the sampling distribution of \(\widehat p_1 - \widehat p_2\) can be approximated using the normal distribution with a mean of \(p_1 - p_2\) and a standard error of \(\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}\).

Recall the general form of a confidence interval:

- General Form of a Confidence Interval
- \(sample\ statistic\pm(multiplier)\ (standard\ error)\)

Here, the sample statistic is the difference between the two sample proportions (\(\widehat p_1 - \widehat p_2\)) and the standard error is \(\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}\), estimated by substituting the sample proportions for the unknown population proportions. Putting this information together gives the formula for a confidence interval for the difference between two proportions.

- Confidence Interval for the Difference Between Two Proportions
- \((\widehat{p}_1-\widehat{p}_2) \pm z^\ast {\sqrt{\frac{\widehat{p}_1 (1-\widehat{p}_1)}{n_1}+\frac{\widehat{p}_2 (1-\widehat{p}_2)}{n_2}}}\)
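The formula above can be sketched as a small Python function. This is a minimal illustration, not part of the original lesson: the function name `two_proportion_ci` and its parameters are hypothetical, and the multiplier \(z^\ast\) is taken from the standard normal distribution in Python's standard library rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation CI for p1 - p2.

    x1, x2 are the counts of "successes"; n1, n2 are the sample sizes.
    (Hypothetical helper written for illustration.)
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Standard error estimated from the two sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se
```

With the default `confidence=0.95` the multiplier is about 1.96, so results will differ slightly from hand calculations that round the multiplier to 2.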

## Example: Confidence Interval for the Difference Between the Proportion of Women and Men in Favor of Legal Same Sex Marriage

A survey was given to a sample of college students. They were asked whether they think same sex marriage should be legal. Of the 251 women in the sample, 185 said "yes." Of the 199 men in the sample, 107 said "yes." Let's construct a 95% confidence interval for the difference between the proportions of women and men who responded "yes." We can apply the 95% Rule and use a multiplier of \(z^\ast = 2\).

- For women: \(np=185\) and \(n(1-p) = 251-185=66\)
- For men: \(np=107\) and \(n(1-p)=199-107=92\)

These counts are all at least 10 so the sampling distribution can be approximated using a normal distribution.
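The count check above can be written as a short Python sketch. The helper name `meets_normal_condition` is hypothetical; it only encodes the "at least 10 successes and at least 10 failures" rule stated in this lesson.

```python
def meets_normal_condition(successes, n, minimum=10):
    """Return True when a group has at least `minimum` successes and failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum


# Counts from the survey example: 185 of 251 women and 107 of 199 men said "yes".
women_ok = meets_normal_condition(185, 251)
men_ok = meets_normal_condition(107, 199)
print(women_ok and men_ok)
```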

\(\widehat{p}_{w}=\frac{185}{251}=0.737\)

\(\widehat{p}_{m}=\frac{107}{199}=0.538\)

\((0.737-0.538) \pm 2 {\sqrt{\frac{0.737 (1-0.737)}{251}+\frac{0.538 (1-0.538)}{199}}}\)

\(0.199 \pm 2 ( 0.045)\)

\(0.199 \pm 0.090=[0.109, 0.289]\)

We are 95% confident that in the population the difference between the proportion of women and men who are in favor of same sex marriage legalization is between 0.109 and 0.289.

This confidence interval does not contain 0. Therefore it is not likely that the difference between women and men is 0. We can conclude that there is a difference between the proportion of women and men in the population who would respond "yes" to this question.
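As a rough check on the hand calculation, the same steps can be run in Python with the survey counts and the multiplier of 2 from the 95% Rule. The variable names are illustrative only. Because the code keeps unrounded proportions, its endpoints may differ in the third decimal place from values computed with rounded intermediate numbers.

```python
from math import sqrt

# Survey counts from the example.
women_yes, women_n = 185, 251
men_yes, men_n = 107, 199
multiplier = 2  # 95% Rule approximation used in the example

p_w = women_yes / women_n
p_m = men_yes / men_n
diff = p_w - p_m
se = sqrt(p_w * (1 - p_w) / women_n + p_m * (1 - p_m) / men_n)

lower = diff - multiplier * se
upper = diff + multiplier * se
# If 0 lies outside the interval, a difference of 0 is not plausible.
contains_zero = lower <= 0 <= upper

print(f"difference = {diff:.3f}, standard error = {se:.3f}")
print(f"interval excludes 0: {not contains_zero}")
```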

# 9.1.1.1 - Minitab Express: Confidence Interval for 2 Proportions

Minitab Express can be used to construct a confidence interval for the difference between two proportions using the normal approximation method. Note that the confidence intervals given in the Minitab Express output assume that \(np \ge 10\) and \(n(1-p) \ge 10\) for both groups. If this assumption is not true, you should use bootstrapping methods in StatKey.

## Minitab Express – Constructing a Confidence Interval with Raw Data

Let's estimate the difference between the proportion of females who have tried weed and the proportion of males who have tried weed.

- Open Minitab Express file:
- On a **PC**: In the menu bar select **STATISTICS > Two Samples > Proportions**
- On a **Mac**: **Statistics > 2-Sample Inference > Proportions**
- Double click the variable *Try Weed* in the box on the left to insert the variable into the *Samples* box
- Double click the variable *Biological Sex* in the box on the left to insert the variable into the *Sample IDs* box
- Keep the default *Options*
- Click OK

This should result in the following output:

Event: Try Weed = Yes

\(p_1\): proportion where Try Weed = Yes and Biological Sex = Female

\(p_2\): proportion where Try Weed = Yes and Biological Sex = Male

Difference: \(p_1-p_2\)

Biological Sex | N | Event | Sample p |
---|---|---|---|
Female | 127 | 56 | 0.440945 |
Male | 99 | 62 | 0.626263 |

Difference | 95% CI for Difference |
---|---|
-0.185318 | (-0.313920, -0.056716) |

Null hypothesis | \(H_0\): \(p_1-p_2=0\) |
---|---|
Alternative hypothesis | \(H_1\): \(p_1-p_2\neq0\) |

Method | Z-Value | P-Value |
---|---|---|
Fisher's exact | | 0.0072 |
Normal approximation | -2.82 | 0.0047 |
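For readers who want to see what the raw-data procedure does, here is a minimal Python sketch. The rows are hypothetical: they are rebuilt from the counts in the output above (56 of 127 females and 62 of 99 males answered "Yes") and do not reproduce the actual data file. The code tallies events per group and then applies the normal-approximation formula, which should land close to the interval in the output.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist

# Hypothetical (Biological Sex, Try Weed) rows rebuilt from the summary counts;
# the real data file is not reproduced here.
rows = (
    [("Female", "Yes")] * 56 + [("Female", "No")] * (127 - 56)
    + [("Male", "Yes")] * 62 + [("Male", "No")] * (99 - 62)
)

# Tally the group sizes (N) and the number of events (Try Weed = Yes).
n = Counter(sex for sex, _ in rows)
events = Counter(sex for sex, tried in rows if tried == "Yes")

p1 = events["Female"] / n["Female"]
p2 = events["Male"] / n["Male"]
diff = p1 - p2
se = sqrt(p1 * (1 - p1) / n["Female"] + p2 * (1 - p2) / n["Male"])

z_star = NormalDist().inv_cdf(0.975)  # multiplier for 95% confidence
lower, upper = diff - z_star * se, diff + z_star * se
print(f"{diff:.6f} ({lower:.6f}, {upper:.6f})")
```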


## Minitab Express – Constructing a Confidence Interval with Summarized Data

Let's estimate the difference between the proportion of Penn State World Campus graduate students who have children and the proportion of Penn State University Park graduate students who have children. In our representative sample there were 120 World Campus graduate students; 92 had children. There were 160 University Park graduate students; 23 had children.

- Open Minitab Express without data
- On a **PC**: In the menu bar select **STATISTICS > Two Samples > Proportions**
- On a **Mac**: **Statistics > 2-Sample Inference > Proportions**
- Change *Both samples are in one column* to *Summarized data*
- For *Sample 1* next to *Number of events* enter *92* and next to *Number of trials* enter *120*
- For *Sample 2* next to *Number of events* enter *23* and next to *Number of trials* enter *160*
- Keep the default *Options*
- Click OK

This should result in the following output:

\(p_1\): proportion where Sample 1 = Event

\(p_2\): proportion where Sample 2 = Event

Difference: \(p_1-p_2\)

Sample | N | Event | Sample p |
---|---|---|---|
Sample 1 | 120 | 92 | 0.766667 |
Sample 2 | 160 | 23 | 0.143750 |

Difference | 95% CI for Difference |
---|---|
0.622917 | (0.529740, 0.716093) |

Null hypothesis | \(H_0\): \(p_1-p_2=0\) |
---|---|
Alternative hypothesis | \(H_1\): \(p_1-p_2\neq0\) |

Method | Z-Value | P-Value |
---|---|---|
Fisher's exact | | <0.0001 |
Normal approximation | 13.10 | <0.0001 |
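The summarized-data output can be approximated directly from the four numbers entered above. The sketch below is an illustration in Python, not a description of Minitab Express internals. The Z-Value reported for the normal approximation appears consistent with dividing the difference by the same standard error used for the interval; that is an inference from these numbers, and it is labeled as an assumption in the code.

```python
from math import sqrt
from statistics import NormalDist

# Summarized data: events and trials for each sample.
x1, n1 = 92, 120   # World Campus graduate students with children
x2, n2 = 23, 160   # University Park graduate students with children

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z_star = NormalDist().inv_cdf(0.975)  # multiplier for 95% confidence
lower, upper = diff - z_star * se, diff + z_star * se

# Assumption: the test statistic uses the same (unpooled) standard error.
z_value = diff / se
p_value = 2 * (1 - NormalDist().cdf(abs(z_value)))

print(f"difference = {diff:.6f}, 95% CI = ({lower:.6f}, {upper:.6f})")
print(f"z = {z_value:.2f}, p < 0.0001: {p_value < 0.0001}")
```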
