4.2 - Sampling Distribution of the Sample Proportion

Before we begin, let's review the terms and notation associated with proportions:

  • \(p\) is the population proportion. It is a fixed value.

  • \(n\) is the size of the random sample.

  • \(\hat{p}\) is the sample proportion. It varies based on the sample.

Let's look at some of the runners in Ellie's sample to illustrate how to find the sampling distribution for an example where the population is small.

The five runners are Alex (A), Betina (B), Carly (C), Debbie (D), and Edward (E). The table below shows each runner's name and favorite running shoe color.

| Name | Alex (A) | Betina (B) | Carly (C) | Debbie (D) | Edward (E) |
|---|---|---|---|---|---|
| Color | Green | Blue | Yellow | Purple | Blue |

We are interested in the proportion of runners who prefer blue shoes. From the table, 2 of the 5 runners prefer blue shoes, so \(p = 0.40\).

Similar to the runner's mileage example earlier in the lesson, let's say we didn't know the proportion of runners whose favorite shoe color is blue. We'll use resampling methods to estimate the proportion. Let's take repeated samples of size \(n=2\), drawn without replacement. Here are all the possible samples of size \(n=2\), the proportion of runners in each sample who like blue running shoes, and the probability of drawing each sample.

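| Sample | P(Blue) | Probability |
|---|---|---|
| AB | 1/2 | 1/10 |
| AC | 0 | 1/10 |
| AD | 0 | 1/10 |
| AE | 1/2 | 1/10 |
| BC | 1/2 | 1/10 |
| BD | 1/2 | 1/10 |
| BE | 1 | 1/10 |
| CD | 0 | 1/10 |
| CE | 1/2 | 1/10 |
| DE | 1/2 | 1/10 |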
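The enumeration of samples can also be reproduced programmatically. The following is a minimal sketch, not part of the original lesson: the names `shoes` and `sample_proportions` are illustrative choices, and the code assumes every sample of size \(n\) is equally likely, as in the table.

```python
from fractions import Fraction
from itertools import combinations

# Favorite shoe color for each runner, keyed by the initial used in the text.
shoes = {"A": "Green", "B": "Blue", "C": "Yellow", "D": "Purple", "E": "Blue"}


def sample_proportions(population, n, color="Blue"):
    """Map each size-n sample (drawn without replacement) to its sample proportion."""
    result = {}
    for sample in combinations(sorted(population), n):
        hits = sum(population[name] == color for name in sample)
        result["".join(sample)] = Fraction(hits, n)
    return result


props = sample_proportions(shoes, 2)
for label, p_hat in props.items():
    # Assumption: all samples are equally likely, so each has probability 1/len(props).
    print(label, p_hat, Fraction(1, len(props)))
```

Using `Fraction` keeps the proportions exact, so they can be compared directly with the fractions in the table.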

The probability mass function (PMF) is:

| P(Blue) | 0 | 1/2 | 1 |
|---|---|---|---|
| Probability | 3/10 | 6/10 | 1/10 |
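As a cross-check on these probabilities, one can count samples directly instead of listing them. This is a sketch of the counting argument, using the fact that 2 of the 5 runners prefer blue and 3 do not, and that there are \(\binom{5}{2}=10\) equally likely samples of size 2; here \(\hat{p}\) denotes the sample proportion of runners who prefer blue:

\[
P(\hat{p}=0)=\frac{\binom{3}{2}}{\binom{5}{2}}=\frac{3}{10},\qquad
P\left(\hat{p}=\tfrac{1}{2}\right)=\frac{\binom{2}{1}\binom{3}{1}}{\binom{5}{2}}=\frac{6}{10},\qquad
P(\hat{p}=1)=\frac{\binom{2}{2}}{\binom{5}{2}}=\frac{1}{10}
\]

The three probabilities sum to 1, as they must for a PMF.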

The graph of the PMF:

Figure: Sampling Distribution of P(Blue). Bar graph showing three bars: 0 with a height of 0.3, 1/2 with a height of 0.6, and 1 with a height of 0.1.

The true proportion is \(p=P(Blue)=\frac{2}{5}\). When the sample size is \(n=2\), the only possible sample proportions are 0, 1/2, and 1, so you can see from the PMF that it is not possible to get a sample proportion equal to the true proportion.

Although not presented in detail here, we could find the sampling distribution for a larger sample size, say \(n=4\). The PMF for \(n=4\) is:

| P(Blue) | 1/4 | 1/2 |
|---|---|---|
| Probability | 2/5 | 3/5 |
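Since the \(n=4\) case is not worked out in detail, here is one way it could be verified. This is a minimal sketch under the same assumption that all samples of a given size are equally likely; the function name `sampling_pmf` and the variable names are hypothetical, chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Favorite shoe colors for runners A through E, in order.
colors = ["Green", "Blue", "Yellow", "Purple", "Blue"]


def sampling_pmf(values, n, target="Blue"):
    """Return {sample proportion: probability} over all equally likely size-n samples."""
    # combinations() picks by position, so the two "Blue" runners stay distinct.
    samples = list(combinations(values, n))
    counts = Counter(Fraction(s.count(target), n) for s in samples)
    return {p_hat: Fraction(c, len(samples)) for p_hat, c in sorted(counts.items())}


pmf_n2 = sampling_pmf(colors, 2)
pmf_n4 = sampling_pmf(colors, 4)
print(pmf_n2)
print(pmf_n4)
```

Note that `Fraction` reduces to lowest terms, so the probability 6/10 from the \(n=2\) table appears as 3/5.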

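As with the sampling distribution of the sample mean, the sampling distribution of the sample proportion will have sampling error. It is also the case that the larger the sample size, the smaller the spread of the distribution. In this example, the sample proportion ranges from 0 to 1 when \(n=2\), but takes only the values 1/4 and 1/2 when \(n=4\).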
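To make the comparison of spread concrete, the mean and standard deviation of each sampling distribution can be computed from the two PMFs. This is a worked sketch, with \(\hat{p}\) denoting the sample proportion of runners who prefer blue.

For \(n=2\):

\[
E(\hat{p}) = 0\cdot\tfrac{3}{10} + \tfrac{1}{2}\cdot\tfrac{6}{10} + 1\cdot\tfrac{1}{10} = 0.4
\]

\[
\mathrm{Var}(\hat{p}) = (0-0.4)^2(0.3) + (0.5-0.4)^2(0.6) + (1-0.4)^2(0.1) = 0.09,
\qquad \mathrm{SD}(\hat{p}) = 0.3
\]

For \(n=4\):

\[
E(\hat{p}) = \tfrac{1}{4}\cdot\tfrac{2}{5} + \tfrac{1}{2}\cdot\tfrac{3}{5} = 0.4
\]

\[
\mathrm{Var}(\hat{p}) = (0.25-0.4)^2(0.4) + (0.5-0.4)^2(0.6) = 0.015,
\qquad \mathrm{SD}(\hat{p}) \approx 0.122
\]

In both cases the mean of the sampling distribution equals the true proportion \(p=0.4\), while the standard deviation drops from 0.3 to about 0.122 as the sample size increases from 2 to 4.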