Lesson 4: Sampling Distributions

Overview

In inferential statistics, we want to use a characteristic of the sample (i.e., a statistic) to estimate the corresponding characteristic of the population (i.e., a parameter).

In Lesson 3, we learned how to define events as random variables. By doing so, we can describe events mathematically using probability functions, means, and standard deviations. All of this is important because it moves us toward our goal of making inferences about the population based on the sample. But we need more.

If we obtain a random sample and calculate a sample statistic from that sample, the sample statistic is a random variable (wow!), because its value changes from one random sample to the next. The population parameters, however, are fixed. If the statistic is a random variable, can we find its distribution? Its mean? Its standard deviation?

The answer is yes! This is why we need to study the sampling distribution of statistics. So what is a sampling distribution?

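Sampling Distribution: The sampling distribution of a statistic is the probability distribution of that statistic over all possible random samples of size \(n\) from a given population. In practice, we can approximate it by drawing a large number of samples of size \(n\) and recording the statistic from each one.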

Consider this example. A large tank of fish from a hatchery is being delivered to the lake. We want to know the average length of the fish in the tank. Instead of measuring all of the fish, we randomly sample twenty fish and use the sample mean to estimate the population mean.

Denote the sample mean of the twenty fish as \(\bar{x}_1\). Suppose we take a separate sample of size twenty from the same tank. Denote that sample mean as \(\bar{x}_2\). Would \(\bar{x}_1\) equal \(\bar{x}_2\)? Not necessarily. What if we took another sample and found the mean? Consider now taking 1000 random samples of size twenty and recording all of the sample means. We could take the 1000 sample means and create a histogram. This would give us a picture of what the distribution of the sample means looks like. That histogram approximates the sampling distribution of the sample mean, which is the distribution of the sample mean over all possible samples of size twenty.
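The repeated-sampling idea in the fish example can be sketched as a short simulation. This is only an illustration under stated assumptions: the tank contents are made up (5000 hypothetical fish with lengths generated around 30 cm), and the names `tank` and `simulate_sample_means` are invented for this sketch. It draws 1000 random samples of size twenty, records each sample mean, and prints a rough text histogram of those means.

```python
import random
import statistics
from collections import Counter


def simulate_sample_means(population, sample_size=20, num_samples=1000, seed=0):
    """Draw repeated random samples (without replacement within a sample)
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# ASSUMPTION: a hypothetical tank of 5000 fish with made-up lengths in cm.
# Real hatchery data would replace this list.
tank_rng = random.Random(42)
tank = [tank_rng.gauss(30.0, 5.0) for _ in range(5000)]

# 1000 samples of size twenty, one sample mean per sample.
sample_means = simulate_sample_means(tank, sample_size=20, num_samples=1000)

population_mean = statistics.fmean(tank)      # fixed parameter
population_sd = statistics.pstdev(tank)       # fixed parameter
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"population mean:         {population_mean:.2f}")
print(f"mean of sample means:    {mean_of_means:.2f}")
print(f"population SD:           {population_sd:.2f}")
print(f"SD of the sample means:  {sd_of_means:.2f}")

# Rough text histogram: group the sample means into 0.5 cm bins.
bins = Counter(round(m * 2) / 2 for m in sample_means)
for center in sorted(bins):
    print(f"{center:5.1f} | {'#' * (bins[center] // 5)}")
```

In a run of this sketch, the sample means should cluster around the population mean and vary much less than the individual fish lengths do. The exact numbers depend on the made-up tank and the seeds, so they are not quoted here.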

We can find the sampling distribution of any sample statistic that would estimate a certain population parameter of interest. In this Lesson, we will focus on the sampling distributions for the sample mean, \(\bar{x}\), and the sample proportion, \(\hat{p}\).

We begin by describing the sampling distribution of the sample mean and then apply the central limit theorem. Last, we will discuss the sampling distribution of the sample proportion.

Objectives

Upon successful completion of this lesson, you should be able to:

Understand the meaning of a sampling distribution.
Apply the central limit theorem to calculate approximate probabilities for sample means and sample proportions.
Describe the sampling distributions of the sample mean and the sample proportion.
Identify situations in which the normal distribution and t-distribution may be used to approximate a sampling distribution.