4.3 - Introduction to Bootstrapping

To construct a confidence interval we need information about the sampling distribution. In Lesson 4.1 we saw how we could construct a sampling distribution when population values were known. What if population values are not known? This is usually the case. If we have sample data, we can use bootstrapping methods to build a bootstrap sampling distribution, and then use that distribution to construct a confidence interval.

Bootstrapping is a resampling procedure that uses data from one sample to generate a sampling distribution by repeatedly taking random samples, with replacement, from that original sample.

Bootstrapping
A resampling procedure for constructing a sampling distribution using data from a sample

Example: Bootstrap Distribution for Mean Height

We have data concerning the heights of individuals in a random sample of \(n=15\). To construct a bootstrap distribution for the mean height we would first randomly select one individual from that sample and record their height. Then, with that individual placed back into the sample, we would randomly select a second individual and record their height. This is known as "sampling with replacement" because we are putting each case back into the sample after recording their height. We would repeat this process until we have selected 15 values, the same number as in the original sample. Because we are sampling with replacement, some individuals may appear in the bootstrap sample more than once and others may not appear at all. We would use those 15 selected values to compute a bootstrapped sample mean. This process is repeated many times. The distribution of many bootstrapped sample means is known as the bootstrap distribution or bootstrap sampling distribution.
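
The procedure described above can be sketched in a few lines of Python. This is only an illustration, not the StatKey implementation used in this course: the 15 height values (in inches), the function name `bootstrap_means`, the number of resamples, and the random seed are all made up for the example.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=5000, seed=0):
    """Return a list of bootstrapped sample means (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values WITH replacement from the original sample,
        # so the same individual can be selected more than once.
        resample = rng.choices(sample, k=n)
        means.append(statistics.mean(resample))
    return means


# Hypothetical heights in inches for a sample of n = 15 (not real data).
heights = [62, 64, 65, 66, 66, 67, 68, 68, 69, 70, 70, 71, 72, 73, 75]

boot = bootstrap_means(heights)

print("Original sample mean:", statistics.mean(heights))
print("Mean of bootstrap means:", statistics.mean(boot))
print("Std. dev. of bootstrap means:", statistics.stdev(boot))
```

The list `boot` plays the role of the bootstrap distribution: each entry is the mean of one bootstrap sample of 15 values. Its center should fall close to the original sample mean, and its spread gives a sense of how much a sample mean varies from sample to sample.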

The following pages include additional video examples that use StatKey to demonstrate the construction of bootstrap sampling distributions.