1.2.1 - Sampling Bias

Recall that the entire group of individuals of interest is called the population. It may be impractical or even impossible to gather data from the entire population. The subset of the population from which data are actually gathered is the sample. A sample should be selected from the population randomly; otherwise, it may be prone to bias. Our goal is to obtain a sample that is representative of the population.

Representative Sample
A subset of the population from which data are collected that accurately reflects the population
Bias
The systematic favoring of certain outcomes
Sampling Bias
Systematic favoring of certain outcomes due to the methods employed to obtain the sample

Example: Weight Loss Study Volunteers

A medical research center is testing a new weight loss treatment. They advertise on a social media site that they are looking for volunteers to participate. There is sampling bias because the sample will be limited to people who use the social media site where they advertised. The individuals who choose to participate may be different from the overall population. For example, volunteers may be individuals who are already actively trying to lose weight. This is not a representative sample because the sample may have characteristics that are different from the population of interest.
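The effect of volunteer self-selection can be illustrated with a small simulation. This is only a sketch on synthetic data: the population parameters (mean 80 kg, standard deviation 15 kg) and the rule that heavier people are more likely to volunteer are assumptions made up for illustration, and `compare_samples` is a hypothetical helper name, not part of the study described above.

```python
import random
import statistics


def compare_samples(seed=0, population_size=10_000, sample_size=200):
    """Return (population mean, random-sample mean, volunteer-sample mean)."""
    rng = random.Random(seed)

    # Synthetic population of body weights in kg (made-up parameters).
    population = [rng.gauss(80, 15) for _ in range(population_size)]

    # Simple random sample: every individual is equally likely to be chosen.
    random_sample = rng.sample(population, sample_size)

    # Volunteer sample: ASSUMPTION that heavier people are more likely to
    # respond to the advertisement, so the selection weight grows with body
    # weight. rng.choices draws with replacement, which is fine for a sketch.
    selection_weights = [max(w - 50, 1) for w in population]
    volunteer_sample = rng.choices(
        population, weights=selection_weights, k=sample_size
    )

    return (
        statistics.mean(population),
        statistics.mean(random_sample),
        statistics.mean(volunteer_sample),
    )


if __name__ == "__main__":
    pop_mean, random_mean, volunteer_mean = compare_samples()
    print(f"population mean:       {pop_mean:.1f} kg")
    print(f"random sample mean:    {random_mean:.1f} kg")
    print(f"volunteer sample mean: {volunteer_mean:.1f} kg")
```

Under these assumptions, the random sample's mean should land near the population mean, while the volunteer sample's mean is pulled upward. The size of the gap depends entirely on the made-up selection rule; the point is only that the error is systematic rather than random.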

Example: NYC Advertising Study

The marketing department for a large retail chain wants to survey its customers about a new advertising plan. They go into one of their largest New York City stores on a Tuesday morning and survey the first 50 people who make a purchase. There is sampling bias for a number of reasons. They are only sampling at one store, in New York City; there may be differences between the customers at this store and those who shop at the chain's other locations. By conducting their survey on a Tuesday morning, they are limiting themselves to individuals who are out shopping at that time; the sample may lack people who work during the day. Finally, they only survey people who make a purchase; individuals who do not make a purchase, perhaps because they are not satisfied with the store, will not be included in the sample. This is not a representative sample because the sample selected may be different from the population of interest.
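The same idea can be sketched for the store survey. Everything in the block below is hypothetical: the three store labels, the approval rates in `APPROVAL_RATE`, and the 60% share of customers who work during the day are invented to show the mechanism, not taken from any real survey. The sketch models only the one-store and weekday-morning restrictions, not the purchase-only restriction.

```python
import random

# Hypothetical approval rates for the advertising plan, by store.
# These numbers are made up for illustration only.
APPROVAL_RATE = {"NYC": 0.8, "Suburb": 0.5, "Rural": 0.2}


def build_customers(rng, n=5000):
    """Create a synthetic customer population across all stores."""
    customers = []
    for _ in range(n):
        store = rng.choice(list(APPROVAL_RATE))
        customers.append({
            "store": store,
            # ASSUMPTION: 60% of customers work during the day.
            "works_days": rng.random() < 0.6,
            "approves": rng.random() < APPROVAL_RATE[store],
        })
    return customers


def approval_share(people):
    """Proportion of the given people who approve of the plan."""
    return sum(p["approves"] for p in people) / len(people)


def run_survey(seed=1, sample_size=50):
    rng = random.Random(seed)
    customers = build_customers(rng)

    # Convenience sample: the first 50 customers found in the NYC store
    # who are free to shop on a weekday morning.
    available = [
        c for c in customers
        if c["store"] == "NYC" and not c["works_days"]
    ]
    convenience = available[:sample_size]

    # Simple random sample drawn from all customers at all stores.
    random_sample = rng.sample(customers, sample_size)
    return customers, convenience, random_sample


if __name__ == "__main__":
    customers, convenience, random_sample = run_survey()
    print(f"all customers:      {approval_share(customers):.2f}")
    print(f"convenience sample: {approval_share(convenience):.2f}")
    print(f"random sample:      {approval_share(random_sample):.2f}")
```

By construction, the convenience sample contains only NYC customers who do not work during the day, so its approval share reflects that one group rather than the chain's customers as a whole.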