3: Sampling
Overview
Case-Study: Effectiveness of PPH
Jaylen wants to study the effectiveness of PPH (postpartum hemorrhage) simulation for learning how to appropriately identify and use nursing interventions, and then apply the results to the population of OB staff nurses in the US. To complete this study, Jaylen needs to apply population and sampling concepts, and he has some decisions to make about how to conduct it. He knows he is going to use OB staff nurses at a local hospital. He also has a survey to be administered before and after the nurses participate in simulation training. His first thought is to take a simple random sample of the nurses. However, Jaylen could divide the population into groups based on number of years of experience and provide the simulation experience to a simple random sample from each group, thus using a stratified random sample. Alternatively, he could randomly select one team of nurses for a cluster sample. Jaylen needs to consider the advantages and disadvantages of each of these strategies. Let’s see how we can help him decide.
Objectives
- Identify a population versus a sample
- Apply different techniques for drawing samples
- Identify experimental vs observational studies
- Identify problems in data collection that result in bias
3.1 - Samples & Populations
While Jaylen’s decision may not seem that important (he will still be working with nurses and conducting research either way), the choice of a sampling strategy has far-reaching implications for both the statistics and the conclusions a researcher can draw from the results.
First let’s talk about populations and samples.
Like Jaylen, who wants to know more about ALL nurses, we often have questions concerning large populations. Gathering information from the entire population is not always possible due to barriers such as time, accessibility, or cost. Imagine trying to perform Jaylen’s study on every nurse across the world! Instead of gathering information from the whole population, we often gather information from a smaller subset of the population, known as a sample.
Values concerning a sample are referred to as sample statistics while values concerning a population are referred to as population parameters.
- Population
- The entire set of possible cases
- Sample
- A subset of the population from which data are collected
- Statistic
- A measure concerning a sample (e.g., sample mean)
- Parameter
- A measure concerning a population (e.g., population mean)
The process of using sample statistics to make conclusions about population parameters is known as inferential statistics. In other words, data from a sample are used to make an inference about a population.
- Inferential Statistics
- Statistical procedures that use data from an observed sample to make a conclusion about a population
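The distinction between a parameter and a statistic can be illustrated with a short simulation. This is only a sketch: the population of 5,000 scores is made up (the values, the seed, and the sample size of 100 are all assumptions chosen for illustration, not data from Jaylen's study).

```python
import random
import statistics

# Hypothetical population: simulated skill scores for 5,000 nurses (made-up data).
rng = random.Random(42)
population = [rng.gauss(70, 10) for _ in range(5000)]

# Parameter: a measure computed on the entire population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed on a sample drawn without replacement.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

In practice the population mean is usually unknown, which is why the sample mean is used to make an inference about it.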
3.2 - Sampling Bias
Recall that the entire group of individuals of interest is called the population. It may be unrealistic or even impossible to gather data from the entire population. The subset of the population from which data are actually gathered is the sample. A sample should be selected from a population randomly; otherwise, it may be prone to bias. Our goal is to obtain a sample that is representative of the population. In Jaylen’s case, he needs to obtain a sample of nurses that is representative of the population of nurses.
- Representative Sample
- A subset of the population from which data are collected that accurately reflects the population
If Jaylen fails to sample appropriately, he risks introducing sampling bias into his results. Sampling bias occurs when there is a systematic favoring of certain outcomes due to the methods employed to obtain the sample.
Jaylen needs to be aware of three main sources of bias.
Types of Bias
- Non-response – a large percentage of those sampled do not respond or participate.
- Response – study participants either do not respond truthfully or give answers they feel the researcher wants to hear. For example, when students are asked if they have ever cheated on an exam, even those who have may respond with "no".
- Selection – the sample selected does not reflect the population of interest. For instance, you are interested in the attitude of female students regarding campus safety, but when sampling you also include males. In this case your population of interest was female students; however, your sample included subjects not in that population (i.e., males).
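To see how one of these biases can distort a result, here is a small simulation of non-response bias. It is a sketch under stated assumptions: the satisfaction scores are made up, and the response model (people with higher scores are more likely to reply) is a hypothetical rule invented for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 10,000 satisfaction scores on a 0-100 scale (made-up data).
population = [min(100.0, max(0.0, rng.gauss(60, 15))) for _ in range(10000)]

# Draw a simple random sample of 1,000 people to survey.
surveyed = rng.sample(population, 1000)

# Assumed response model: the higher the score, the more likely a person replies.
respondents = [score for score in surveyed if rng.random() < score / 100]

print(f"population mean:      {statistics.mean(population):.1f}")
print(f"mean of all surveyed: {statistics.mean(surveyed):.1f}")
print(f"mean of respondents:  {statistics.mean(respondents):.1f}")
print(f"response rate:        {len(respondents) / len(surveyed):.0%}")
```

Even though the people surveyed were chosen at random, the mean among those who actually responded drifts upward, because the non-responders differ systematically from the responders.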
3.3 - Sampling Methods
As Jaylen has discovered, there are many different ways to select a sample from a population. Some of these methods are probability-based, such as the simple random, stratified, and cluster sampling methods that you'll read about below and in your textbook. Other sampling methods are not probability-based, such as convenience sampling, which you'll also read about below.
Simple Random Sampling
To prevent sampling bias and obtain a representative sample, a sample should be selected using a probability-based sampling design which gives each individual a known chance of being selected. The most common probability-based sampling method is the simple random sampling method.
Using this method, a sample is selected without replacement. This means that once an individual has been selected to be a part of the sample they cannot be selected a second time. If multiple samples are being taken, an individual can appear in more than one sample, but only once in each sample.
- Simple Random Sampling
- A method of obtaining a sample from a population in which every member of the population has an equal chance of being selected
- Stratified random sample
- A method in which the population of interest is first divided into strata (groups) based on some characteristic, and a simple random sample is then drawn from each stratum. In Jaylen’s example, the strata would be based on years of experience (for example, new, mid-career, and experienced nurses).
- Cluster sample
- A method in which the population of interest is divided into clusters and one or more whole clusters are randomly selected. For instance, Jaylen could obtain a list of the nursing teams at the hospital. Each team would be a cluster. Then he could randomly select one of the teams to use as his sample.
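The three probability-based methods defined above can be sketched in a few lines of Python. The roster here is hypothetical: the 60 nurses, the three experience levels, the five team labels, and the sample sizes are all assumptions made for illustration.

```python
import random

rng = random.Random(1)

# Hypothetical roster: 60 nurses, each with an experience level and a team (made-up data).
levels = ["new", "mid-career", "experienced"]
teams = ["A", "B", "C", "D", "E"]
nurses = [
    {"id": i, "experience": levels[i % 3], "team": teams[i % 5]}
    for i in range(60)
]

# Simple random sample: every nurse has an equal chance, and no one is picked twice.
srs = rng.sample(nurses, 12)

# Stratified random sample: a simple random sample of 4 from each experience level.
stratified = []
for level in levels:
    stratum = [n for n in nurses if n["experience"] == level]
    stratified.extend(rng.sample(stratum, 4))

# Cluster sample: randomly pick one whole team and keep every nurse on it.
chosen_team = rng.choice(teams)
cluster = [n for n in nurses if n["team"] == chosen_team]

print("simple random sample size:", len(srs))
print("stratified sample size:   ", len(stratified))
print("cluster sample (team", chosen_team + ") size:", len(cluster))
```

Note the trade-off: the stratified sample is guaranteed to include every experience level, while the cluster sample only needs a list of teams rather than access to individual nurses.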
Convenience Sampling
While probability-based sampling methods are considered better because they can prevent sampling bias, there are times when it is not possible to use one of these methods. For example, a researcher may not have access to the entire population. In cases where probability-based sampling methods are not practical, convenience samples are often used.
- Convenience Sampling
- A method of obtaining a sample from a population by ease of accessibility; such a sample is not random and may not be representative of the intended population.
3.4 - Experimental and Observational Studies
Now that Jaylen can weigh the different sampling strategies, he might want to consider the type of study he is conducting. Students interested in research designs can consult STAT 503 for a much more in-depth discussion. For this example, we will simply distinguish between experimental and observational studies.
Now that we know how to collect data, the next step is to determine the type of study. The type of study will determine what type of relationship we can conclude.
There are predominantly two different types of studies:
- Observational
- A study where a researcher observes and records measurements without manipulating any variables. These studies can show that there may be a relationship, but not necessarily a cause-and-effect relationship.
- Experimental
- A study that involves random assignment of a treatment; because of this, researchers can draw cause-and-effect (or causal) conclusions. An experimental study may also be called a scientific study or an experiment.
Let's say that there is an option to take quizzes throughout this class. In an observational study, we may find that better students tend to take the quizzes and do better on exams. Consequently, we might conclude that there may be a relationship between quizzes and exam scores.
In an experimental study, we would randomly assign some students to take the quizzes and others not to, and then compare the exam scores of the two groups. In other words, we would look to see whether taking quizzes causes higher exam scores.
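Random assignment for the quiz example could be carried out as in the sketch below. The roster of 20 student labels and the even split into two groups are assumptions made for illustration.

```python
import random

rng = random.Random(7)

# Hypothetical class roster (made-up labels).
students = [f"student_{i:02d}" for i in range(1, 21)]

# Shuffle a copy of the roster, then split it in half.
# Chance alone, not the students or the researcher, decides who takes quizzes.
shuffled = students[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
quiz_group = shuffled[:half]
no_quiz_group = shuffled[half:]

print("quiz group:   ", sorted(quiz_group))
print("no-quiz group:", sorted(no_quiz_group))
```

Because assignment is random, stronger and weaker students are expected to be spread across both groups, so a difference in exam scores can more plausibly be attributed to the quizzes themselves.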
Causation
It is very important to distinguish between observational and experimental studies, since one has to be very skeptical about drawing cause-and-effect conclusions from observational studies. The random assignment of treatments (i.e., what distinguishes an experimental study from an observational study) is what allows one to draw cause-and-effect conclusions.
Ethics is an important aspect of experimental design to keep in mind, because it is not always acceptable to assign a treatment. For example, the original relationship between smoking and lung cancer was based on an observational study and not an assignment of smoking behavior, since it would be unethical to assign people to smoke.
3.6 - Summary
Case-Study: Effectiveness of PPH
So Jaylen is really thinking ahead in terms of different possible strategies for his study. He is considering his sampling, and is planning on using an experimental design. What kinds of conclusions may he be able, or not be able, to draw from his study? If he uses the stratified sample, then he is sure that he has nurses with different experience levels in his sample. He might be interested in these specific groupings and want to draw conclusions about the impact of the simulation education on their nursing skills. On the other hand, Jaylen may not be able to select individual nurses. He may have to offer the training to an entire group of nurses, making the cluster sample more practical. This strategy might limit his ability to make generalizations based on the characteristics he would have stratified on.
There is no right or wrong answer to how to sample. Researchers, particularly those in the social sciences, often have to work within the limitations of their resources or access to participants, and a sampling design may be dictated by those limitations. Given infinite resources, on the other hand, sampling can become a very complex process that ensures the groups of interest, along with all confounding variables, are taken into consideration. Most research falls somewhere in between. Importantly, this does not make the research any less valuable, as long as we apply our growing knowledge of statistics and understand the limitations we encounter.