Lesson 1: Collecting Data

Objectives

Upon successful completion of this lesson, you will be able to:

  • Identify cases and variables in a research study
  • Classify variables as categorical or quantitative
  • Identify explanatory and response variables in a research study
  • Distinguish between a sample and a population
  • Determine whether a given sample is representative of the intended population 
  • Identify simple random sampling and convenience sampling methods
  • Use Minitab Express to draw a simple random sample from a known population
  • Identify potential non-response and response bias
  • Distinguish between experimental and observational designs
  • Identify confounding variables
  • Identify randomized experiments
  • Determine when causal conclusions (as opposed to associations) can be made
  • Classify samples as being independent or paired
  • Identify control groups, placebos, and blinding in research studies and explain why each is used

In this lesson, you will learn about how data are collected. You will be introduced to the terminology that will be used throughout the course. At the end of this lesson, there are flash cards that you can use to review these terms. You may also want to make your own flash cards by hand to review these terms throughout the semester. 


1.1 - Cases & Variables

When conducting a research study, information is collected concerning cases. Cases are also sometimes known as units or experimental units.

A variable is a characteristic that is measured and can take on different values; in other words, something that can vary. This is in contrast to a constant, which is the same for all cases in a study.

Case
An experimental unit from which data are collected
Variable
Characteristic of cases that can take on different values (in other words, something that can vary)

Example: Student Data

Data are collected from a sample of STAT 200 students. Each student's major, quiz score, and lab assignment score are recorded.

The cases are the STAT 200 students. There are three variables: major, quiz score, and lab assignment score.

Example: Study Time & Grades

A third grade teacher wants to know if students who spend more time studying at home get higher homework and exam grades.

The third grade students are the cases. There are three variables: the amount of time spent studying at home, the homework grades, and the exam grades.

Example: Dog Food

A researcher wants to know if dogs who are fed only canned food have different body mass indexes (BMI) than dogs who are fed only hard food. They collect BMI data from 50 dogs who eat only canned food and 50 dogs who eat only hard food.

The cases are the dogs. There are two variables: type of food and BMI.

Example: Chocolate Chip Cookie

Research question: What is the average weight of a chocolate chip cookie?

The cases are the cookies. The variable is weight. 


1.1.1 - Categorical & Quantitative Variables

Variables can be classified as categorical or quantitative. Categorical variables are those that provide groupings that may have no logical order, or a logical order with inconsistent differences between groups (e.g., the difference between 1st place and 2nd place in a race is not equivalent to the difference between 3rd place and 4th place). Quantitative variables have numerical values with consistent intervals.

Categorical variable
Names or labels (i.e., categories) with no logical order or with a logical order but inconsistent differences between groups (e.g., rankings), also known as qualitative.
Quantitative variable
Numerical values with magnitudes that can be placed in a meaningful order with consistent intervals, also known as numerical.

Example: Weight

A team of medical researchers weigh participants in kilograms. Weight in kilograms is a quantitative variable because it takes on numerical values with meaningful magnitudes and equal intervals.

Example: Favorite Ice Cream Flavor

A teacher conducts a poll in her class. She asks her students if they would prefer chocolate, vanilla, or strawberry ice cream at their class party. Preferred ice cream flavor is a categorical variable because the different flavors are categories with no meaningful order of magnitudes. 

Example: Birth Location

A survey asks “On which continent were you born?” This is a categorical variable because the different continents represent categories without a meaningful order of magnitudes.

Example: Children per Household

A census asks every household in a city how many children under the age of 18 reside there. Number of children in a household is a quantitative variable because it has a numerical value with a meaningful order and equal intervals.

Example: Highway Mile Markers

When a car breaks down on the highway, the emergency dispatcher may ask for the nearest mile marker. Highway mile marker value is a quantitative variable because it is numeric with a meaningful order of magnitudes and equal intervals. 

Example: Running Distance

A runner records the distance he runs each day in miles. Distance in miles is a quantitative variable because it takes on numerical values with meaningful magnitudes and equal intervals. 

Example: Highest Level of Education

A census asks residents for the highest level of education they have obtained: less than high school, high school, 2-year degree, 4-year degree, master's degree, doctoral/professional degree. This is a categorical variable. While there is a meaningful order of educational attainment, the differences between each category are not consistent. For example, the difference between high school and 2-year degree is not the same as the difference between a master's degree and a doctoral/professional degree. Because there are not equal intervals, this variable cannot be classified as quantitative. 

Example: Online Courses Taught

A survey designed for online instructors asks, "How many online courses have you taught?" Three options are given: "none," "some," or "many." While there is a meaningful order of magnitudes, there are not equal intervals. This is a categorical variable.

If the survey had asked, "How many online courses have you taught? Enter a number." this would be a quantitative variable. Here, participants are answering with the number of online courses they have taught. This is a numerical value with a meaningful order of magnitudes and equal intervals. 


1.1.2 - Explanatory & Response Variables

In some research studies one variable is used to predict or explain differences in another variable. In those cases, the explanatory variable is used to predict or explain differences in the response variable. In an experimental study, the explanatory variable is the variable that is manipulated by the researcher. 

Explanatory Variable

Also known as the independent or predictor variable, it explains variations in the response variable; in an experimental study, it is manipulated by the researcher

Response Variable

Also known as the dependent or outcome variable, its value is predicted or its variation is explained by the explanatory variable; in an experimental study, this is the outcome that is measured following manipulation of the explanatory variable

Example: Panda Fertility Treatments

A team of veterinarians wants to compare the effectiveness of two fertility treatments for pandas in captivity. The two treatments are in-vitro fertilization and male fertility medications. This experiment has one explanatory variable: type of fertility treatment. The response variable is a measure of fertility rate.

Example: Public Speaking Approaches

A public speaking teacher has developed a new lesson that she believes decreases student anxiety in public speaking situations more than the old lesson. She designs an experiment to test if her new lesson works better than the old lesson. Public speaking students are randomly assigned to receive either the new or old lesson; their anxiety levels during a variety of public speaking experiences are measured.  This experiment has one explanatory variable: the lesson received. The response variable is anxiety level.

Example: Coffee Bean Origin

A researcher believes that the origin of the beans used to make a cup of coffee affects hyperactivity. He wants to compare coffee from three different regions: Africa, South America, and Mexico. The explanatory variable is the origin of coffee bean; this has three levels: Africa, South America, and Mexico. The response variable is hyperactivity level.

Example: Height & Age

A group of middle school students wants to know if they can use height to predict age. They take a random sample of 50 people at their school, both students and teachers, and record each individual's height and age. This is an observational study. The students want to use height to predict age so the explanatory variable is height and the response variable is age.

Example: Grade & Height

Research question: Do fourth graders tend to be taller than third graders?

This is an observational study. The researcher wants to use grade level to explain differences in height. The explanatory variable is grade level. The response variable is height. 


1.2 - Samples & Populations

We often have questions concerning large populations. Gathering information from the entire population is not always possible due to barriers such as time, accessibility, or cost. Instead of gathering information from the whole population, we often gather information from a smaller subset of the population, known as a sample.

Values concerning a sample are referred to as sample statistics while values concerning a population are referred to as population parameters.

Population
The entire set of possible cases
Sample
A subset of the population from which data are collected
Statistic
A measure concerning a sample (e.g., sample mean)
Parameter
A measure concerning a population (e.g., population mean)

The process of using sample statistics to make conclusions about population parameters is known as inferential statistics. In other words, data from a sample are used to make an inference about a population.

Inferential Statistics
Statistical procedures that use data from an observed sample to make a conclusion about a population

Example: Student Housing

A survey is carried out at Penn State Altoona to estimate the proportion of all undergraduate students living at home during the current term. Of the 3,838 undergraduate students enrolled at the campus, a random sample of 100 was surveyed.

  • Population: All 3,838 undergraduate students at Penn State Altoona
  • Sample: The 100 undergraduate students surveyed

We can use the data collected from the sample of 100 students to make inferences about the population of all 3,838 students.

Example: Polling Teachers

Educational policy researchers randomly selected 400 teachers from the National Science Teachers Association database of members and asked them whether or not they believed that evolution should be taught in public schools. They received responses from 252 teachers.

  • Population: All National Science Teachers Association members
  • Sample: The 252 respondents

The researchers can use the data collected from the 252 teachers who responded to the survey to make inferences about the population of all National Science Teachers Association members.

Example: Flipping a Coin

A fair coin is flipped 500 times and the number of heads is recorded.

  • Population: All flips of this coin
  • Sample: The 500 flips recorded in this study

We can use data from these 500 flips to make inferences about the population of all flips of this coin.
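
The coin example can be simulated to see a sample statistic next to the parameter it estimates. The following is a minimal Python sketch, assuming a fair coin whose true probability of heads is 0.5. The function name `flip_coin` and the seed value are illustrative choices, not part of the course materials.

```python
import random

def flip_coin(n_flips, p_heads=0.5, seed=None):
    """Simulate n_flips coin flips and return the number of heads."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_flips) if rng.random() < p_heads)

# Population parameter: the true probability of heads (assumed fair coin).
p_parameter = 0.5

# Sample statistic: the proportion of heads in one sample of 500 flips.
n = 500
heads = flip_coin(n, p_heads=p_parameter, seed=1)
p_hat = heads / n

print(f"heads = {heads}, sample proportion = {p_hat:.3f}, parameter = {p_parameter}")
```

Changing the seed gives a different sample of 500 flips, so the sample proportion will usually land near 0.5 without being exactly equal to it. The parameter itself stays fixed.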


1.2.1 - Sampling Bias

Recall that the entire group of individuals of interest is called the population. It may be unrealistic or even impossible to gather data from the entire population. The subset of the population from which data are actually gathered is the sample. A sample should be selected from a population randomly; otherwise, it may be prone to bias. Our goal is to obtain a sample that is representative of the population.

Representative Sample
A subset of the population from which data are collected that accurately reflects the population
Bias
The systematic favoring of certain outcomes
Sampling Bias
Systematic favoring of certain outcomes due to the methods employed to obtain the sample

Example: Weight Loss Study Volunteers

A medical research center is testing a new weight loss treatment. They advertise on a social media site that they are looking for volunteers to participate. There is sampling bias because the sample will be limited to people who use the social media site where they advertised. The individuals who choose to participate may be different from the overall population. For example, volunteers may be individuals who are already actively trying to lose weight. This is not a representative sample because the sample may have characteristics that are different from the population of interest.

Example: NYC Advertising Study

The marketing department for a large retail chain wants to survey their customers about a new advertising plan. They go into one of their largest New York City stores on a Tuesday morning and survey the first 50 people who make a purchase. There is sampling bias for a number of reasons. They are only sampling at one store, in New York City; there may be differences between the customers at this store and those that shop at their other locations. By conducting their survey on a Tuesday morning they are limiting themselves to individuals who are out shopping at that time; the sample may lack people who work during the day. Finally, they only survey people who make a purchase; individuals who do not make a purchase, perhaps because they are not satisfied with the store, will not be included in their sample. This is not a representative sample because the sample selected may be different from the population of interest.


1.2.2 - Sampling Methods

There are many different ways to select a sample from a population. Some of these methods are probability-based, such as the simple random sampling method that you'll read about below and in your textbook. Other probability-based methods include cluster sampling methods and stratified sampling methods. You may learn more about these if you take a research methods course in the future. Other sampling methods are not probability-based, such as convenience sampling methods which you'll read about below.

Simple Random Sampling

To prevent sampling bias and obtain a representative sample, a sample should be selected using a probability-based sampling design which gives each individual a known chance of being selected. The most common probability-based sampling method is the simple random sampling method.

Using this method, a sample is selected without replacement. This means that once an individual has been selected to be a part of the sample they cannot be selected a second time. If multiple samples are being taken (e.g., when constructing a sampling distribution in Lesson 4), an individual can appear in more than one sample, but only once in each sample.

Simple Random Sampling
A method of obtaining a sample from a population in which every member of the population has an equal chance of being selected

Example: Community Service Attitudes

An institutional researcher is conducting a study of World Campus students’ attitudes toward community service. He takes a list of all 12,242 World Campus students and uses a random number generator to select 30 students whom he contacts to complete the survey. This researcher used simple random sampling because participants were selected from the overall population in a way that each individual had an equal chance of being selected.

Example: Languages

A student wants to learn more about the languages spoken in her town. She has access to the census forms submitted by all 3,500 households in her town. It would take too long for her to go through all 3,500 forms, so she uses a random number generator to select 100 households. She finds those 100 census forms and records data concerning the languages spoken in those households. This is a simple random sample because the sample of 100 households was selected in a way that each of the 3,500 households had an equal chance of being selected.
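
The selection step in the Languages example can be sketched in Python. This is only an illustration, assuming each household is identified by a number from 1 to 3,500; the seed is arbitrary and is there only to make the run repeatable.

```python
import random

# Sampling frame: one ID per household (1..3500), as in the example above.
households = list(range(1, 3501))

rng = random.Random(2024)  # arbitrary seed so the run is repeatable

# random.sample draws without replacement, so no household appears twice
# and every household has the same chance of being selected.
sample = rng.sample(households, k=100)

print(sorted(sample)[:10])  # first few selected household IDs
```

Because the draw is without replacement, the 100 selected IDs are all distinct, which matches the description of simple random sampling above.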

Convenience Sampling

While probability-based sampling methods are considered better because they can prevent sampling bias, there are times when it is not possible to use one of these methods. For example, a researcher may not have access to the entire population. In cases where probability-based sampling methods are not practical, convenience samples are often used.

Convenience Sampling
A method of obtaining a sample from a population by ease of accessibility; such a sample is not random and may not be representative of the intended population.

Example: Weight Loss Supplements

A weight loss company wants to compare how much weight adults lose on their supplement versus a competitor's supplement. To recruit participants, they post an advertisement in a newspaper asking for adults who want to lose weight. This is an example of a volunteer sample which is a convenience sampling method. The researchers are using a sample of individuals who volunteer to participate.

Example: Chocolate Preferences

A chocolate company wants to know if customers prefer their dark chocolate with or without peanuts. They set up a table in a grocery store on a Monday morning, offer customers samples of their dark chocolate with and without peanuts, and ask which they prefer. This is an example of a convenience sampling method. The sample is not being selected using any probability-based method and may not be representative of the company's intended population. People who grocery shop may be a special subset of the population.  For example, people who do not work traditional full-time jobs may be more likely to grocery shop at that time. The researchers are using a sample of individuals who happen to be grocery shopping on a Monday morning and who volunteer to eat their chocolate.


1.2.2.1 - Minitab Express: Simple Random Sampling

Using simple random sampling methods, each member of the population has an equal chance of being selected. We can use statistical software to select a simple random sample.

In the example below we will randomly select 10 names from a class list.

Minitab Express – Random Sampling from a Column

We could place those names into a column in Minitab Express.

Open the following data set:

and randomly select 10 using Minitab Express by:

  1. On a PC or Mac select DATA > Sample from Columns
  2. Double-click on the variable Name in the box to the left to insert it into the "Take a sample from the following columns" box.
  3. In the box labeled "Number of rows in each sample", enter 10.
  4. By default, leave the method as "Sample without replacement". 
  5. Click OK.

The result should be the following output:

Summary
Input
  Source data column: Name
  Number of rows sampled: 10
  Method: Without replacement
Output
  Sampled data column: C2
10 rows were sampled from Name and stored in C2.

Along with a random sample of the names in the second column in the data worksheet:

Sample Data
Row  C1 (Name)  C2 (Sample From Name)
1    Beckman    Qi
2    Beeson     Song
3    Boone      Walia
4    Botero     Gruver
5    Brooks     Corey
6    Brown      Cingolani
7    Campbell   Farooq
8    Cao        Yan
9    Chen       O'Donnell
10   Chen       Wang
11   Chung

Since we are using simple random sampling procedures, the results will be different each time due to random sampling variation. Try these steps a few times; you should see that you get a different set of 10 names each time.


1.3 - Other Sources of Bias

On the previous pages you learned about sampling bias and how simple random sampling methods can be used to avoid sampling bias. Here, we will discuss two other sources of bias: non-response bias and response bias. These are both problems that should be prevented in the design of a research study.

Non-Response Bias
Systematic favoring of certain outcomes that occurs when the individuals who choose to participate in a study differ from the individuals who choose to not participate
Response Bias
Systematic favoring of certain outcomes that occurs when participants do not respond truthfully; they may do so to align with social norms or to appease the researcher

Example: Restaurant Experience Survey

A restaurant invited their recent customers to complete an online survey. Customers who had really strong feelings about their experience, either positive or negative, were very likely to complete the survey while customers who had a neutral experience were much less likely to complete the survey. This is an example of non-response bias because the individuals who chose to participate differed from those who chose to not participate.

Example: Retail Store Hours

A retail store was considering expanding their operating hours. To determine if this was a need perceived by their customers, they conducted a survey over the telephone to obtain data. Research assistants called the phone numbers of customers who were randomly selected to participate between the hours of 9AM and 4PM. Individuals who were at work were less likely to answer their phone call or agree to participate in the study than individuals who were at home at that time. This is an example of non-response bias because the individuals who responded to the survey were different from individuals who did not respond in terms of their work schedule.

Example: Sexual Activity Survey

A psychologist is conducting a research study concerning sexual activities. The survey is administered over the phone and many of the questions are personal. Some participants feel uncomfortable and do not answer honestly. This is an example of response bias because the participants are not responding truthfully; instead their responses are biased toward what they perceive as being socially acceptable.

Example: Cheating in Class

Using an anonymous online survey, a professor asks his students “Have you cheated on an exam in my class?” Many of the students who have cheated still answered “no.” This is an example of response bias because the participants are not responding truthfully; instead their responses are biased toward responses that are less likely to get them in trouble.


1.4 - Research Study Design

Experimental and Observational Designs

Research studies are often classified in terms of their designs. Here, we will make the distinction between experimental and observational research designs.

Experimental Research Design

A study in which the researcher manipulates the treatments received by subjects and collects data; also known as a scientific study

Observational Research Design

A study in which the researcher collects data without performing any manipulations; also known as a non-experimental study

Example: Caffeinated Coffee Studies

An organization wants to know if drinking caffeinated coffee causes hyperactivity in college students. To test their research question, they select a sample of college students and give them a survey concerning their intake of caffeinated coffee and their hyperactivity levels. This is an observational study because the researchers are not making any manipulations. They are observing what is happening without intervening. This is not an experiment because no treatment was imposed by the researchers.

Another organization also wants to know if drinking caffeinated coffee causes hyperactivity in college students. They design a different study. They select a random sample of college students and randomly assign them to drink coffee with or without caffeine. The researchers observe the students' behaviors. This is an experimental study because a treatment is being imposed. The researchers are manipulating the treatment that each participant receives.

On Your Own

A team of researchers want to know if Advil or Tylenol is more effective.

Think about the following data collection methods, then compare your answers with the explanations given.

Method 1
Researchers survey a sample of adults and ask if they use Advil or Tylenol. They ask them to rate the effectiveness of the one they use. Is this an observational study or experimental study?
This is an observational study because the researchers observed the difference between two existing groups (Advil and Tylenol users). The researchers did not manipulate the participants' experiences. 
Method 2
Researchers obtain a random sample of adults. They randomly assign half of the participants to take Advil and the other half to take Tylenol. They ask each participant to rate the effectiveness of the one that they were assigned to take. Is this an observational study or experimental study?
This is an experimental study because the researchers assigned participants to groups. 

1.4.1 - Confounding Variables

Experimental studies are typically preferred over observational studies because they allow for more control. A common problem with observational studies is that there may be other variables influencing the results that the researchers were not able to take into account. These are known as confounding variables.

Confounding Variable

Characteristic that varies between cases and is related to both the explanatory and response variables; also known as a lurking variable or a third variable

Example: Ice Cream & Home Invasions

There is a positive relationship between ice cream sales and home invasions (i.e., as ice cream sales increase throughout the year so do home invasions). It is clear that increases in ice cream sales do not cause home invasions to increase, and home invasions do not cause an increase in ice cream sales. There is a third variable at play here: outdoor temperature. When the weather is warmer both ice cream sales and home invasions increase. In this case, outdoor temperature is a confounding variable.


1.4.2 - Causal Conclusions

In order to control for confounding variables, participants can be randomly assigned to different levels of the explanatory variable. This act of randomly assigning cases to different levels of the explanatory variable is known as randomization. An experiment that involves randomization may be referred to as a randomized experiment or randomized comparative experiment. By randomly assigning cases to different conditions, a causal conclusion can be made; in other words, we can say that differences in the response variable are caused by differences in the explanatory variable. Without randomization, an association can be noted, but a causal conclusion cannot be made.

Note that randomization and random sampling are different concepts. Randomization refers to the random assignment of experimental units to different conditions (e.g., different treatment groups). Random sampling refers to probability-based methods for selecting a sample from a population.
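
Random assignment can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `randomize`, the participant IDs, and the seed are all hypothetical, and the two conditions are borrowed from the caffeinated coffee example.

```python
import random

def randomize(cases, conditions, seed=None):
    """Randomly assign cases to conditions in (nearly) equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    # Deal the shuffled cases out to the conditions in turn.
    for i, case in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(case)
    return groups

# Hypothetical participant IDs; in practice these would be the sampled cases.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomize(participants, ["caffeine", "no caffeine"], seed=7)

for condition, members in groups.items():
    print(condition, members)
```

Note that this code assigns cases that are already in the study; it does not select them from a population. Selecting the cases would be the separate random sampling step.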

Randomization
The act of randomly assigning cases to different levels of the explanatory variable
Causation
Changes in one variable can be attributed to changes in a second variable
Association
A relationship between variables

1.4.3 - Independent and Paired Samples

In both observational and experimental studies, we often want to compare two or more groups. When comparing two or more groups, cases may be independent or paired.

Independent Groups
Cases in each group are unrelated to one another.
Paired Groups

Cases in each group are meaningfully matched with one another; also known as dependent samples or matched pairs

Example: Exam Scores

An instructor wants to compare students' scores on the midterm and final exam. This is most often done by obtaining a sample of students and recording each student's midterm exam score and final exam score. In other words, there would be two measurements for each student. This is an example of a matched pairs design because data would be paired by student. 

Example: Shoes

A shoe company is studying how many shoes Italian men and women own. In one research study they take a random sample of 500 Italian adults and ask each individual if they identify as a man or woman and how many pairs of shoes they own. The men and women in this study are in two independent groups.

In a second study the researchers use a different design. This time they take a random sample of 250 heterosexual married couples in Italy (i.e., 250 husbands and 250 wives). They record the number of shoes owned by each husband and each wife. This is an example of a matched pairs design. Data are paired by couple.


1.4.4 - Control and Placebo Groups

A control group is an experimental condition that does not receive the actual treatment and may serve as a baseline. A control group may receive a placebo or they may receive no treatment at all. A placebo is something that appears to the participants to be an active treatment, but does not actually contain the active treatment. For example, a placebo pill is a sugar pill that participants may take not knowing that it does not contain any active medicine. This can lead to a psychological phenomenon called the placebo effect, which occurs when participants who are given a placebo treatment experience a change even though they are not receiving any active treatment. Researchers use placebos in the control group to determine if any differences between groups are due to the active medicine or the participants' perceptions (the placebo effect).

Control Group
A level of the explanatory variable that does not receive an active treatment; they may receive no treatment or a placebo
Placebo Group
A group that receives what, to them, appears to be a treatment, but actually is neutral and does not contain any active treatment (e.g., a sugar pill in a medication study)

Example: Vitamin B Energy Study

Researchers want to know if adults who consume a drink that is high in vitamin B-12 have increased energy. They obtain a representative sample of adults. All participants are given a drink that they are told to consume every morning. They are not told what is in the drink. Half are given a drink that is high in vitamin B-12 while the other half are given a drink that tastes the same but contains no vitamin B-12.

The participants who received the drink with no vitamin B-12 are the placebo group. The purpose of the placebo group in this study is to make the two groups equivalent except for the presence of the vitamin B-12. By comparing these two groups, the researchers will be able to determine what impact the vitamin B-12 had on the response variable. We could also say that this served as a control group because this group did not receive any active ingredients. 


1.4.5 - Blinding

Blinding techniques are also used to avoid bias. In a single-blind study the participants do not know what treatment groups they are in, but the researchers interacting with them do know. In a double-blind study, the participants do not know what treatment groups they are in and neither do the researchers who are interacting with them directly. Double-blind studies are used to prevent researcher bias. 

Blinding
Procedure employed in research to prevent bias in which the participants and/or the researchers interacting with the participants do not know which treatment each case is receiving
Single-Blind Study
Research study in which the participants do not know the treatment group that they have been assigned to
Double-Blind Study
Research study in which neither the participants nor the researchers interacting with them know which cases have been assigned to which treatment groups

Example: Yogurt Tasting

Researchers are comparing a low-fat blueberry yogurt to a high-fat blueberry yogurt. Participants are randomly assigned to receive one type of yogurt. After tasting it, they complete an online survey. The researchers know which yogurt containers are low-fat and which are high-fat, but participants are not told. This is an example of a single-blind study because the researchers know which participants are in the low- and high-fat groups but the participants do not know. A double-blind study may not be necessary in this case since the researchers have only minimal contact with the participants. 

Example: Caffeine Energy Study

Researchers want to know if adult males who consume high amounts of caffeine interact more energetically. They obtain a representative sample and randomly assign half of the participants to take a caffeine pill and half to take a placebo pill.  The pills are randomly numbered and coded so at the time the researchers do not know which participants have been given caffeine and which have been given the placebo. All participants are told that they may have been given a caffeine pill. After taking the pill, researchers observe the participants interacting with one another and rate the interactions in terms of level of energy. 

This is a double-blind study because neither the researchers nor the participants know who is in which group at the time the data are collected. After the data are collected, researchers can look at the pill codes to determine which groups the participants were in to conduct their analyses. A double-blind study is necessary here because the researchers are observing and rating the participants. If the researchers know who is in the caffeine group they may be more likely to rate their levels of energy as very high because that is consistent with their hypothesis. 
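
The pill-coding idea in this example can be sketched in Python. This is a hypothetical illustration only: the function `blind_assignments`, the four-digit codes, the participant IDs, and the seed are assumptions made for the sketch, not details from the study described above.

```python
import random

def blind_assignments(participants, seed=None):
    """Return (blinded_view, key): what raters see, and the sealed code key."""
    rng = random.Random(seed)
    participants = list(participants)
    # Randomly assign half to caffeine and half to placebo.
    order = list(participants)
    rng.shuffle(order)
    half = len(order) // 2
    treatment = {p: "caffeine" for p in order[:half]}
    treatment.update({p: "placebo" for p in order[half:]})
    # Give each pill a distinct random code that reveals nothing about its content.
    codes = rng.sample(range(1000, 10000), k=len(participants))
    blinded_view = dict(zip(participants, codes))
    key = {blinded_view[p]: treatment[p] for p in participants}
    return blinded_view, key

# Hypothetical participant IDs.
participants = [f"P{i:02d}" for i in range(1, 11)]
blinded_view, key = blind_assignments(participants, seed=3)

# During data collection, raters see only participant IDs and pill codes.
print(blinded_view)

# After data collection, the key is used to recover the treatment groups.
unblinded = {p: key[code] for p, code in blinded_view.items()}
```

Keeping `key` separate from `blinded_view` until the ratings are complete is what makes the design double-blind in this sketch: neither the participants nor the raters can tell from a code which pill it labels.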


1.5 - Lesson 1 Summary

Lesson 1: Learning Objectives

Upon successful completion of this lesson, you will be able to:

  • Identify cases and variables in a research study
  • Classify variables as categorical or quantitative
  • Identify explanatory and response variables in a research study
  • Distinguish between a sample and a population
  • Determine whether a given sample is representative of the intended population 
  • Identify simple random sampling and convenience sampling methods
  • Use Minitab Express to draw a simple random sample from a known population
  • Identify potential non-response and response bias
  • Distinguish between experimental and observational designs
  • Identify confounding variables
  • Identify randomized experiments
  • Determine when causal conclusions (as opposed to associations) can be made
  • Classify samples as being independent or paired
  • Identify control groups, placebos, and blinding in research studies and explain why each is used

In this lesson you learned about how data are collected. You were introduced to terminology that will be used throughout the course and you examined different types of research study designs (experimental and observational), sampling methods, and sources of bias. You learned that in order to make generalizations from a sample to a population the sample must be representative of the population; ideally the sample should be randomly selected using a probability-based sampling method such as simple random sampling. In order to make a causal conclusion, randomization is required.

