5.3 - Randomization Procedures
Like bootstrapping procedures, randomization procedures use resampling techniques to construct a sampling distribution that can be used to make inferences about the population. What makes a randomization distribution different is that it is constructed under the assumption that the null hypothesis is true, so it will be centered on the value stated in the null hypothesis.
StatKey can be used to construct a randomization distribution for a single mean, single proportion, difference in means, difference in proportions, the slope of a simple linear regression model, or a correlation (Pearson's r). Minitab Express can conduct a randomization test for a single mean, single proportion, or difference in means.
The video below walks through an example of using StatKey to construct a randomization distribution. It also looks ahead to the next section and uses that randomization distribution to determine the p-value.
These are the steps that we will be using to conduct hypothesis tests this semester:
1. Determine what type of test you need to conduct and write the hypotheses.
2. Construct a randomization distribution under the assumption that the null hypothesis is true.
3. Use the randomization distribution to find the p-value.
4. Decide if you should reject or fail to reject the null hypothesis.
5. State a real-world conclusion in relation to the original research question.
Here, you learned how to complete Step 2. On the next page you will learn how to use this randomization distribution to complete Steps 3 through 5.
5.3.1 - StatKey Randomization Methods (Optional)
The following information goes beyond what you are expected to know for this course. Here, details about all of the randomization procedure options available in StatKey are covered. In STAT 200 you will always be using the default randomization methods. The information here is optional and is meant to provide extra details to individuals who are interested in learning more, beyond what is required of most introductory statistics courses.
Randomization Test for One Mean
In StatKey there is only one method for conducting a randomization test for one mean. The sample is shifted so that the sample mean equals the hypothesized population mean (i.e., the value in the null hypothesis). Samples of the same size as the original sample are drawn with replacement from the shifted distribution and the mean of each randomization sample is recorded on the randomization distribution dotplot.
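The shift-and-resample procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not StatKey's actual code; the function name `randomization_means_one_sample`, the data in `body_temps`, and the null value of 98.6 are all hypothetical.

```python
import random
from statistics import mean

def randomization_means_one_sample(sample, null_mean, n_samples=5000, seed=0):
    """Return the means of randomization samples drawn from the shifted sample."""
    rng = random.Random(seed)
    # Shift every value so that the sample mean equals the hypothesized mean.
    shift = null_mean - mean(sample)
    shifted = [x + shift for x in sample]
    n = len(sample)
    # Draw samples of the original size, with replacement, and record each mean.
    return [mean(rng.choices(shifted, k=n)) for _ in range(n_samples)]

# Hypothetical data, testing H0: mu = 98.6
body_temps = [98.2, 97.9, 98.6, 98.4, 98.0, 98.8, 97.6, 98.3]
dist = randomization_means_one_sample(body_temps, null_mean=98.6)
print(round(mean(dist), 2))  # expected to land close to the null value
```

Because the sample is shifted before resampling, the resulting distribution is centered near the hypothesized mean rather than near the observed sample mean.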
Randomization Test for One Proportion
In StatKey there is only one method for conducting a randomization test for one proportion. Samples of the same size as the original sample are drawn from a theoretical distribution with a proportion equal to the hypothesized population proportion (i.e., the value in the null hypothesis). The sample proportion in each randomization sample is recorded on the randomization distribution dotplot.
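As a rough sketch of this procedure, each randomization sample can be simulated as a set of independent success/failure draws with the hypothesized proportion. The function name `randomization_proportions` and the values `n=50` and `null_p=0.5` are hypothetical, and this is not StatKey's implementation.

```python
import random

def randomization_proportions(n, null_p, n_samples=5000, seed=0):
    """Return sample proportions simulated under the null proportion."""
    rng = random.Random(seed)
    props = []
    for _ in range(n_samples):
        # Each case is a success with probability null_p.
        successes = sum(rng.random() < null_p for _ in range(n))
        props.append(successes / n)
    return props

# Hypothetical test of H0: p = 0.5 with an original sample of size 50
props = randomization_proportions(n=50, null_p=0.5)
print(round(sum(props) / len(props), 2))  # expected to land close to 0.5
```

Note that the observed data enter only through the sample size; the draws themselves come from the null proportion.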
Randomization Test for a Difference in Means
StatKey offers three randomization methods when comparing the means of two independent groups: reallocate groups, shift groups, and combine groups. In this course we will always be using the default method of reallocating groups. For larger sample sizes, results will be relatively consistent across the three methods. In practice, the method that is most appropriate may depend on the design of the research study. For example, the reallocation method may be preferred in studies where participants were randomly assigned to different conditions.
- Reallocate Groups
This is the default method in StatKey. In this course, this is always the method that will be used. Using the reallocate method, all cases in the samples are combined and then randomly assigned to the two groups with the same sample sizes as the original samples. This is done without replacement. The mean of each reallocated sample is computed, and the difference between the two reallocated samples' means is recorded on the randomization distribution dotplot.
- Shift Groups
The two groups are shifted until their observed sample means are equal. This is similar to the method used for one sample mean. After the groups are shifted, cases are randomly selected from the first group, with replacement, until a randomization sample of the same size as the first group's original sample is obtained. This procedure is followed for the second group as well. The difference between the mean of the first group and the mean of the second group is recorded on the randomization distribution plot.
- Combine Groups
All cases in the samples are combined and then randomly selected with replacement. Again, the sample sizes for each group will be equal to each group's original sample size. The difference between this combine groups method and the default reallocate groups method is that this method resamples with replacement so an original case can appear more than once in a group, in both groups, or not at all.
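The three methods above differ only in how each pair of randomization samples is generated, which a short Python sketch can make concrete. The function names and the data in `group_a` and `group_b` are hypothetical, and this is an illustration rather than StatKey's code. For the shift method, the sketch assumes the second group is shifted to match the first group's mean; any common value gives the same differences.

```python
import random
from statistics import mean

def reallocate(a, b, rng):
    # Pool all cases, then deal them back out without replacement.
    pooled = a + b
    rng.shuffle(pooled)
    return mean(pooled[:len(a)]) - mean(pooled[len(a):])

def shift_groups(a, b, rng):
    # Assumption: shift group b so its mean equals group a's mean.
    offset = mean(a) - mean(b)
    b_shifted = [x + offset for x in b]
    # Resample each group from itself, with replacement.
    new_a = rng.choices(a, k=len(a))
    new_b = rng.choices(b_shifted, k=len(b))
    return mean(new_a) - mean(new_b)

def combine_groups(a, b, rng):
    # Pool all cases, then draw both groups with replacement from the pool.
    pooled = a + b
    new_a = rng.choices(pooled, k=len(a))
    new_b = rng.choices(pooled, k=len(b))
    return mean(new_a) - mean(new_b)

def randomization_distribution(method, a, b, n_samples=2000, seed=0):
    """Return n_samples differences in means generated by the given method."""
    rng = random.Random(seed)
    return [method(a, b, rng) for _ in range(n_samples)]

# Hypothetical data for two independent groups
group_a = [12.1, 14.3, 11.8, 13.5, 15.0, 12.9]
group_b = [10.2, 11.7, 12.4, 9.8, 11.1]

for method in (reallocate, shift_groups, combine_groups):
    diffs = randomization_distribution(method, group_a, group_b)
    print(method.__name__, round(mean(diffs), 2))  # each expected near 0
```

All three distributions are centered near zero, the value of the difference under the null hypothesis; they differ in spread and shape, most noticeably for small samples.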
Randomization Test for a Difference in Proportions
StatKey offers two randomization methods when comparing the proportions of two independent groups: reallocation and resampling. In this course we will be using the default reallocation method.
- Reallocation
This procedure is the same as the reallocate groups procedure for two group means. All cases are combined and then randomly assigned to the two groups with the same sample sizes as the original samples. This is done without replacement, so the total number of successes across the two randomization groups will always equal the total number of successes in the two original samples. The difference between the two reallocated groups' sample proportions is recorded on the randomization distribution dotplot.
- Resampling
The two groups are combined and the overall observed proportion is computed. Samples of the same size as the original samples are drawn from a theoretical distribution with a proportion equal to the overall observed proportion. The difference between the sample proportions in the two randomization samples is recorded on the randomization distribution dotplot.
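The contrast between the two methods for proportions can be sketched as follows. The function names and the counts (24 successes out of 40 and 20 successes out of 50) are hypothetical, and the code is an illustration rather than StatKey's implementation.

```python
import random

def reallocate_counts(x1, n1, x2, n2, rng):
    # Pool the cases as 1s (successes) and 0s (failures), then deal them
    # back out to the two groups without replacement.
    pooled = [1] * (x1 + x2) + [0] * (n1 + n2 - x1 - x2)
    rng.shuffle(pooled)
    return sum(pooled[:n1]), sum(pooled[n1:])

def reallocate_diff(x1, n1, x2, n2, rng):
    s1, s2 = reallocate_counts(x1, n1, x2, n2, rng)
    return s1 / n1 - s2 / n2

def resample_diff(x1, n1, x2, n2, rng):
    # Draw each group from a theoretical distribution whose proportion is
    # the overall observed proportion.
    p = (x1 + x2) / (n1 + n2)
    s1 = sum(rng.random() < p for _ in range(n1))
    s2 = sum(rng.random() < p for _ in range(n2))
    return s1 / n1 - s2 / n2

# Hypothetical counts: 24 of 40 successes in group 1, 20 of 50 in group 2
rng = random.Random(0)
realloc = [reallocate_diff(24, 40, 20, 50, rng) for _ in range(2000)]
resamp = [resample_diff(24, 40, 20, 50, rng) for _ in range(2000)]
print(round(sum(realloc) / len(realloc), 2))  # expected near 0
print(round(sum(resamp) / len(resamp), 2))    # expected near 0
```

Under reallocation the combined number of successes is fixed at 44 in every randomization sample, while under resampling it varies from sample to sample.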
Randomization Test for a Slope, Correlation
The randomization methods used for testing the slope and correlation are the same, because both procedures involve two quantitative variables. In each case, the observed x and y values are separated and randomly re-paired. The slope or correlation for those new pairs is computed and recorded on the randomization distribution plot. Like the other reallocation methods, this is done without replacement, so each case's x value and y value are each used exactly once.
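One way to sketch this re-pairing is to hold the x values fixed and shuffle the y values, which breaks any association while using every value exactly once. The function names and the `hours` and `scores` data below are hypothetical; this is an illustration, not StatKey's code.

```python
import random
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def randomization_stats(x, y, stat, n_samples=2000, seed=0):
    """Shuffle y against fixed x (without replacement) and record the statistic."""
    rng = random.Random(seed)
    y_shuffled = list(y)  # copy so the original data are not modified
    results = []
    for _ in range(n_samples):
        rng.shuffle(y_shuffled)
        results.append(stat(x, y_shuffled))
    return results

# Hypothetical paired data
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 58, 66, 70, 69, 75]

r_dist = randomization_stats(hours, scores, pearson_r)
b_dist = randomization_stats(hours, scores, slope)
print(round(mean(r_dist), 2), round(mean(b_dist), 2))  # both expected near 0
```

The same shuffled pairs serve for both statistics because the slope is the correlation rescaled by the ratio of the standard deviations of y and x, which shuffling does not change.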
- Online discussion including textbook author Robin Lock
- Lock Morgan, K., Lock, R. H., Frazer Lock, P., Lock, E. F., Lock, D. F. (2014). StatKey: Online tools for bootstrap intervals and randomization tests. Paper presented at the International Conference on Teaching Statistics (ICOTS9). Flagstaff, AZ.