7.2 - General Ideas for Testing Hypotheses
Step 0: Assumptions
- The samples must be independent and random samples.
- If two proportions, then the two groups must consist of categorical responses. If two means, then each group must consist of quantitative responses. If paired, then the sample data must be paired.
- If two proportions, then each sample must produce at least 5 successes and 5 failures. If two means or paired, then each group (two means) or the differences (paired) must come from an approximately normal population distribution. With large samples the Central Limit Theorem relaxes this requirement: if the sample size for each group is at least 30, or if the number of paired differences is at least 30, then normality can be assumed.
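The sample-size conditions above can be checked mechanically before running a test. The following is a minimal sketch; the function names `proportion_counts_ok` and `clt_applies` are hypothetical and chosen only for illustration, and the thresholds (5 and 30) are the ones stated in these notes.

```python
def proportion_counts_ok(successes, n, minimum=5):
    """Return True if a sample has at least `minimum` successes and failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum


def clt_applies(n, threshold=30):
    """Return True if the sample size (or number of paired differences)
    reaches the threshold at which normality can be assumed."""
    return n >= threshold


# Two proportions: both samples must pass the count check.
both_ok = proportion_counts_ok(12, 40) and proportion_counts_ok(3, 40)
print(both_ok)           # second sample has only 3 successes
print(clt_applies(35))   # 35 observations meets the threshold of 30
```

These checks cover only the countable conditions; independence and random sampling have to be judged from how the data were collected.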
Step 1: Write Hypotheses
The null hypothesis is that there is no difference between the two population parameters (no difference between proportions for categorical responses, no difference between means for quantitative responses). The alternative hypothesis may be one-sided or two-sided.
Step 2: Determine test statistic
In each case, test statistic = (sample statistic – 0) / (std. error of the statistic)
- For proportions, the test statistic should be labeled "z"
- For means, the test statistic should be labeled "t"
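As an illustration of the general form (sample statistic – 0) / (std. error), here is a small sketch for the two-proportion and two-mean cases. The function names are hypothetical. Two choices in it are assumptions rather than something stated in this section: the proportion version uses a pooled standard error, and the mean version uses the unpooled ("unequal variances") standard error.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two sample proportions.

    Assumption: the standard error is computed from the pooled proportion.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2 - 0) / se


def two_mean_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for the difference between two sample means.

    Assumption: unpooled standard error (variances not assumed equal).
    """
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2 - 0) / se


z = two_proportion_z(60, 100, 40, 100)   # sample proportions 0.60 vs 0.40
t = two_mean_t(10.0, 2.0, 50, 9.0, 2.0, 50)
print(round(z, 3), round(t, 3))
```

The explicit `- 0` mirrors the formula in the notes: the null hypothesis value for the difference is zero.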
Step 3: Determine p-value
- For proportions, a standard normal distribution is used to find the p-value
- For means, a Student’s t distribution is used to find the p-value. Calculating degrees of freedom for a two-sample t-test depends on whether the two population variances are considered "equal" or "unequal". This concept, along with the calculations for the appropriate degrees of freedom, will be discussed later in these notes.
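For the proportions case, the p-value lookup can be sketched with Python's standard library, which provides the standard normal distribution through `statistics.NormalDist`. The function name `z_p_value` and the labels for the alternative ("two-sided", "greater", "less") are hypothetical conventions chosen for this example. The means case is not shown because it needs a Student's t distribution with the appropriate degrees of freedom, which the standard library does not provide.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """p-value for a z statistic under the standard normal distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


print(z_p_value(1.96))             # close to 0.05 for a two-sided test
print(z_p_value(1.96, "greater"))  # close to 0.025 for a one-sided test
```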
Step 4: Decide between hypotheses
If the p-value is less than or equal to the level of significance α (usually 0.05), decide in favor of the alternative hypothesis. Otherwise, we cannot rule out the null hypothesis as a possibility.
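The decision rule is a single comparison, sketched below. The function name `decide` and the wording of the returned strings are hypothetical; the default of 0.05 is the usual level of significance mentioned above.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: favor the alternative when p-value <= alpha."""
    if p_value <= alpha:
        return "decide in favor of the alternative hypothesis"
    return "cannot rule out the null hypothesis"


print(decide(0.03))        # 0.03 <= 0.05
print(decide(0.20))        # 0.20 > 0.05
print(decide(0.03, 0.01))  # stricter alpha changes the decision
```

Note that a p-value exactly equal to α counts as a decision for the alternative, matching the "less than or equal to" wording.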
Step 5: State a "real world" conclusion.
When we decide in favor of the alternative, we conclude that the difference between the two populations in the value of the parameter is statistically significant. If we cannot reject the null, we have to say that it is possible there is no difference between the two populations.