10: Hypothesis Testing

Lesson Overview

In Lesson 2 we saw the value of random assignment in designed experiments. Random assignment alleviates the bias that might cause a systematic difference between groups unrelated to the treatment itself. Precautions such as blinding, which ensure that subjects are treated the same during the experiment, then leave us with just two possible explanations for any differences seen between groups. Either:

  • the treatment was effective in producing the changes (the research hypothesis), or
  • differences were just the result of the luck of the draw (the null hypothesis).

This shows the importance of addressing the concept of statistical significance. If it is very unlikely that the results of a randomized experiment are just the result of random chance, then we are left with the treatment itself as the probable cause of any relationship seen. Even in an observational study, being able to show that random chance is a poor explanation of the data is still good evidence for a true association in the population (even though it is poor evidence of causality).

This lesson focuses on statistical hypothesis testing. In a significance test, you carry out a probability calculation, assuming the null hypothesis is true, to see whether random chance is a plausible explanation for the data. Let's illustrate the process with an example.

Example 10.1

A penny balanced on its side

Physical theory suggests that when a coin is spun on a table (rather than flipped in the air) the probability it lands heads up is less than 0.5. We are hesitant to believe this without proof.

To test the theory, we carry out an experiment and independently spin a penny 100 times, getting 37 heads and 63 tails. Thus, the observed proportion of heads is 37 / 100 = 0.37.

We have two possible explanations for the data:

Null Hypothesis: The data merely reflect chance variation. The probability of heads when a penny is spun is really p = 0.5.

vs.

Alternative Hypothesis: The probability of heads when a penny is spun is really p < 0.5.

 

A statistical hypothesis test is designed to answer the question: "Does the Null Hypothesis provide a reasonable explanation of the data?"

To answer this question we carry out a probability calculation. First, we can calculate a

Test Statistic = a measure of the difference between the data and what is expected when the null hypothesis is true.

In our example, the null hypothesis says the proportion of heads in 100 spins would closely follow a normal distribution centered at p = 0.5. So, if the null hypothesis is true, we expect the proportion of heads to be one half (0.5), give or take a standard deviation of

\[\sqrt{\frac{0.5(1-0.5)}{100}}=0.05\]
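One way to make this standard deviation concrete is to simulate the null hypothesis directly. The sketch below is an illustration, not part of the lesson's method: it assumes a fair "spin" can be imitated with Python's pseudo-random number generator, and the function name `simulate_proportions`, the number of repeated experiments, and the seed are all arbitrary choices made for this example.

```python
import random
import statistics


def simulate_proportions(n_spins=100, p=0.5, n_experiments=10_000, seed=1):
    """Simulate the proportion of heads in many repeated sets of spins.

    Each experiment imitates n_spins independent spins where the chance
    of heads is p (the null hypothesis value). The seed is fixed only so
    the run is repeatable.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_experiments):
        heads = sum(rng.random() < p for _ in range(n_spins))
        proportions.append(heads / n_spins)
    return proportions


proportions = simulate_proportions()
print("mean of simulated proportions:", round(statistics.mean(proportions), 3))
print("SD of simulated proportions:  ", round(statistics.pstdev(proportions), 3))
```

If the formula is right, the simulated proportions should center near 0.5 with a spread near 0.05, up to simulation noise.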

Further, we can see how unusual our data is if the null hypothesis is true by finding the standard score z for the test statistic and using the normal curve:

\[z = (0.37-0.5)/0.05 = -2.6\]
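The same arithmetic can be written as a short Python sketch. The function names `null_standard_deviation` and `z_score` are made up for this illustration; the numbers are the ones from the penny example (37 heads in 100 spins, null value 0.5).

```python
import math


def null_standard_deviation(p0, n):
    """Standard deviation of a sample proportion when the true proportion is p0."""
    return math.sqrt(p0 * (1 - p0) / n)


def z_score(p_hat, p0, n):
    """Standard score: how many null standard deviations p_hat lies from p0."""
    return (p_hat - p0) / null_standard_deviation(p0, n)


sd = null_standard_deviation(0.5, 100)
z = z_score(37 / 100, 0.5, 100)
print("standard deviation under the null:", round(sd, 3))
print("standard score z:", round(z, 2))
```

A negative z means the observed proportion fell below the null value, which is the direction the alternative hypothesis predicts.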

How unusual is the value we got, assuming the null hypothesis (i.e., the real proportion is 0.5) is true? We know that standard scores of -2.6 or lower happen only about 0.5% of the time. So the null hypothesis provides a poor explanation for our data. This is strong evidence that a spun penny has less than a 50% chance of landing heads.
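The "about 0.5%" figure can be checked numerically. The sketch below computes the lower-tail area of the standard normal curve at z = -2.6 using only the Python standard library. As a cross-check that goes beyond the lesson's normal-curve calculation, it also computes the exact binomial probability of 37 or fewer heads in 100 fair spins. The function names are made up for this example.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def binomial_lower_tail(k, n, p):
    """Exact probability of k or fewer successes in n independent trials."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


p_normal = normal_cdf(-2.6)                  # normal-curve approximation
p_exact = binomial_lower_tail(37, 100, 0.5)  # exact count-based check (an addition)

print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```

The two numbers differ a little because the normal curve only approximates the distribution of a count, but both fall well under 1%, so the conclusion does not depend on which calculation is used.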

Objectives

After successfully completing this lesson, you should be able to:

  • Formulate appropriate null and alternative hypotheses.
  • Identify the type 1 and type 2 errors in the context of the problem.
  • Use the four basic steps to carry out a significance test in some basic situations.
  • Interpret a p-value in terms of the problem.
  • State an appropriate conclusion for a hypothesis test.