Lesson 4: Multiple Testing

Key Learning Goals for this Lesson:
  • Understand the multiple testing problem
  • Interpret a histogram of p-values from independent tests
  • Be familiar with the Bonferroni method
  • Understand false discovery and non-discovery rates
  • Be familiar with some FDR estimation methods

An event that is rare when we have only one opportunity to observe it can become quite common when we observe thousands of trials. For example, when you roll 2 fair dice, double sixes come up only about 1 time in 36. But if you roll 3600 times, you expect about 100 rolls with double sixes.
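The dice arithmetic can be checked with a short Python sketch. The names `p_double_six`, `n_rolls`, and `prob_at_least_one` are illustrative choices made here, and the calculation assumes fair dice and independent rolls.

```python
from fractions import Fraction

# Probability of double sixes on one roll of two fair dice: (1/6) * (1/6) = 1/36.
p_double_six = Fraction(1, 6) ** 2

# Expected number of double sixes in 3600 independent rolls.
n_rolls = 3600
expected_double_sixes = n_rolls * p_double_six


def prob_at_least_one(n):
    """Probability of seeing double sixes at least once in n independent rolls."""
    return 1.0 - float(1 - p_double_six) ** n


print(expected_double_sixes)      # expected count over 3600 rolls
print(prob_at_least_one(1))       # a single roll: rare
print(prob_at_least_one(n_rolls)) # 3600 rolls: essentially certain
```

The same event that is unlikely on any single roll is all but guaranteed to appear somewhere in a long run of rolls.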

The p-value is the probability of obtaining a result at least as extreme as the observed result if the null hypothesis is true. Suppose we accept p < 0.05 as "extreme". If we do 10,000 (independent) tests, and all the null hypotheses are true, we expect about 5% of the tests (i.e. about 500) to have p < 0.05.
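A small simulation can make this concrete. The sketch below is only an illustration under stated assumptions: each test is a two-sided z-test of "mean = 0" on 20 draws from a standard normal distribution with known standard deviation 1, so every null hypothesis is true by construction. The function name `null_p_values` and the parameters `n_obs` and `seed` are hypothetical choices made for this example.

```python
import math
import random


def null_p_values(n_tests, n_obs=20, seed=0):
    """Simulate n_tests independent z-tests in which the null hypothesis is true."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_tests):
        # Data drawn under the null: mean 0, known standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        # z statistic for the sample mean when the standard deviation is 1.
        z = sum(sample) / math.sqrt(n_obs)
        # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return p_values


p_values = null_p_values(10_000)
n_significant = sum(p < 0.05 for p in p_values)
print(n_significant, "of", len(p_values), "true nulls have p < 0.05")
```

The count should land close to 500, varying somewhat with the seed, even though none of the simulated effects is real.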

This is a huge problem in high-throughput analysis, because we are usually doing thousands of tests. We do not want to waste our time following up false positives. But if we use conventional p-value cut-offs, a large number of false positives is inevitable.

This lesson discusses some approaches to correcting our inference methods when we are doing multiple tests.