4.1 - Mistakes in Statistical Testing


The table below shows the possible correct and incorrect outcomes for a single test:

                 Not Significant     Significant
\(H_0\) True     Correct decision    Type I Error
\(H_0\) False    Type II Error       Correct decision

A type I error is called a false discovery, and a type II error is called a false nondiscovery. False discoveries are generally considered more serious than false nondiscoveries, although this is not always the case. Investigators usually follow up on discoveries, so false discoveries can lead to misleading and expensive follow-up studies. Nondiscoveries, by contrast, are usually abandoned, so false nondiscoveries can mean missing potentially important results.

In high-throughput studies we typically test each of our m features individually, leading to the following table of counts:


                 Not Significant    Significant    Total
\(H_0\) True     U                  V              \(m_0\)
\(H_0\) False    T                  S              \(m-m_0\)
Total            W                  R              \(m\)


The total number of errors is T + V.

The false discovery proportion (FDP) is V / R.
The false discovery rate (FDR) is the expected value of V / R, given that R ≠ 0.

Similarly, the false nondiscovery proportion (FNP) is T / W.
The false nondiscovery rate (FNR) is the expected value of T / W, given that W ≠ 0.

\(\pi_0=m_0 / m\) is the proportion of null tests.
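
As a small illustration of these definitions, the following Python sketch tallies U, V, T, and S for a made-up set of ten tests and then computes the FDP, the FNP, and \(\pi_0\). The function names and the example data are hypothetical. In a real study the truth of each null hypothesis is unknown, so these quantities can only be computed directly in a simulation such as this one.

```python
def outcome_counts(is_null, is_significant):
    """Tally the four cells (U, V, T, S) of the multiple-testing table."""
    U = V = T = S = 0
    for null, sig in zip(is_null, is_significant):
        if null and not sig:
            U += 1   # true null, not significant (correct)
        elif null and sig:
            V += 1   # true null, significant (false discovery)
        elif not null and not sig:
            T += 1   # false null, not significant (false nondiscovery)
        else:
            S += 1   # false null, significant (correct)
    return U, V, T, S


def error_proportions(U, V, T, S):
    """Return (FDP, FNP); a proportion is None when its denominator is 0."""
    R = V + S        # total significant
    W = U + T        # total not significant
    fdp = V / R if R != 0 else None
    fnp = T / W if W != 0 else None
    return fdp, fnp


# Hypothetical example: 10 features, the first 6 of which are truly null.
is_null = [True] * 6 + [False] * 4
is_significant = [False, False, False, False, False, True,
                  False, True, True, True]

U, V, T, S = outcome_counts(is_null, is_significant)
fdp, fnp = error_proportions(U, V, T, S)
pi0 = sum(is_null) / len(is_null)   # m_0 / m

print("U, V, T, S =", (U, V, T, S))   # (5, 1, 1, 3)
print("FDP =", fdp)                   # 1 / 4 = 0.25
print("FNP =", fnp)                   # 1 / 6
print("pi_0 =", pi0)                  # 6 / 10 = 0.6
```

The FDP and FNP computed here describe one realized set of tests; the FDR and FNR are their expected values over repetitions of the study.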

Before 1995, the objective of multiple testing correction was to control Pr(V > 0), the so-called Family-Wise Error Rate (FWER).

The problem is that, for any fixed cut-off one would use to ascertain statistical significance, Pr(V > 0) grows as \(m_0\) grows.
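
To see how quickly this happens, consider a minimal sketch under a simplifying assumption that is not part of the text above: the \(m_0\) null tests are independent and each is declared significant with probability exactly alpha (the per-test cut-off). In that case Pr(V > 0) = 1 - (1 - alpha)^\(m_0\). The function name below is hypothetical.

```python
def fwer_independent(m0, alpha=0.05):
    """Pr(V > 0) for m0 independent true-null tests, each at level alpha.

    Assumes independence and an exact per-test false positive
    probability of alpha; real high-throughput data are often correlated.
    """
    return 1.0 - (1.0 - alpha) ** m0


# Pr(V > 0) for a fixed cut-off of 0.05 as the number of null tests grows.
for m0 in (1, 10, 100, 1000):
    print(m0, round(fwer_independent(m0), 4))
```

With a cut-off of 0.05, ten independent null tests already give roughly a 40% chance of at least one false discovery, and with one hundred the chance exceeds 99%.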