4.1 - Mistakes in Statistical Testing

The table below shows the possible correct and incorrect outcomes for a single test:
| | Not Significant | Significant |
| \(H_0\) True | Correct decision | Type I Error |
| \(H_0\) False | Type II Error | Correct decision |
A type I error is called a false discovery, and a type II error is called a false nondiscovery. False discoveries are generally considered more serious than false nondiscoveries, although this is not always the case. Investigators usually follow up on discoveries, so false discoveries can lead to misleading and expensive follow-up studies. Nondiscoveries, by contrast, are usually abandoned, so false nondiscoveries can mean that potentially important results are missed.
In high-throughput studies we typically test each of our \(m\) features individually, leading to the following table of counts:
| | Not Significant | Significant | Total |
| \(H_0\) True | U | V | \(m_0\) |
| \(H_0\) False | T | S | \(m-m_0\) |
| Total | W | R | \(m\) |
The total number of errors is T + V.
The false discovery proportion FDP is V / R.
The false discovery rate (FDR) is the expected value of V / R, given that R ≠ 0.
Similarly, the false nondiscovery proportion FNP is T / W.
The false nondiscovery rate (FNR) is the expected value of T / W, given that W ≠ 0.
\(\pi_0=m_0 / m\) is the proportion of null tests.
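As a minimal sketch of the definitions above, the following Python snippet computes these quantities from the four cell counts. The function name `error_summaries` and the counts in the example are hypothetical, chosen only for illustration. In a real study U, V, T and S are not observed, because we do not know which null hypotheses are true; only W, R and m are.

```python
def error_summaries(U, V, T, S):
    """Summarise a table of test outcomes (assumes at least one test, m > 0).

    U: true nulls declared not significant (correct)
    V: true nulls declared significant (type I errors)
    T: false nulls declared not significant (type II errors)
    S: false nulls declared significant (correct)
    """
    m = U + V + T + S   # total number of tests
    m0 = U + V          # number of true null hypotheses
    W = U + T           # number declared not significant
    R = V + S           # number declared significant
    return {
        "total_errors": T + V,
        # The proportions are undefined when their denominator is zero.
        "FDP": V / R if R > 0 else None,
        "FNP": T / W if W > 0 else None,
        "pi0": m0 / m,
    }


# Hypothetical counts for m = 1000 features, 900 of which are truly null.
summary = error_summaries(U=850, V=50, T=40, S=60)
print(summary)
```

With these illustrative counts, R = 110 and W = 890, so the FDP is 50 / 110 and the FNP is 40 / 890. The FDR and FNR are the expected values of these proportions over repeated experiments, so they cannot be read off a single table.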
Before 1995, the objective of multiple testing correction was to control Pr(V > 0), the so-called Family-Wise Error Rate (FWER).
The problem is that, as \(m_0\) grows, so does Pr(V > 0) for any fixed per-test cut-off used to declare statistical significance.
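To see how quickly Pr(V > 0) grows, consider a simplified setting that is an assumption of this sketch rather than something stated above: the \(m_0\) null tests are independent and each is declared significant with probability exactly \(\alpha\). Then Pr(V > 0) = \(1-(1-\alpha)^{m_0}\). The function name `fwer_independent` is hypothetical.

```python
def fwer_independent(m0, alpha=0.05):
    """Pr(V > 0) for m0 true nulls, each tested at level alpha.

    Assumes the m0 tests are independent and each has type I error
    probability exactly alpha, so Pr(V = 0) = (1 - alpha) ** m0.
    """
    return 1.0 - (1.0 - alpha) ** m0


# Family-wise error rate at a fixed per-test cut-off of 0.05.
for m0 in (1, 10, 100, 1000):
    print(f"m0 = {m0:>4}: Pr(V > 0) = {fwer_independent(m0):.4f}")
```

Under these assumptions, a single test has a 5% chance of a false discovery, while ten true nulls already give about a 40% chance of at least one. With correlated tests the exact values differ, but the qualitative growth with \(m_0\) remains.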