Example 1: Accuracy of Prostate Cancer Screening Tests
Two methods are commonly used to screen for prostate cancer: the PSA blood test and the digital rectal exam (DRE). In this example, researchers used an abnormal PSA as an indicator for prostate cancer, using a cut-off of 4.0 micrograms per milliliter. The test had a sensitivity of 0.67, or 67%. In other words, two-thirds of the men who truly had prostate cancer were detected, and one-third of the cases went undetected. On the other hand, among men without prostate cancer, the test was negative in almost all cases: the specificity was 97%.
Test Characteristics of PSA and DRE

Test | Sensitivity | Specificity | Positive Predictive Value |
---|---|---|---|
Abnormal PSA (> 4.0 micrograms/milliliter) | 0.67 | 0.97 | 0.43 |
Abnormal DRE | 0.50 | 0.94 | 0.24 |

Adapted from: Kramer BS, Brown ML, Prorok PC, Potosky AL, Gohagan JK. Prostate cancer screening: what we know and what we need to know. Ann Int Med 1993;119:914-923
PSA, then, was very good at giving an accurate 'all clear' to men without disease, but a positive result was not very reliable: only 43% of the men with a positive result actually had prostate cancer. After a positive result, the clinician will perform a follow-up procedure.
The other common test is the digital rectal exam (DRE). DRE has a low sensitivity of 50%, and its specificity (94%) is also lower than that of PSA. Its positive predictive value (24%) is even worse than that of PSA.
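A test with 97% specificity can still have a positive predictive value of only 43% when the disease is uncommon in the screened population. The sketch below uses Bayes' rule to relate sensitivity, specificity, prevalence, and positive predictive value. The function names `ppv` and `implied_prevalence` are illustrative, and the prevalence it backs out of the rounded table values is only a rough estimate, not a figure reported in the source.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(sensitivity, specificity, ppv_value):
    """Prevalence that reproduces a given PPV (Bayes' rule solved for prevalence)."""
    num = ppv_value * (1 - specificity)
    den = sensitivity * (1 - ppv_value) + num
    return num / den


# Rounded values taken from the table; the results are approximate.
psa_prev = implied_prevalence(0.67, 0.97, 0.43)
dre_prev = implied_prevalence(0.50, 0.94, 0.24)
print(f"PSA implied prevalence: {psa_prev:.3f}")
print(f"DRE implied prevalence: {dre_prev:.3f}")
```

Both estimates land between 3% and 4%, which is consistent with a low-prevalence screening population: most positives come from the large group of men without disease, even though only a small fraction of that group tests positive.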
Example 2: Accuracy of One or Two INDEPENDENTLY Administered Prostate Cancer Screening Tests
What if you used two tests? Let's add these data to the table:
Test Characteristics of PSA and DRE

Test | Sensitivity | Specificity | Positive Predictive Value |
---|---|---|---|
Abnormal PSA (> 4.0 micrograms/milliliter) | 0.67 | 0.97 | 0.43 |
Abnormal DRE | 0.50 | 0.94 | 0.24 |
Abnormal DRE or Abnormal PSA | 0.84 | 0.92 | 0.28 |
Abnormal DRE and Abnormal PSA | 0.34 | 0.995 | 0.49 |

Adapted from: Kramer BS, Brown ML, Prorok PC, Potosky AL, Gohagan JK. Prostate cancer screening: what we know and what we need to know. Ann Int Med 1993;119:914-923
If a man is counted as positive when either test is abnormal, sensitivity increases to 84%, but specificity falls to 92%.
What if both tests must be abnormal? Does using a higher standard for declaring disease result in lower sensitivity? It does: sensitivity drops to 34% while specificity rises to 99.5%. This approach also produces the highest positive predictive value, 49%.
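The direction of these changes follows from how two independent tests combine. The sketch below assumes the two tests err independently of each other within the diseased group and within the non-diseased group; that assumption and the function names `combine_either` and `combine_both` are illustrative and are not taken from the source.

```python
def combine_either(sens_a, spec_a, sens_b, spec_b):
    """Positive if either test is abnormal (assumes conditional independence)."""
    # A case is missed only if both tests miss it.
    sensitivity = 1 - (1 - sens_a) * (1 - sens_b)
    # A healthy person is cleared only if both tests are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


def combine_both(sens_a, spec_a, sens_b, spec_b):
    """Positive only if both tests are abnormal (assumes conditional independence)."""
    # A case is detected only if both tests detect it.
    sensitivity = sens_a * sens_b
    # A healthy person is falsely flagged only if both tests are falsely positive.
    specificity = 1 - (1 - spec_a) * (1 - spec_b)
    return sensitivity, specificity


psa = (0.67, 0.97)  # (sensitivity, specificity) from the table
dre = (0.50, 0.94)

either = combine_either(*psa, *dre)
both = combine_both(*psa, *dre)
print(f"either abnormal: sensitivity={either[0]:.3f}, specificity={either[1]:.3f}")
print(f"both abnormal:   sensitivity={both[0]:.3f}, specificity={both[1]:.3f}")
```

Under this assumption the "either" rule gives roughly 0.835 and 0.912, and the "both" rule roughly 0.335 and 0.998. These are close to the table's 0.84/0.92 and 0.34/0.995 but not identical. Small differences are expected because the inputs are rounded and the independence assumption may not hold exactly in the study data.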
If the goal is to have the highest positive predictive value, the best choice is to require both tests to be abnormal. However, what is the consequence of letting two-thirds of prostate cancer cases go undiagnosed? PSA is obtained from a simple blood draw. DRE is uncomfortable but temporary, so there is not much long-term consequence from either of the test procedures. Abnormal test results, however, are often followed by biopsies, which are costly, uncomfortable, and have significant morbidity associated with them. We don't want to put men through the follow-up unnecessarily. Prostate cancer is often slow-growing and is not communicable. If the consequences of watchful waiting are not great, we may be willing to let a sizeable proportion of men who actually have prostate cancer go undiagnosed. The choice of tests and how to use them for screening purposes is heavily influenced by the consequences of making a wrong decision.
The PSA and DRE examples used two independently administered tests. What if the tests are performed in series?
Example 3: Accuracy of Two Screening Tests Administered in Series
Consider a population in which there are 500 diabetic individuals among a total population of 10,000, i.e. a 5% prevalence. Suppose you administer a non-fasting blood sugar test with a sensitivity of 350/500 (70%) and a specificity of 7600/9500 (80%).
Screening Test A

Blood Sugar | Diabetes + | Diabetes - | Total | |
---|---|---|---|---|
+ | 350 | 1900 | 2250 | Sensitivity = 350/500 = 70% |
- | 150 | 7600 | 7750 | Specificity = 7600/9500 = 80% |
Total | 500 | 9500 | 10000 | |

Screening Test B

GTT | Diabetes + | Diabetes - | Total | |
---|---|---|---|---|
+ | 315 | 190 | 505 | Sensitivity = 315/350 = 90% |
- | 35 | 1710 | 1745 | Specificity = 1710/1900 = 90% |
Total | 350 | 1900 | 2250 | Prevalence = 350/2250 = 15.6% |

Comparison |
---|
Net Sensitivity = 315/500 = 63% |
Net Specificity = (7600 + 1710)/9500 = 98% |
Prevalence = 500/10000 = 5% |

Stop and Think!
Take a look at Screening Test B: of the 2250 people who tested positive on the first test, 350 have the disease. The second test has much higher sensitivity and specificity than the first (90% and 90%, versus 70% and 80%). To perform a glucose tolerance test (GTT), the subject fasts overnight, comes to the clinic, and drinks a glucose solution in an amount determined by their weight; blood is then drawn at regular intervals and assessed for evidence of regulation of blood glucose. A GTT requires considerably greater resources than drawing blood for a blood sugar test, so it makes sense to put these two tests in series, with the GTT following the blood sugar test.
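The per-test figures quoted above can be checked directly from the cell counts of each 2x2 table. This is a minimal sketch; the helper name `two_by_two` and its argument names are illustrative choices, not part of the original example.

```python
def two_by_two(tp, fp, fn, tn):
    """Summary measures from the four cells of a 2x2 screening table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),    # diseased who test positive
        "specificity": tn / (tn + fp),    # non-diseased who test negative
        "ppv": tp / (tp + fp),            # test-positives who are diseased
        "prevalence": (tp + fn) / total,  # diseased in the group tested
    }


# Test A (blood sugar) is applied to all 10,000 people.
test_a = two_by_two(tp=350, fp=1900, fn=150, tn=7600)
# Test B (GTT) is applied only to the 2,250 who were positive on Test A.
test_b = two_by_two(tp=315, fp=190, fn=35, tn=1710)

for name, result in (("A", test_a), ("B", test_b)):
    summary = ", ".join(f"{key}={value:.3f}" for key, value in result.items())
    print(f"Test {name}: {summary}")
```

Note that the positive predictive value of Test A (350/2250) is exactly the prevalence in the group that goes on to Test B: the first test enriches the second-stage population from 5% to about 15.6% diseased.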
We started with 10,000 people, 500 of whom have diabetes. After both tests, 315 of those 500 are labeled positive, so the net sensitivity for the series is 315/500, or 63%. Net specificity counts the 7600 people correctly identified as negative by the first test plus the 1710 who were ruled out by the second test, divided by the 9500 people without diabetes: (7600 + 1710)/9500, or 98%.
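The same net figures can be obtained from the individual sensitivities and specificities without counting cells. The sketch below assumes the second test performs the same way among first-test positives as it would in general, which is how the example's counts are constructed. The function name `series_net` is illustrative.

```python
def series_net(sens_a, spec_a, sens_b, spec_b):
    """Net accuracy when test B is given only to those positive on test A."""
    # A case is labeled positive only if both tests detect it.
    net_sensitivity = sens_a * sens_b
    # A non-case is cleared by test A, or flagged by A and then cleared by B.
    net_specificity = spec_a + (1 - spec_a) * spec_b
    return net_sensitivity, net_specificity


net_sens, net_spec = series_net(sens_a=0.70, spec_a=0.80, sens_b=0.90, spec_b=0.90)
print(f"net sensitivity = {net_sens:.2f}")
print(f"net specificity = {net_spec:.2f}")

# Cross-check against the cell counts in the tables.
print(f"from counts: {315 / 500:.2f}, {(7600 + 1710) / 9500:.2f}")
```

Both routes give a net sensitivity of 0.63 and a net specificity of 0.98. Writing the calculation in terms of rates makes it easy to see that testing in series can only lower sensitivity and raise specificity relative to the first test alone.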
In this population with a prevalence of 5%, the net specificity of the two tests in series (98%) is much higher than the specificity of the first test alone (80%), at the cost of a lower net sensitivity (63% versus 70%). A significant advantage is gained from performing the simple test up front, identifying the individuals who test positive, and following up only in this group with the more complex and costly test.