The goal of this section is to help the reader interpret diagnostic and screening tests. The first example presents a relatively straightforward case; it involves a screening test with a binary (yes/no) outcome, a disease that the patient either has or does not have, and a patient about whom nothing is known at the time of screening. The subsequent discussions examine complicating features that often occur in ophthalmic practice and in research. The reader should consider these complicating features when evaluating results of diagnostic and screening tests.
The Straightforward Case
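A fictitious study evaluates use of a simple, quick strabismus test in 100 children, comparing it to a longer, more expensive full examination with a pediatric ophthalmologist as the gold standard. The study finds that 30 children have strabismus and 70 do not. After the quick screening test, 60 children have abnormal results and 40 children have normal results. Table 1-1 shows the screening test result data. In the equations that follow, a, b, c, and d refer to the cells of Table 1-1: a is the number of children with strabismus and abnormal test results (true positives), b is the number without strabismus but with abnormal results (false positives), c is the number with strabismus but normal results (false negatives), and d is the number without strabismus and with normal results (true negatives). The screening test performance is described as follows: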
- Sensitivity: The test correctly identifies 20 of every 30 children who have strabismus (67%). The equation is a/(a + c). The denominator, (a + c), represents all of the test subjects who have the disease (strabismus).
- Specificity: The test correctly identifies 30 of every 70 children who do not have strabismus (43%). The equation is d/(b + d). The denominator, (b + d), represents all of the test subjects who do not have the disease (normal).
- Positive predictive value (PPV): If a child’s test results are abnormal, there is only a 1 in 3 chance (20/60) that the child actually has strabismus (33%). The equation is a/(a + b). The denominator, (a + b), represents all of the subjects with abnormal test results.
- Negative predictive value (NPV): If a child’s test results are normal, there is a 3 in 4 chance (30/40) that the child is actually disease-free (75%). The equation is d/(d + c). The denominator, (d + c), represents all of the test subjects with normal test results.
- Accuracy: The screening test is correct in 50 of 100 cases (50%). The equation to determine accuracy is (a + d)/(a + b + c + d).
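The five measures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original study: the function name screening_metrics is illustrative, and the cell counts (a = 20, b = 40, c = 10, d = 30) are the ones implied by the fractions quoted in the list.

```python
def screening_metrics(a, b, c, d):
    """Compute screening-test measures from the four cells of a 2 x 2 table.

    a = disease present, test abnormal (true positives)
    b = disease absent,  test abnormal (false positives)
    c = disease present, test normal   (false negatives)
    d = disease absent,  test normal   (true negatives)
    """
    return {
        "sensitivity": a / (a + c),   # among children with the disease
        "specificity": d / (b + d),   # among children without the disease
        "ppv": a / (a + b),           # among abnormal test results
        "npv": d / (d + c),           # among normal test results
        "accuracy": (a + d) / (a + b + c + d),
    }


# Cell counts implied by the clinic example (assumed from the quoted fractions).
clinic = screening_metrics(a=20, b=40, c=10, d=30)

for name, value in clinic.items():
    print(f"{name}: {value:.0%}")
```

Run with the clinic counts, the sketch prints the same rounded percentages as the list: 67%, 43%, 33%, 75%, and 50%.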
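Table 1-1 Results for Strabismus Screening Test in Clinic

                          Strabismus present   Strabismus absent   Total
Screening test abnormal   a = 20               b = 40              60
Screening test normal     c = 10               d = 30              40
Total                     30                   70                  100

Table 1-2 Results for Strabismus Screening Test in Shopping Center

                          Strabismus present   Strabismus absent   Total
Screening test abnormal   a = 2                b = 58              60
Screening test normal     c = 1                d = 39              40
Total                     3                    97                  100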
Sensitivity is the percentage of people with the disease of interest who have abnormal test results, and specificity is the percentage of disease-free people who have normal results. It is important to remember that neither sensitivity nor specificity takes into account the prevalence of disease in the study population.
Table 1-2 illustrates the performance of the hypothetical strabismus test if it yields the same results (60 children with abnormal test results and 40 children with normal test results) when performed in a shopping center where the prevalence of strabismus is only 3% (much lower than in the situation previously discussed). The sensitivity is still 67%, and the specificity is about the same, at 40%. However, because of the high number of falsely abnormal results, 58 children without disease and only 2 children who truly have strabismus would be referred for complete examinations. In this example, the PPV is only 3% (2/60). The NPV is 98% (39/40). Because of the low prevalence of strabismus in this setting, most children whose test results were abnormal would actually be disease-free. This increases the costs of unnecessary follow-up testing and increases anxiety for the parents. Clearly, the prevalence of disease in the population of interest and the screening test’s PPV and NPV should be considered before the test is used for screening a population.
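The dependence of PPV and NPV on prevalence can be illustrated numerically. The sketch below is an illustration only: it assumes the sensitivity and specificity stay fixed at their clinic values (20/30 and 30/70), which is an idealization, since the shopping center specificity in the text is about 40% rather than 43%. The function name predictive_values and the intermediate prevalence of 10% are hypothetical choices made for the example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed fixed test characteristics, taken from the clinic example.
sens, spec = 20 / 30, 30 / 70

# 30% is the clinic prevalence, 3% the shopping center; 10% is an added midpoint.
for prevalence in (0.30, 0.10, 0.03):
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At 30% prevalence the sketch reproduces the clinic PPV and NPV (33% and 75%). At 3% prevalence the PPV falls to roughly 3.5% while the NPV rises to about 98%, in line with the shopping center figures; the small difference from 2/60 arises because the whole-number counts in the text imply a slightly lower specificity.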
Choosing a gold standard is a key aspect of conducting a diagnostic testing study. The reader of such a study should ascertain whether the gold standard examiner (in this case a pediatric ophthalmologist) was masked to the results of the strabismus test; if not, confirmatory bias may have been introduced, which can artificially inflate the apparent diagnostic precision of the screening test. The gold standard should also have been previously published and accepted by contemporaneous experts. Finally, the gold standard should be repeatable under the same conditions; for example, would the pediatric ophthalmologist come to the same diagnosis (strabismus vs. no strabismus) on examining the child a second time? In conclusion, the gold standard should be scrutinized for its applicability to the clinical situation.
Excerpted from BCSC 2020-2021 series: Section 1 - Update on General Medicine. For more information and to purchase the entire series, please visit https://www.aao.org/bcsc.