Detecting Glaucomatous Optic Neuropathy via Deep Learning
Ophthalmology, August 2018
Li et al. developed a deep learning algorithm that classifies glaucomatous optic neuropathy (GON) from color fundus photographs and evaluated its ability to detect the disease. They found the system to be highly sensitive and specific for detecting referable GON.
For this study, the authors used 48,116 fundus photographs to create and evaluate a new deep learning algorithm. Twenty-one trained ophthalmologists graded the photographs as unlikely, suspect, or certain GON. First, each image was assigned randomly to a single ophthalmologist and subsequently to additional graders until 3 consistent grades were obtained. The consensus grade was considered the conclusive grade for the image.
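The grading procedure described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `consensus_grade` is hypothetical, and it assumes that "3 consistent grades" means three matching grades from any of the graders who have seen the image, not necessarily three in a row.

```python
from collections import Counter
from typing import Iterable, Optional

# The three grade labels named in the summary.
GRADES = ("unlikely", "suspect", "certain")


def consensus_grade(grades: Iterable[str], required: int = 3) -> Optional[str]:
    """Return the first grade to collect `required` matching votes.

    `grades` is the sequence of grades in the order graders were assigned.
    Returns None if no grade has reached the threshold yet, meaning the
    image would go to another grader.
    Assumption: matching grades need not be consecutive.
    """
    counts: Counter = Counter()
    for grade in grades:
        if grade not in GRADES:
            raise ValueError(f"unknown grade: {grade!r}")
        counts[grade] += 1
        if counts[grade] == required:
            return grade
    return None


# Hypothetical image: the second grader disagrees, so a fourth is needed.
print(consensus_grade(["suspect", "unlikely", "suspect", "suspect"]))
```

Under this reading, an image with two graders in disagreement simply stays in the queue until one grade accumulates three votes.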
Referable GON was defined as suspect or certain GON having a vertical cup-to-disc ratio of ≥0.7 and other typical traits of GON. A separate validation dataset of 8,000 fully gradable fundus photographs was used to test the algorithm’s performance. Main outcome measures were area under the receiver operating characteristic curve (AUC), sensitivity, and specificity.
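For readers less familiar with the outcome measures named above, the following minimal Python sketch shows how each is computed. The function names and every number in the usage lines are hypothetical and chosen only for illustration; they are not the study's data. The AUC function uses the pairwise-comparison definition (the probability that a randomly chosen positive image scores higher than a randomly chosen negative one), which is one standard way to compute it and is not necessarily how the authors did.

```python
from typing import Sequence


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased eyes the classifier flags: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of healthy eyes the classifier clears: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    This is quadratic in the number of images, which is fine for a sketch.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores, not taken from the study.
print(sensitivity(true_pos=90, false_neg=10))
print(specificity(true_neg=80, false_pos=20))
print(auc([0.9, 0.8], [0.1, 0.8]))
```

Sensitivity and specificity depend on the decision threshold applied to the classifier's score, whereas AUC summarizes performance across all thresholds, which is why studies of this kind typically report all three.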
In the validation dataset, the deep learning system achieved an AUC of 0.986, sensitivity of 95.6%, and specificity of 92.0%. False-negative grading (n = 87) of GON was most likely to occur with coexisting eye conditions (n = 44, 50.6%), particularly pathologic or high myopia (n = 37, 42.6%). The most common reason for false-positive grading (n = 480) was the presence of other eye conditions (n = 458, 95.4%). False-positive misclassification occurred in 22 eyes (4.6%) with a normal-appearing fundus.
Nearly all of the false-positive results in this study stemmed from abnormalities unrelated to GON, and more than half of the false-positive eyes had large cupping that required further investigation. The algorithm’s accuracy could be improved by supplementing the images with more real-world patient data so that the classification system mirrors the ground truth as closely as possible. Further research is needed to explore the utility of the algorithm for different populations and ophthalmic conditions. (Also see related commentary by Donald C. Hood, PhD, in the same issue.)