  • Using Federated Learning to Standardize ROP Diagnosis

    By Jean Shaw
    Selected by Andrew P. Schachat, MD

    Journal Highlights

    Ophthalmology Retina, August 2022

    Federated learning (FL) is an approach used to train deep learning (DL) models across institutions without sharing data between them. Hanif et al. set out to determine whether FL could be used to delineate institutional differences in the diagnosis of retinopathy of prematurity (ROP). They found differences in the clinical diagnoses of plus disease and overall levels of ROP severity between institutions. For this informatics study, a total of 867 patients (1,686 eyes) across seven institutions were represented by the 5,245 wide-angle retinal images from neonatal ICUs of the participating institutions. The images were labeled with the clinical diagnoses of plus disease (plus, preplus, and no plus), which were documented in the chart, and a reference standard diagnosis (RSD) was determined by three graders and the clinical diagnosis.
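
    To make the idea of training across institutions without sharing data concrete, here is a minimal sketch of one aggregation round. It assumes federated averaging, a common FL scheme; this summary does not say which aggregation method the study used, and the function name, parameter vectors, and site sizes below are hypothetical. Each site trains locally and sends only its model parameters, never its images.

```python
def federated_average(site_params, site_sizes):
    """Combine per-site parameter vectors into one global vector.

    Assumption: plain federated averaging, where each site's parameters
    are weighted by its number of local training samples. Only the
    parameters leave a site; the underlying images stay local.
    """
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need one sample count per site")
    n_params = len(site_params[0])
    if any(len(p) != n_params for p in site_params):
        raise ValueError("all sites must share one model shape")
    total = sum(site_sizes)
    return [
        sum(params[i] * size for params, size in zip(site_params, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical example: two sites, a two-parameter model.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

    In a full training run, the averaged parameters would be sent back to every site for another round of local training, and the cycle would repeat.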

    The researchers trained a DL model for plus disease classification, using the clinical labels. The model's three predicted class probabilities (plus, preplus, and no plus) were then converted into a vascular severity score (VSS) for each eye exam, as well as an “institutional VSS” (for the latter, the average of the VSS values assigned to patients’ worse eyes at each examination was calculated for each institution). Demographics, clinical diagnoses of plus disease, and institutional scores were compared using multiple analytic methods.
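
    The two scores described above can be sketched as follows. This summary does not give the conversion formula, so the code assumes a probability-weighted average on a 1-to-9 scale (weights of 1, 5, and 9 for no plus, preplus, and plus). It also assumes that the "worse eye" is the eye with the higher VSS. All names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Assumed class weights (1-to-9 scale); not stated in this summary.
ASSUMED_WEIGHTS = {"no_plus": 1.0, "preplus": 5.0, "plus": 9.0}


def vascular_severity_score(probs):
    """Collapse three class probabilities into a single severity score."""
    if set(probs) != set(ASSUMED_WEIGHTS):
        raise ValueError("expected probabilities for no_plus, preplus, plus")
    if abs(sum(probs.values()) - 1.0) > 1e-6:
        raise ValueError("class probabilities must sum to 1")
    return sum(ASSUMED_WEIGHTS[label] * p for label, p in probs.items())


def institutional_vss(exams):
    """Average the worse-eye VSS over all examinations at each institution.

    exams: iterable of (institution, right_eye_vss, left_eye_vss) tuples,
    one tuple per examination. "Worse eye" is taken to be the higher VSS.
    """
    worse_eye_scores = defaultdict(list)
    for institution, right_vss, left_vss in exams:
        worse_eye_scores[institution].append(max(right_vss, left_vss))
    return {inst: mean(scores) for inst, scores in worse_eye_scores.items()}


# Hypothetical example: one eye exam, then three examinations at two sites.
score = vascular_severity_score({"no_plus": 0.5, "preplus": 0.5, "plus": 0.0})
by_site = institutional_vss([("A", 2.0, 4.0), ("A", 6.0, 1.0), ("B", 1.0, 1.0)])
```

    Under these assumptions, an eye judged equally likely to be no plus or preplus scores 3.0, and site A's institutional VSS is the mean of its worse-eye scores (4.0 and 6.0).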

    Overall, the proportion of patients who developed plus disease was 8.1% by clinical diagnosis and 8.2% by RSD. At the institutional level, however, there were significant differences between the clinical diagnosis of plus and an RSD of plus, ranging from 0% at one site to 8.9% at another. For preplus disease, the corresponding difference ranged from 2.08% at one site to 15.6% at another. Moreover, the researchers found that the institutional VSS varied significantly by site.

    These findings suggest variability in diagnostic paradigms among clinicians, the researchers said, and they suggested that an FL approach holds promise for objectively assessing such variability as well as any differences in disease severity between individual institutions. (Also see related commentary by Daniel Shu Wei Ting, MD, PhD, in the same issue.)

    The original article can be found here.