Formulating the clinical question is the first step in resolving a diagnostic or management issue. Examples of clinical questions in ophthalmology include: What are the results of minimally invasive glaucoma surgeries in patients with low-pressure glaucoma versus higher-pressure glaucoma? Do racial and ethnic minority populations in the United States have a higher risk of proliferative vitreoretinopathy after pars plana vitrectomy? What is the expected survival of an endothelial graft in a patient with Fuchs dystrophy?
Clinicians can use various sources of information to research answers to clinical questions, including general textbooks on ophthalmology, review journals on specific subjects (eg, Survey of Ophthalmology [www.surveyophthalmol.com]), and educational material from the American Academy of Ophthalmology ([AAO]; www.aao.org/clinical-education) (eg, Preferred Practice Pattern guidelines, Focal Points modules). In addition, clinicians can use the Cochrane Library (www.cochranelibrary.com) to access high-quality meta-analyses regarding specific management issues (eg, surgery for nonarteritic ischemic optic neuropathy, intervention for involutional lower-eyelid ectropion).
Critical Reading of Studies
Before committing time to reading a published study, the clinician should review its abstract to ascertain whether the study addresses the question of interest. After reviewing the abstract, the clinician should read the rest of the study critically to determine the characteristics of the study population, recruitment strategy, sample size, intervention, outcomes of interest, and statistical methods used, and thus evaluate whether the study is valid and applicable to the clinical question.
Are the study population and recruitment strategy applicable to my patients?
Understanding a study’s population and its recruitment strategy is key to understanding the setting of the study. Was the trial clinic-based, multicenter, or community-based? In therapeutic trials, the inclusion and exclusion criteria describe the characteristics of those who were or were not treated with an intervention. Specific patient groups may have been excluded because they were considered a vulnerable population. For example, most trials of ocular hypotensive drugs exclude children and pregnant women; as a result, there are minimal data on the safety and efficacy of most ocular hypotensive agents in these 2 groups of patients. Thus, if a clinician wants to do research before deciding whether to use a specific ocular hypotensive agent in a pregnant woman or in a child, most of the evidence can be found only in individual case reports or retrospective case series.
The next step in evaluating a study is exploring whether the study created selection bias by assigning the intervention to certain participants. Was the intervention randomly assigned? Was the treated group comparable to the control group? The purpose of randomly assigning an intervention to participants is to minimize bias on the part of the investigators and the patient. For example, an investigator may create selection bias by inadvertently enrolling less complex patients for a new surgery, potentially biasing the results toward better outcomes for that surgery.
Whether the selection was biased or random can be determined by examining whether the participants assigned to each group are similar in the characteristics that may affect the outcome of interest. For example, when evaluating a study assessing the effect of laser treatment versus anti–vascular endothelial growth factor (anti-VEGF) treatment on diabetic retinopathy, the clinician should examine whether patients’ hemoglobin A1c levels, blood pressure, and severity of disease are similar between study groups, because these factors may alter the progression of retinopathy. Use of a control group is also important because it indicates whether the results of the intervention are above and beyond beneficial effects of participants’ enrollment in a trial, which usually includes selected, motivated patients.
Clinical trials may study a narrow subset of a disease, making the results applicable and generalizable only to similar patients. A common error is extrapolating such data to apply to all patients or varying degrees of disease severity. For example, if a treatment is successful only in patients with mild glaucomatous damage who underwent trabeculectomy but not in those with advanced glaucomatous damage, the study results should be applied only to similar patients—in this case, patients with mild glaucomatous damage.
Was the sample large enough to detect a difference?
The sample size must be large enough to give the study sufficient power to reject the null hypothesis, which states that there is no difference (in the outcome of interest) between the group that received the intervention and the group that did not receive the intervention. Rejecting the null hypothesis supports the "alternative hypothesis," which states that a true difference exists between the groups. Power depends on the sample size (number of participants), the expected difference in the outcome of interest between the intervention group and the control group (eg, improvement in visual acuity, resolution of macular edema), and the variability (eg, standard deviation) of the outcome of interest. In general, an intervention with a larger treatment effect and smaller variability requires a smaller sample size. These characteristics should be reported in the Methods section of the study.
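To illustrate how the expected treatment effect, variability, and sample size interact, the following sketch applies the standard normal-approximation formula for comparing 2 means; the specific values (a 5 mm Hg expected difference in IOP, a standard deviation of 8 mm Hg, 80% power) are hypothetical and chosen only for demonstration.

```python
import math
from scipy.stats import norm

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided comparison of 2 means.

    delta : expected difference between groups (treatment effect)
    sd    : standard deviation of the outcome in each group
    """
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = norm.ppf(power)            # quantile corresponding to the desired power
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)                 # round up to a whole participant

# Hypothetical example: detecting a 5 mm Hg difference in IOP (SD = 8 mm Hg)
print(sample_size_per_group(delta=5, sd=8))    # about 41 participants per group
# Halving the expected treatment effect roughly quadruples the required sample size
print(sample_size_per_group(delta=2.5, sd=8))  # about 161 participants per group
```

This is a simplified approximation; published trials typically describe the assumptions behind their power calculation in the Methods section.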
Are the treatments and outcomes clinically relevant?
The clinician should ascertain whether the study’s results can be applied to his or her patients. Questions to consider include the following:
- Is the intervention available and applicable to the current practice environment?
- Are the outcomes clinically important?
- Are all clinically important outcomes evaluated?
- Is the treatment difference clinically significant?
It is important to consider whether the intervention is useful in practice. It may be too expensive, too difficult to perform, or no longer in general use. If so, the study may pose little benefit to current clinical care.
Is the intervention reproducible?
The study should describe the intervention in enough detail to allow the experiment to be replicated. For example, a surgical study should explain all the steps of the procedure so that different surgeons are able to perform the procedure in the same manner in each case. Did all surgeons involved in the study perform it similarly, and were their results similar? Did the study include a training session before the start of the study, monitor specific aspects of the surgical procedure, and standardize postoperative care? In general, a study should avoid differences in study procedures except for the intervention of interest. In addition, to decrease the risk of investigator bias, the study should try to mask the observer to the intervention. Investigator bias may occur when the investigator expects a different result in the intervention group and adjusts his or her measurement of the outcome of interest to satisfy this expectation.
Is the outcome clearly defined and reliable?
The study should clearly state the primary and secondary outcomes of interest as well as the expected change for these outcomes. For example, if the primary outcome is improvement in visual acuity, the study should indicate the logMAR value that represents improvement, the range, the distribution of results (eg, normal, skewed to the right or left), and the variability. These statements allow the reader to determine whether the study was able to reject the null hypothesis.
Many outcomes (eg, visual acuity, intraocular pressure [IOP], macular thickness as measured with optical coherence tomography [OCT]) will have measurement error. This measurement error will increase the variability of the outcome of interest or create a difference in results when no true difference in outcomes exists. Therefore, a study should standardize measurement of the outcome of interest for all investigators. For example, the Ocular Hypertension Treatment Study created a standardized method to check IOP. The “recorder” placed the tonometry dial at 20 mm Hg while an “observer” measured the IOP and adjusted the dial to the intersection of the tonometry mires without viewing the dial. Finally, the recorder recorded the IOP measurement and changed the dial back to 20 mm Hg, then repeated the measurement sequence. The sequence was repeated a third time if the measurements differed by 2 mm Hg. By using a masked recorder and observer, and repeating the testing, the study created a standardized method intended to decrease measurement error and the variability of IOP measurement.
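The effect of repeating and averaging measurements can be shown with a brief simulation; the values below (a true IOP of 24 mm Hg and a measurement error of 2 mm Hg) are hypothetical and are not taken from the Ocular Hypertension Treatment Study.

```python
import numpy as np

rng = np.random.default_rng(0)
true_iop = 24.0        # hypothetical true IOP, in mm Hg
measurement_sd = 2.0   # hypothetical instrument and observer error, in mm Hg

# 10,000 simulated visits: a single reading versus the mean of 2 independent readings
single = rng.normal(true_iop, measurement_sd, size=10_000)
averaged = rng.normal(true_iop, measurement_sd, size=(10_000, 2)).mean(axis=1)

print(f"SD of single readings:         {single.std():.2f} mm Hg")
print(f"SD of the mean of 2 readings:  {averaged.std():.2f} mm Hg")  # roughly 1/sqrt(2) as large
```

Averaging repeated readings reduces random measurement error, which in turn reduces the variability of the outcome of interest and the sample size needed to detect a given difference.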
Were the follow-up time and reporting adequate?
The validity of a study is anchored on (1) adequate duration of follow-up and (2) follow-up of all participants. Thus, in evaluating a study, the clinician should look for how many of the participants completed follow-up and whether the study reported outcomes for all participants. For example, in a study assessing the use of atropine eyedrops versus patching for treatment of amblyopia, a follow-up of 3–6 months may be adequate; similar follow-up periods may be appropriate for tracking macular edema resolution after laser or drug therapy or monitoring visual acuity improvement after cataract extraction. Conversely, glaucoma progresses over long periods; therefore, trials assessing visual field loss in glaucoma would require longer follow-up, such as 5 years. Consequently, the typical rate of disease progression is an important guide in establishing the duration of follow-up required.
Finally, the study should report the results for all participants, which is called an “intention to treat” analysis. The study should state the reasons for loss to follow-up, and any differences in reasons between the study groups. For example, participants in the intervention group of a drug trial may be more likely to drop out than those in the placebo group if they experience ocular adverse effects from the drug, such as burning or stinging.
Is the analysis appropriate for the outcome?
Statistical tests depend on the type of data used to determine the difference between 2 treatment groups. For example, if the data are normally distributed (ie, parametric, conforming more or less to a bell-shaped curve) and are continuous (eg, age), then a t test comparing the intervention and control groups can be performed. For continuous data that are not normally distributed, researchers can use nonparametric tests such as the Mann-Whitney U test or the Wilcoxon signed rank test. For categorical data (present or absent; small, medium, or large), the study may use a chi-square test. All of these tests provide a P value, which indicates how likely it would be to observe a difference at least as large as the one found if there were truly no difference between the 2 groups (ie, if chance alone were operating). For example, a P value of <.05 indicates that a difference of that size would be expected by chance alone in fewer than 5% of such studies. The lower the P value, the less likely it is that the observed difference is due to chance alone and the more likely it is that it represents a true difference. Figure 1-1 shows a flow chart for various types of data and study designs.
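As a minimal sketch of how these tests are applied (the data below are fabricated purely for illustration, and the widely used scipy library is assumed to be available), the following code runs each test on a small, simulated 2-group comparison.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical continuous outcome (eg, change in logMAR visual acuity) in 2 groups
treated = rng.normal(loc=-0.20, scale=0.15, size=60)
control = rng.normal(loc=-0.10, scale=0.15, size=60)

# Normally distributed continuous data: two-sample t test
t_stat, p_t = stats.ttest_ind(treated, control)

# Continuous data without assuming a normal distribution: Mann-Whitney U test
u_stat, p_u = stats.mannwhitneyu(treated, control)

# Categorical data (eg, edema resolved vs not resolved): chi-square test
#                 resolved  not resolved
table = np.array([[40, 20],    # treated
                  [28, 32]])   # control
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

print(f"t test P = {p_t:.4f}; Mann-Whitney P = {p_u:.4f}; chi-square P = {p_chi:.4f}")
```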
Is the difference between the groups clinically significant?
Even though a statistical test may suggest a statistically significant difference in the results between 2 groups, the clinician should consider whether the magnitude and nature of the difference are clinically meaningful. For example, a statistically significant difference in visual acuity may only be 2 letters on a Snellen chart, but this difference may not be clinically noticeable to patients and may be within the margin of measurement error for visual acuity. In addition to evaluating primary outcome variables, the study should evaluate secondary clinically important variables related to the safety of the intervention. These variables include dropout rates, pain, and allergic reactions.
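A short simulation can make this distinction concrete; the numbers below (a 2-letter, or 0.04 logMAR, mean difference and very large hypothetical groups) are invented solely to show that a trivially small difference can still reach statistical significance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical: 2 very large groups whose mean visual acuities differ by only
# 0.04 logMAR (about 2 letters), well within typical test-retest variability
group_a = rng.normal(loc=0.30, scale=0.10, size=5_000)
group_b = rng.normal(loc=0.34, scale=0.10, size=5_000)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"Mean difference = {group_b.mean() - group_a.mean():.3f} logMAR, P = {p_value:.1e}")
# The P value is extremely small, yet a 2-letter difference is unlikely to be
# noticeable to patients and falls within the measurement error of acuity testing.
```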
Is there a conflict of interest?
A conflict of interest (COI) occurs when a person or organization has financial or other interests in an entity (eg, a device company) that could consciously or unconsciously motivate them to make decisions that benefit the entity. An example of a COI would be if a person who was a paid speaker for an entity wrote and published a paper describing that entity’s new device and its benefits. Because of the relationship between the author and the entity, there is a risk that the COI could positively influence the author’s decision-making in the entity’s favor. If the author of the paper biases the study results to describe large benefits for the device with minimal risk, the author may secondarily benefit by receiving more paid speaking engagements from the entity.
Although this COI and the author’s role in the paper may not represent an impropriety, the potential for such an impropriety is one reason that medical journals require authors and other decision makers to disclose any COIs. More importantly, medical journals also require the author and other individuals with COIs to present an accurate and balanced assessment of the benefits and all of the risks of the drug or new device. Even if their motivations do not include a direct financial benefit in the form of a COI, all authors are motivated to publish their findings in research journals, for reasons that may include academic promotion, future research grants, and national reputation. Overall, any research should include an accurate and balanced assessment of the results regardless of whether the authors have COIs, and readers should use their best judgment about whether the research includes a balanced presentation of the results.
Excerpted from BCSC 2020-2021 series: Section 1 - Update on General Medicine. For more information and to purchase the entire series, please visit https://www.aao.org/bcsc.