Thomas, Ravi MD; Bhat, S. DO; Muliyil, J. P. DrPh; Parikh, R. DO; George, R. DO
Glaucoma is a major, worldwide cause of bilateral blindness. Because early detection and treatment may slow the rate of visual field loss and consequent blindness, there have been concerted efforts to develop screening methods for the disease. 1 The classic approaches to population-based glaucoma screening, such as tonometry, optic disc evaluation, assessment of the nerve fiber layer, and visual field tests, all have limitations (i.e., poor specificity). 2–4 Developing nations may lack the infrastructure to deal with the number of actual cases, let alone the possibility of false positives. In this situation, it may be sensible to focus on identifying, at least, the moderate and severe glaucoma cases. What is needed is a rapid, inexpensive, and accurate test to detect these cases.
Frequency doubling perimetry (FDP) is a relatively recent innovation that appears to fulfill these requirements. The basis of FDP and its sensitivity in detecting early field loss have been well described, and several studies have reported high sensitivity and specificity for FDP in the detection of glaucoma. 5–7 The FDP literature suggests different criteria for screening, but there is no consensus on the most appropriate criteria. Quigley reported a sensitivity of 91% and a specificity of 94%, and found that quantification of a detected defect did not increase efficiency. 6 Patel et al. described a scoring system partly based on such quantification and reported a 96% sensitivity and a 90% specificity in identifying moderate and severe cases. 7
MATERIALS AND METHODS
Two groups were recruited.
Group I: Established Glaucomas (with Field Defects)
Eighty-five eyes of 85 glaucoma patients with established glaucomatous field defects, as documented on the SITA Standard 30–2 program of the Humphrey Field Analyzer (HFA, Humphrey Instruments, San Diego, CA), were included. Patients were diagnosed as having glaucoma based on intraocular pressure and characteristic disc findings with corresponding field defects that fulfilled at least two Anderson criteria. 8 Disc and field changes were mandatory for the diagnosis. The 85 fields were graded into mild (n = 15), moderate (n = 29), and severe (n = 41) cases, using published criteria. 9
Patients with primary open- or chronic closed-angle glaucoma with best-corrected Snellen chart visual acuity of 6/9 or better were eligible for inclusion. Patients with the following lens changes, graded according to the Lens Opacities Classification System (LOCS) III, were accepted: nuclear opalescence up to grade 3 (NO 1–3), nuclear color up to grade 3 (NC 1–3), and cortical up to grade 3 (C 1–3). Patients with posterior subcapsular cataract (P) in the pupillary area were excluded. Provided they fulfilled the inclusion criteria, patients with diabetes with mild nonproliferative retinopathy were also allowed. If both eyes of a patient fulfilled eligibility criteria, one eye was randomly selected for analysis. If only one eye was eligible, this eye was included.
The following were considered exclusion criteria: best-corrected vision less than 6/9, fellow eyes of chronic closed-angle glaucoma without field defects, proliferative diabetic retinopathy, those treated with laser photocoagulation, and cataracts considered responsible for best-corrected vision less than 6/9. Patients with cataract LOCS grades more severe than those mentioned above were excluded.
Group II: Control Subjects
Forty-eight eyes of 48 control subjects were included. All subjects had a best-corrected visual acuity of 6/9 or better, and had no ocular abnormalities on a complete ophthalmic examination. Subjects with refractive errors within plus or minus 6 diopters and presbyopes were included, as were pseudophakic patients and patients with diabetes without retinopathy, provided they fulfilled the other inclusion criteria. Eyes, graded according to LOCS III, with nuclear opalescence up to grade 3 (NO 1–3), nuclear color up to grade 3 (NC 1–3), and cortical up to grade 3 (C 1–3) were included. Eyes with posterior subcapsular cataract (P) in the pupillary area were excluded. Seven eyes of seven patients with cataracts responsible for vision between 6/36 and 6/12 were included among the controls for separate analysis. Patients with cataract LOCS grades more severe than those mentioned above were included in this category.
Patients who had abnormal fields, ocular abnormalities other than refractive errors and presbyopia, any neuroophthalmic disease causing field defects, or evidence of proliferative diabetic retinopathy were excluded.
All patients and control subjects underwent a complete ophthalmic examination, including refraction and best-corrected visual acuity, slit lamp examination, applanation tonometry, gonioscopy, indirect ophthalmoscopy and stereobiomicroscopic evaluation of the disc.
Baseline visual field testing was performed using the SITA Standard 30–2 program of the HFA for all participants and was repeated, if necessary, to obtain a reliable visual field. All participants were required to make three separate visits for the study purposes. FDP was performed as described subsequently:
Visit I: The following three tests were performed.
1. Screening program C20-1
2. Screening program C20-5
3. Full threshold N-30.
The patients initially underwent the C20-1 screening test, followed by the C20-5 screening test, and lastly full threshold testing. As part of a larger study, the patients underwent repeated testing a total of four times, the last after dilatation. Only the results of the first test are reported here.
The goal of screening is to detect disease with a minimum of false positives. An ideal screening test should have reasonably high sensitivity with very high specificity. This is referred to as the “best combination” criteria. The criteria developed followed a logical sequence from the most lenient to the most stringent and were based on the total number of abnormal points as well as the severity and contiguity of points. For example, using the C20-1 program:
Criterion 1 was the presence of one abnormal point anywhere in the field (including the central point), depressed to the <1% P level, the <0.5% P level, or the most severe level. This was the most lenient criterion.
Criterion 2 required the presence of two abnormal points anywhere in the field (including the central point), with at least one point depressed to <1% P level and the other to <1%, <0.5% P level, or the most severe level.
Criterion 3 required the presence of two adjacent, abnormal points anywhere in the field (including the central fixation point), with at least one point depressed to <0.5% P level.
Criterion 4 required the presence of any three abnormal points depressed to any level.
Similar criteria were used for the C20-5 test.
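The four criteria above can be sketched as simple checks on a single test result. This is only an illustrative reading of the text, not the authors' software: the 4×4 grid coordinates, the `CENTER` label, the severity coding (0 = normal, 1 = P <1%, 2 = P <0.5%, 3 = most severe), and the definition of "adjacent" are all assumptions made for the example.

```python
from itertools import combinations

CENTER = "center"  # hypothetical label for the central fixation point


def are_adjacent(a, b):
    """Assumed adjacency rule: grid points (row, col) on a 4x4 grid that touch
    by side or corner; the central point touches the four inner grid points."""
    if a == CENTER or b == CENTER:
        other = b if a == CENTER else a
        return other != CENTER and other[0] in (1, 2) and other[1] in (1, 2)
    return a != b and abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


def criteria_met(field):
    """field maps each tested point to a severity code:
    0 = normal, 1 = P <1%, 2 = P <0.5%, 3 = most severe (assumed coding)."""
    abnormal = {point: sev for point, sev in field.items() if sev > 0}
    # Criterion 3: two adjacent abnormal points, at least one at P <0.5% or worse.
    adjacent_pair = any(
        are_adjacent(a, b) and max(abnormal[a], abnormal[b]) >= 2
        for a, b in combinations(abnormal, 2)
    )
    return {
        1: len(abnormal) >= 1,  # one abnormal point at any level
        2: len(abnormal) >= 2,  # two abnormal points at any level
        3: adjacent_pair,
        4: len(abnormal) >= 3,  # three abnormal points at any level
    }


# Hypothetical result: two neighbouring points, one mild and one moderate.
example = {(0, 0): 1, (0, 1): 2, (3, 3): 0}
print(criteria_met(example))
```

Under these assumptions the hypothetical field satisfies criteria 1 to 3 but not criterion 4, because only two points are abnormal.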
The sensitivities and specificities for each criterion were calculated from a 2×2 table. Sensitivity and specificity were also calculated for the criterion recommended by Quigley (any two abnormal points of any severity anywhere in the field). To make our results comparable with those of Quigley, we recalculated them using the selection criteria described in that study. 6 On the basis of a forthcoming article from our department, pattern standard deviation (PSD) on SITA was considered equivalent to corrected pattern standard deviation (CPSD) for the full threshold test. 10
We also determined the sensitivity and specificity by applying the scoring system described by Patel et al. for the C20-1 screening program. 7 Abnormal points were graded based on location and severity, as follows:
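As a minimal sketch of the 2×2 calculation, the function below derives sensitivity and specificity from the four cell counts. The function name is hypothetical, and the counts in the usage example are illustrative values chosen only because they are consistent with 85 glaucomatous eyes and 41 control eyes without cataract; they are not taken from the study's tables.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the cells of a 2x2 table.

    tp: diseased eyes with a positive test    fn: diseased eyes with a negative test
    tn: normal eyes with a negative test      fp: normal eyes with a positive test
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Illustrative counts only (assumed, not reported in the paper).
sens, spec = sensitivity_specificity(tp=69, fn=16, tn=39, fp=2)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```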
Location: The outer 12 peripheral points more likely to be falsely positive were scored 1, the inner 4 points were scored 3, and the central fixation point was assigned a value of 5.
Severity: Each point was graded from 0 to 3 based on the depth of the defect. Normal areas were assigned a value of 0. Involvement at the 1% (mild defect) probability level scored 1, at the 0.5% level scored 2 (moderate defect) and the most severe involvement scored 3. These probability values are indicated on the printout at the defective points.
This score of 0–3 was multiplied by the weighting factor for the location, 1, 3, or 5, for each of the points involved. The final score was obtained by totaling the scores of all the abnormal points. The maximum score possible, if all the 17 points were involved at the most severe level, was 87.
A score of 2 could be achieved if two points in the outer periphery were involved at the 1% (mild) level: 1 × 2 × 1 = 2 or if one point in the outer periphery was involved at the 0.5% level: 1 × 1 × 2 = 2.
The Patel et al. scoring system is such that a score of less than 2 cannot be achieved if two or more points are involved in the periphery, or if even a single inner point is involved, regardless of its severity. 7
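The scoring arithmetic described above can be sketched in a few lines. The zone labels (`"outer"`, `"inner"`, `"center"`) and the function name are hypothetical; the weights and severity grades are those stated in the text.

```python
# Location weights as described in the text; the zone names are assumed labels.
LOCATION_WEIGHT = {"outer": 1, "inner": 3, "center": 5}


def fdp_score(abnormal_points):
    """Sum of location weight x severity grade over all tested points.

    abnormal_points: iterable of (zone, severity) pairs, where severity is
    0 (normal), 1 (P <1%), 2 (P <0.5%), or 3 (most severe).
    """
    return sum(LOCATION_WEIGHT[zone] * severity for zone, severity in abnormal_points)


# Worked examples from the text.
two_mild_outer = [("outer", 1), ("outer", 1)]
one_moderate_outer = [("outer", 2)]
all_severe = [("outer", 3)] * 12 + [("inner", 3)] * 4 + [("center", 3)]

print(fdp_score(two_mild_outer))      # 2
print(fdp_score(one_moderate_outer))  # 2
print(fdp_score(all_severe))          # 87
```

A single mild inner point already scores 3 (3 × 1), which is why any inner-point involvement exceeds the cut-off of 2.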
Positive predictive values and negative predictive values were calculated for the C20-1 test from the table used for calculation of sensitivity and specificity. 11 The calculation was performed for the “best combination” criteria for presumed disease prevalence of 5%, 30%, and 50%.
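For readers who wish to repeat this kind of calculation, predictive values at a presumed prevalence follow from sensitivity and specificity by Bayes' theorem. The sketch below uses that standard formula; the function name and the input values in the example are hypothetical and are not the study's results.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a presumed disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test with 90% sensitivity and 90% specificity.
for prevalence in (0.05, 0.30, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The loop illustrates the general pattern: as prevalence falls, the positive predictive value drops sharply while the negative predictive value rises.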
To determine the effect (if any) of cataract on the validity of the test, the seven eyes of the seven patients with cataract were introduced into the specificity calculation. The likelihood ratio for a positive test was calculated with and without the cataracts in the control group. 11 This was determined for our criteria (as well as the Patel et al. score) with the best sensitivity and specificity.
RESULTS
The demography of the groups is shown in Table 1. The mean age was 53.1 years (range, 34–74 years) for the glaucoma group and 45.6 years (range, 25–68 years) for the control group. This difference was statistically significant. There was no statistically significant difference in sex distribution between the groups.
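The likelihood ratio for a positive test is the sensitivity divided by the false-positive rate (1 − specificity). The sketch below shows, with hypothetical counts, how adding a few false-positive control eyes lowers the ratio; the function name and all numbers are assumptions for illustration and do not reproduce the study's data.

```python
import math


def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity); infinite if there are no false positives."""
    if specificity >= 1.0:
        return math.inf
    return sensitivity / (1.0 - specificity)


# Hypothetical control groups: 40 normal eyes with 2 false positives, then the
# same group plus 10 extra eyes of which 6 test positive (assumed numbers).
sensitivity = 0.90
spec_without = 38 / 40
spec_with = (38 + 4) / (40 + 10)

print(round(positive_likelihood_ratio(sensitivity, spec_without), 1))
print(round(positive_likelihood_ratio(sensitivity, spec_with), 1))
```

Because the sensitivity is unchanged, the fall in the ratio comes entirely from the lower specificity in the enlarged control group.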
C20-5 Screening Test
For the full case mix, the “best combination” of sensitivity and specificity for the C20-5 test, obtained with the formulated criterion of one abnormal point anywhere in the field depressed to the <1% level, was 88.2% and 87%, respectively, for test one. The corresponding sensitivity for moderate and severe glaucoma using the same criterion was 90.6% at the same specificity. Because the C20-1 performed better, the remaining report is restricted to this program.
C20-1 Screening Test
The sensitivities and specificities for the full mix of glaucomatous eyes, using the formulated criteria, for test one are tabulated (Table 2). The criterion of any two abnormal points anywhere in the field had the “best combination” of sensitivity and specificity: 81.2% (95% CI, 70.9–88.5) and 95.1% (95% CI, 84.6–99.3). The sensitivity and specificity for moderate and severe glaucoma, using the same criterion, were 85.7% and 95.1%, respectively.
Because this criterion was essentially the same as that used by Quigley (two abnormal points on the FDP), the sensitivity and specificity for that criterion were the same. When cases were selected for this study using the selection criteria reported by Quigley, and the PSD was used to redistribute cases as described, the sensitivity improved marginally to 82.4%.
The results of C20-1, using the scoring system reported in Patel et al., are shown in Table 3.
The best sensitivity was for a score of 2 or more: 85.9% (CI, 76.2–92.1) with a specificity of 95.1% (CI, 84.6–99.3). The results for moderate and severe cases are shown in Table 4. The same criteria provided a sensitivity of 91.8% (95% CI, 83.2–96.3). Our criteria did not fare as well as the scoring system.
Effect of Cataract on the Specificity of the Test
The specificity with and without cataracts in the control eyes was determined for all the criteria and the different test strategies. Using the Patel et al. criteria, the likelihood ratio for a positive test (without cataract) was 22.1. The likelihood ratio for a positive test (with cataract) was 4.9. The corresponding values for the moderate and severe group were estimated to be 22.9 and 5.4, respectively. The positive and negative predictive values are shown in Table 4.
DISCUSSION
Frequency doubling perimetry has shown promising results as a potential screening test for glaucoma. 5–7 Quigley et al. used the C20-1 screening program on an early version of the machine and reported a sensitivity and specificity of 91% (CI 77.8–98.9) and 94% (CI 75.2–97.7), respectively. Using the same cut-off (two abnormal points anywhere in the field), we obtained a sensitivity and specificity of 81.2% and 95.1%, respectively. The sensitivity improved marginally (82.4%, CI 72.3–89.5) after reanalysis based on the Glaucoma Hemifield Test (GHT), as described by Quigley. 6 While we failed to achieve the sensitivity obtained by Quigley, that series was small and its confidence limits overlap our point estimate. The difference in results may be explained on the basis of sample size.
This study tested various criteria, but could not improve on the results obtained with the Patel et al. scoring system. 7 Using this scoring system, the results in a mix of early, moderate, and severe cases were very similar to those reported by Patel et al. When we analyzed moderate and severe cases, using the same classification as Patel et al., our sensitivity was 91.8% with a specificity of 95.1%. The sensitivity obtained by Patel et al. for moderate and severe cases was 96% (CI 91–100), and the specificity was 95.1% (CI 90–100). As in that report, the mean age of our normal group was significantly lower (by about 7 years) than that of the glaucoma group. However, as the sensitivity loss with FDP is about 0.6 dB per decade 12, this age difference is unlikely to have influenced the results.
The small difference in sensitivity can be explained on the basis of confidence intervals. The higher sensitivity obtained by Patel et al. might also be explained if, for some reason, the moderate and severe cases in that study were actually more severe than those in this study. Because the Patel et al. study was conducted in Baltimore, USA, and ours in India, a developing country, we would have expected the reverse. On the other hand, 26% of the Patel et al. study population was black, a group known to present late and with more severe disease. 7 Had the mean deviation of the three groups in Patel et al. been available, we could have commented further on this.
The Patel et al. scoring system was based on location as well as severity. When we eliminated the scoring for severity, we still obtained the same results. This reinforces the statement made by Quigley that further quantification of the defect, beyond the 1% level, is not efficient. Eliminating quantification from the test would increase the test speed and add to the value of this test as a screening device.
To calculate predictive values, we used the “best combination” criteria. If the prevalence of glaucoma in a clinic situation is considered to be 50%, the positive predictive value (PPV) would be 94.5% and the negative predictive value (NPV) 86.4%. If the prevalence is closer to that found in population-based screening (5% if we target older patients), the PPV becomes 47.2% and the NPV 99.2%. The corresponding values for moderate and severe grades are 48.9% and 99.4%, respectively.
Glaucoma usually occurs in older age groups, in which members of the population usually have some associated lenticular changes. Because the available literature indicated that media opacities do not interfere with the test, this study included a small number of eyes with cataract (accounting for visual loss up to 6/36) in the control group. On analysis, we found that the inclusion of these eyes decreased the specificity quite markedly. When these patients were removed from the analysis, the specificity improved. Although only seven eyes with cataract were included, our results suggest that media opacities such as cataract may affect interpretation of FDP. The specificity of FDP in a screening scenario is therefore likely to be lower than in our highly selected control subjects.
In summary, FDP has a high sensitivity and specificity for detecting field defects in our population. FDP can certainly aid us in selecting those patients who actually require testing using automated perimetry. We have validated the scoring system described by Patel et al. but suggest that quantification of a defect may be avoided.
© 2002 Lippincott Williams & Wilkins, Inc.