Advances in Glaucoma: Review Article

Diagnostic Accuracy of Artificial Intelligence in Glaucoma Screening and Clinical Practice

Chaurasia, Abadh K. MOptom*; Greatbatch, Connor J. B Med Sci (Hons) MBBS*; Hewitt, Alex W. MBBS, PhD, FRANZCO*,†

doi: 10.1097/IJG.0000000000002015

INTRODUCTION

Glaucoma is a leading cause of irreversible blindness worldwide; it is projected to affect over 111 million people by 2040.1 Current global estimates show that ~65 million people suffer from primary open-angle glaucoma (POAG), accounting for approximately three-quarters of all glaucoma cases.2,3 Aging is one of the established risk factors for disease development, and the number of people over 65 years of age is projected to approach 1.5 billion by 2050.4–6 Thus, there will be a major disease burden over the coming decades. The socioeconomic costs of glaucoma increase ~4-fold from early-stage to end-stage disease; hence, timely diagnosis and intervention save both sight and healthcare resources.7 In 2004, the direct medical cost of glaucoma was US$2.9 billion annually in the United States and is anticipated to increase to US$12 billion by 2032.8,9 As such, there is a pressing need for the development of robust diagnostic and risk stratification modalities for POAG.

The leading diagnostic imaging technologies of spectral-domain optical coherence tomography (SD-OCT) and high-resolution fundus photography have revolutionized glaucoma diagnosis and management. Although fundus photography is an inexpensive means of assessing glaucomatous optic nerve head (ONH) changes over time,10 OCT has been shown to detect retinal nerve fiber layer (RNFL) degeneration earlier than other structural or functional modalities, and with high sensitivity and specificity.11,12 Nevertheless, the assessment of these images presents a bottleneck in clinical profiling at the population level. As such, many recent studies have focused on developing artificial intelligence (AI) algorithms for glaucoma detection.

Population-based glaucoma screening is not cost-effective as definite diagnosis requires experienced ophthalmologists, multiple clinical investigations, and long-term follow-up.13–15 AI models could potentially distinguish between patients with and without glaucoma through imaging modalities16; however, these tools are yet to be deployed into routine ophthalmic practice. We conducted a systematic review and meta-analysis to investigate and determine AI performance on ophthalmic images for glaucoma diagnosis, and to identify factors affecting potential implementation into clinical practice. Our primary objective was to calculate overall AI algorithms’ performance for glaucoma detection through OCT and fundus imaging modalities. Furthermore, we explored whether AI performance is sufficient to apply the technology in clinical practice. Our evidence-based findings allow for the development of more effective AI models for glaucoma diagnosis.

METHODS

Protocol and Registration

The study was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis of Diagnostic Test Accuracy (PRISMA-DTA) extension.17 Figure 1 outlines the study selection, and the PRISMA-DTA checklist is contained in Supplementary Table 1 (Supplemental Digital Content 1, https://links.lww.com/IJG/A602). The protocol of this review was registered in the International Prospective Register of Systematic Reviews (PROSPERO ID: CRD42021273139). Ethics approval was not required for this study as all included data were available in the public domain. Research questions were framed around participants (fundus and OCT images), index test (AI models), comparison, reference standard (clinical tests; visual fields or combined clinical tests), and target condition (glaucoma).17

FIGURE 1: PRISMA 2020 flow diagram for study selection.

Eligibility Criteria

Types of Study Design

Case-control or cross-sectional studies were eligible for inclusion if they compared the diagnostic performance of AI between healthy and glaucomatous eye scans. Manuscripts were excluded if the study investigated angle-closure glaucoma. We also excluded reviews, short communications, and studies for which full text was unavailable.

Types of Participants (Image Data)

Studies were included if clinicians graded participants’ fundus photographs or OCT images as healthy or glaucomatous.

Index Tests

AI algorithms used for glaucoma diagnosis, detection, screening, or progression were included, provided performance metrics were reported as the area under the receiver operating characteristic curve (AUC) or as accuracy with sensitivity and specificity. Studies were excluded if clinical parameters such as patient demographics, visual field results, intraocular pressure, or other imaging modalities (eg, intravenous fluorescein angiography) were used to train and evaluate AI models.

Reference Standards

The reference standard for the evidence of glaucomatous fundus photographs or OCT images was based on visual field or combined clinical test results.

Information Sources and Search Strategies

We systematically searched multiple databases (Embase, Medline, Web of Science, and Scopus) via the University of Tasmania library until August 7, 2020, without language restrictions. The most appropriate keywords were shortlisted for the search as follows: (“Glaucoma” OR “retinal nerve fiber layer” OR “optic disc” OR “cup to disc ratio” OR “visual field” OR “Optical coherence tomography”) AND (“artificial intelligence” OR “machine learning” OR “deep learning” OR “transfer learning” OR “Convolutional Neural Network” OR “computer-aided diagnosis”). Full details of search criteria and literature retrieved from each database are described in Supplementary Table 2 (Supplemental Digital Content 1, https://links.lww.com/IJG/A602).

Study Selection

The relevant manuscripts were screened and selected based on title and abstract by one reviewer (A.K.C.) and verified independently by another reviewer (C.J.G.). The full texts of all potentially included studies were retrieved for this review. We also searched the bibliography of primary articles and reviews to identify potentially eligible studies missed by the database search.

Data Collection, Extraction, and Risk of Bias Assessment

The relevant data were extracted from the selected articles based on a standardized protocol. The following outcomes were measured from each study: (1) author(s), published year, and journal; (2) AI approach [traditional Machine Learning (ML) and/or Deep Learning (DL)]; (3) highest performing AI architectures; (4) classifier choice, feature extraction techniques, and use of transfer learning; (5) input image size; (6) data set size (healthy and glaucomatous) and split ratio (train, validation, test); (7) types of data sets (public and private); (8) data set origin; (9) imaging modality (fundus photography and OCT images); (10) monoscopic or stereoscopic images; (11) instrument used for imaging; (12) internal or external data validation; (13) model performance reported as AUC, accuracy, sensitivity, and specificity; (14) patient demographics; (15) applied techniques (segmentation based and non–segmentation-based); (16) region of interest selection (eg, optic disc, optic disc and cup) as described in Supplementary terminology (Supplemental Digital Content 1, https://links.lww.com/IJG/A602); and (17) glaucoma grading protocol used.

Each study was evaluated for potential sources of bias and applicability by appraising the following 4 key domains outlined in the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2): patient selection, index test, reference standard, and flow and timing (Supplementary tool, Supplemental Digital Content 1, https://links.lww.com/IJG/A602).18–20 The QUADAS-2 signaling questions were tailored to assess the risk of bias and concerns regarding applicability through the key domains. The bias risk was defined as either low, high or unclear based on the signaling questions.

Diagnostic Accuracy Measures

Data regarding the sensitivity and specificity were extracted, and contingency tables were constructed as appropriate for each study.18 We visualized individual study findings by plotting the estimates of sensitivity and specificity in both the area under the summary receiver operating characteristic (SROC) curve and forest plots.
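
For reference, these measures derive from the standard 2×2 contingency table of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN):

$$\text{sensitivity}=\frac{TP}{TP+FN},\qquad \text{specificity}=\frac{TN}{TN+FP},\qquad \text{DOR}=\frac{TP\times TN}{FP\times FN}.$$

Where a study reported only sensitivity and specificity, the cell counts can be recovered (up to rounding) as TP = sensitivity × n_glaucoma, FN = n_glaucoma − TP, TN = specificity × n_healthy, and FP = n_healthy − TN.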

Synthesis of Results

Sixty-six studies were deemed eligible for quantitative synthesis; we calculated pooled AUC, diagnostic odds ratio (DOR), sensitivity, and specificity with 95% confidence intervals (CI) for the overall diagnostic performance of AI in glaucoma detection through 2 imaging modalities: fundus and OCT images. Further, subgroup analyses were performed to determine the factors affecting the performance of AI algorithms. We compared the diagnostic performance on fundus and OCT images, and evaluated the diagnostic performance on monoscopic, stereoscopic, and mixed fundus images. Furthermore, to ascertain the best approach for glaucoma assessment, we compared the applied techniques used in glaucoma diagnosis, segmentation-based and non–segmentation-based, each combined with a deep learning or machine learning approach. AI models made the final diagnosis based on the full ONH image, the optic disc (OD), the OD and optic cup (OC), or RNFL thickness as the region-of-interest (ROI) selection; we determined the best ROI for glaucoma diagnosis. We also investigated whether AI models are ready for clinical deployment by comparing validation accuracy on internal and external data. Lastly, we compared the diagnostic performance of AI on private, public, and mixed data sets to determine how the quality (grading standard) and type (homogeneity) of the data set affected performance.

Meta-Analysis

Statistical analyses were performed in R (version 4.0.4) using the mada and meta packages.21–23 A joint analysis of sensitivity and specificity was conducted to acquire summary estimates for the meta-analysis of diagnostic test accuracy using a hierarchical model with 95% CIs.24,25 A bivariate random-effects model was used to calculate the area under the SROC curve and the AUC metric.26 In addition, the pooled sensitivity, specificity, and DOR were estimated using a random-effects model (DerSimonian-Laird).22 We compared diagnostic performance using AUC as a single diagnostic accuracy measure.27,28 Study heterogeneity was measured using the Higgins inconsistency index (I2), tau-squared, and the Cochran Q value (significant heterogeneity if P<0.05)29 of DOR (Supplementary Tables 3 and 4, Supplemental Digital Content 1, https://links.lww.com/IJG/A602). The presence of publication bias was examined using Deeks’ funnel plot test.30 Heterogeneity between subgroups was also visualized with forest plots of sensitivity, specificity, and DOR in Supplementary forest plots (Supplemental Digital Content 1, https://links.lww.com/IJG/A602).
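
As a minimal sketch of how such a pipeline can be assembled with the mada and meta packages (illustrative only: the 3-row data frame reuses contingency tables from Table 2, whereas the full analysis drew on all 66 reconstructed tables, and the options shown are assumptions rather than our exact settings):

```r
library(mada)   # bivariate meta-analysis of diagnostic accuracy
library(meta)   # DerSimonian-Laird random-effects pooling

# Toy input: one row per study with 2x2 counts (values from Table 2)
dta <- data.frame(TP = c(219, 538, 49),
                  FP = c(10, 9, 0),
                  FN = c(25, 21, 1),
                  TN = c(253, 134, 50))

fit <- reitsma(dta)      # bivariate random-effects (Reitsma) model
summary(fit)             # summary sensitivity and false-positive rate
plot(fit, sroclwd = 2)   # SROC curve with summary estimate
AUC(fit)                 # area under the SROC curve

# Univariate pooling of the DOR with DerSimonian-Laird weights;
# summary() also reports tau-squared and the Higgins I2 index
dor <- madauni(dta, type = "DOR", method = "DSL")
summary(dor)
forest(dor)              # forest plot of study-level DORs

# Pooled sensitivity on the logit scale (analogous call for specificity)
metaprop(dta$TP, dta$TP + dta$FN, sm = "PLOGIT")

# Deeks' funnel-plot asymmetry test: regress log(DOR) on
# 1/sqrt(effective sample size), weighted by the effective sample size
ess  <- with(dta, 4 * (TP + FN) * (FP + TN) / (TP + FN + FP + TN))
ldor <- with(dta, log(((TP + 0.5) * (TN + 0.5)) / ((FP + 0.5) * (FN + 0.5))))
summary(lm(ldor ~ I(1 / sqrt(ess)), weights = ess))
```

Subgroup estimates such as those in Table 4 follow by refitting the same models within each stratum, for example over split(dta, modality) for fundus versus OCT studies.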

RESULTS

Study Selection

A total of 2699 articles were initially identified across the databases searched. We removed 1091 duplicate studies, leaving 1608 studies that were screened based on title and abstract. Following this, the full texts of 219 articles were retrieved and evaluated as per the inclusion and exclusion criteria, with 79 articles being included for qualitative synthesis; 13 studies that had insufficient information to reconstruct the contingency table were excluded, leaving a total of 66 studies to be included in the meta-analysis. Details regarding study selection are outlined in Figure 1.

Study Characteristics

All 79 selected studies used AI algorithms to diagnose glaucoma based on fundus or OCT imaging. Across all studies, 886,019 fundus images and 37,901 OCT images were analyzed in total. DL was utilized in 68% of studies, with the remainder using traditional ML techniques. The studies’ characteristics and data set descriptions are outlined in Tables 1–3. Most of the studies used private data (67.1%) to train the AI models and evaluate diagnostic performance. Many authors used the same publicly available data sets, including RIM-ONE (56.67%) and DRISHTI-GS (36.67%). AI performance, measured by AUC, ranged from 79.2% to 99.9% across the studies. However, only 13% utilized external validation to evaluate model performance as per the study’s inclusion and exclusion criteria (Tables 2, 3).

TABLE 1 - Description of the Datasets (images) Used for Glaucoma Diagnosis
References Data Sets Country of Data Sets Instrument Used for Imaging Total Training Total Validation Total Testing Input Image Size
Abidi et al31 Private Canada HRT scanner 943 314
Acharya et al32 Private India ZEISS-FF450 240×180×3
Acharya et al33 Private India ZEISS-FF450 plus 459 51 240×180×3
Al-Aswad et al34 Private The Netherlands Topcon TRC-SS2 (Topcon corporation, Tokyo, Japan) the Topcon TRC-NW8 (Topcon, Japan), Canon CR2 (Canon USA), and Centervue Eidon (CenterVue, Italy) 125934 3761 224×224×3
Al-Bander et al35 RIM-ONE v2 Spain 318 137 227×227×3
Asaoka et al36 Private Japan Topcon OCT-1000 or OCT-2000 machine (Topcon Corporation, Tokyo, Japan) 178 196 8×8×3
Bajwa et al37 ORIGA, HRF, OCT, CFI Singapore, The Netherlands, Iran 525 48 207 227×227×3
Balasubramanian et al38 HRF, RIM-ONE V3 The Netherlands, Spain Canon CR-1 fundus camera, Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II 100×5×3
Bock et al39 Private (EGR) Germany Kowa NonMyd alpha digital fundus camera 128×128×3
Chakrabarty et al40 Private India Topcon TRC50EX fundus camera and Zeiss Visucam NM/FA fundus 1926 314 -
Chan et al41 Private Singapore Optovue 300×300×3
Chang et al42 Private Korea CR-2; Canon Inc., Tokyo, Japan 4867 544 605 224×224×3
Cheriguene et al43 RIM-ONE V3 Spain Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II
Christopher et al44 Private (DIGS/ADAGES) USA Nidek Stereo Camera Model 3-DX (Nidek Inc., Palo Alto, CA), Nonmyd WX fundus camera 14063 759 224×224×3
Christopher et al45 Private (ADAGES, UCSD) USA Nidek Stereo Camera Model 3-DX 12599 741 1482 224×224×3
Civit-Masot et al46 RIM-ONE v3, DRISHTI Spain, India Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II 175 75 224×224×3
Diaz-Pinto et al47 HRF, Drishti-GS1, RIM-ONE v2, sjchoi86-HRF, ACRIMA The Netherlands, India, Spain Canon CR-1, Topcon TRC retinal camera and IMAGEnet R capture System 299×299×3
Diaz-Pinto et al48 ORIGA-light, Drishti-GS1, RIM-ONE, sjchoi86-HRF, HRF, DRIVE, MESSIDOR, DR KAGGLE, STARE, e-ophtha, ONHSD, CHASEDB1, DRIONS-DB, SASTRA, ACRIMA Multiple 1650 707 128×128×3
Elakkiya et al49 ORIGIA, RIM-ONE, DRISHTI-GS1 Singapore, Spain, India Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II 2661 761 380 224×224×3
Elseid et al50 RIM-ONE V1 Spain 256×256×3
Ferreira et al51 RIM-ONE v3, DRIONS-DB, DRISHTI- GS Spain, India Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II 304 76 580×420×3
Fu et al52 Private China Zeiss Visucam 500 and Canon CR-2 machines 800 400 400×400×3
Fu et al53 ORIGA Singapore 224×224×3
Gaddipati et al54 Private India, The Netherlands Spectral-domain RTVue—XR 100 OCT, Spectralis OCT 165 38 50 200×304×102
Gómez-Valverde et al55 RIM-ONE, DRISHTI-GS, ESPERANZA Spain, India Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II, Nidek AFC-210, Canon EOS 5D 1560 174 579 224×224×3
Haleem et al56 Private USA Optos P200MA
Hemelings et al57 Private Belgium Zeiss VISUCAM (Carl Zeiss Meditec, Jena, Germany) 4935 679 1424 224×224×3
Issac et al58 Private India Welch Allyn PanOptic Ophthalmoscope, (Model no 11820)
Jiang et al59 ORIGA Singapore
Karkuzhali and Manimegalai60 DRISHTI India 50 51
Kausu et al61 Private India ZEISS-FF450 plus with VISUPAC from ZEISS 300×300×3
Kim et al62 Private Korea 1503 400 220 299×299×3
Kim et al63 Private Korea Cirrus SD-OCT 7288 1700 176×176
Kim et al64 RIGA USA 224×224×3
Kirar et al65 MIAG (RIM-ONE V1/V3) Spain Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II 256×256×3
Kishore and Ananthamoorthy66 HRF, DRIVE The Netherlands, Germany Canon CR5 nonmydriatic 3CCD camera, Canon CR-1 fundus camera 70 30
Ko et al67 Private (TVGH) Taiwan Canon CR-DGi NM fundus camera, Canon CX-1 hybrid mydriatic/nonmydriatic digital retinal camera, and Canon CR-2 PLUS AF nonmydriatic retinal camera 763 181 256×256×3
Lee et al68 Private Korea (Cirrus, version 6.0; Carl Zeiss Meditec) 331×331×3
Lee et al69 Private Korea (Cirrus, version 6.0; Carl Zeiss Meditec) 460 197 331×331×3
Lee et al70 Private Korea Vx-10; Kowa Optimed Inc., Tokyo, Japan 140 60 331×331×3
Li et al71 Private (LAG) China Topcon, Canon and Carl Zeiss 10928 832 224×224×3
Li et al72 Private (LabelMe) China 31745 8000 299×299×3
Li et al73 Private China Zeiss Visucam 500, Canon CR2 20,793 2311 3481 256×256×3
Liu et al74 Private, RIM-ONE Australia, Spain 3200 800 512×512×3
Liu et al75 REFUGE China Zeiss Visucam 500 and Canon CR-2 300 100 256×256×3
Liu et al76 Private (CGSA) China Topcon, Canon, Carl Zeiss 241 032 285 69 224×224×3
MacCormick et al77 ORIGA Singapore 520 130
Maheshwari et al78 Private India 360×480×3
Martins et al79 ORIGA, Drishti-GS, iChallenge, RIM-ONE r3, RIGA Singapore, India, Spain, Canada 224×224×3
Medeiros et al80 Private (Duke Glaucoma Repository) USA Nidek 3DX, Nidek, Japan 26,528 6,292 256×256×3
Medeiros et al81 Private (Duke Glaucoma Repository) USA Nidek 3DX, Visupac FF-450 52657 33466 256×256×3
Mukherjee et al82 Private India
Mvoulana et al83 DRISHTI-GS1 India
Norouzifard et al84 Private (UCLA) USA 313 119 45 299×299×3
Oh et al85 Private Korea KOWA VX-10; Kowa Company Ltd., Tokyo, Japan
Patil et al86 DRION-DB, DRISHTI-GS Spain, India Canon CR-2 device 210 100 256×256×3
Phene et al87 Private USA, UK, India 86 618 1508 587×587×3
Abbas88 DRIONS-D, sjchoi86-HRF, HRF-dataset, PRV-Glaucoma Spain, Saudi Arabia Multiple cameras 480 720 300×300×3
Raghavendra et al89 Private India Zeiss FF 450 998 428 64×64×3
Raja et al90 RIM-ONE V3 Spain Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II 512×512×3
Rajan and Ramesh91 Private 64×128×3
Ran et al92 Private China, Hong Kong Cirrus HD-OCT 2927 975 975 320×256×3
Rao et al93 Private India
Renukalatha et al94 RIM-ONE v3 Spain Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II 100×100×3
Rogers et al95 Private (EODAT) The Netherlands Topcon TRC-SS2 (Topcon corporation, Tokyo, Japan) the Topcon TRC-NW8 (Topcon, Japan), Canon CR2 (Canon USA), and Centervue Eidon (CenterVue, Italy). 125934 3761 224×224×3
Salam et al96 Private Pakistan
Sathiya et al97 Private India
Serener et al98 Public image dataset (Unclear) Korea Fundus camera (AFC-330, Nidek, Japan). 754 324 464 256×256×3
Sharma et al99 DRISHTI, REFUGEE India, China Fundus camera (AFC-330, Nidek, Japan), Zeiss Visucam 500, Canon CR-2 device 251 40 256×256×3
Singh et al100 Private India The digital photography eye camera 44 19
Soorya et al101 Private India
Thompson et al102 Private (Duke Glaucoma Repository) USA Spectralis SD-OCT(version 5.4.7.0.; Heidelberg Engineering) 10404 4162 6340 496×496×3
Ting et al103 Private Singapore FundusVue, Carl Zeiss, Topcon, Canon 125189 71896 512×512×3
Touahri et al104 RIM-ONE v2 Spain 210 90
Wang et al105 Private USA Cirrus 4000, Zeiss 177 44
Yang et al106 Private Korea 900 240 2675 256×256×3
Zapata et al107 Private (Optretina’s tagged dataset) European Topcon NW 400, 300, 6S, 3D maestro, NW8, 3D 2000, NW 200, and Zeiss Visuscout100 and Nidek and others 3776 944 224×224×3
Zheng et al108 Private China, Hong Kong Topcon 3D OCT-2000 1501 102 224×224×3
Zilly et al109 RIM-ONE v3 Spain Fundus camera Nidek AFC-210 with a body of a Canon EOS 5D Mark II, Nidek AFC-210 128 31
Fundus indicates retinal fundus image; OCT, optical coherence tomography; Private, authors used their own data, or data not publicly accessible, to train and evaluate the model; Total training, number of images used to train the model; Total validation or total testing, number of images used to evaluate the model.

TABLE 2 - Characteristics of Studies: Traditional Machine Learning in Glaucoma Diagnosis
References Best Classifier Feature Extraction Techniques Total Data Set TP TN FP FN Imaging Modality Validation
Abidi et al31 Bayesian network Zernike moments 1257 NA NA NA NA OCT Internal
Acharya et al32 K-NN Texton and local configuration pattern (LCP) 702 219 253 10 25 Fundus Internal
Acharya et al33 SVM Gabor features and PCA 510 538 134 9 21 Fundus Internal
Balasubramanian et al38 K-NN Homogeneity and correlation/PCA 214 49 50 0 1 Fundus Internal
Bock et al39 SVM PCA 575 174 286 50 65 Fundus Internal
Chan et al41 Adaboost Local phase quantization (LPQ) technique with PCA 411 99 288 18 6 OCT Internal
Cheriguene et al43 TW-SVM Co-occurrence matrix, Hu moments and central moments 169 51 111 7 0 Fundus Internal
Elseid et al50 Ensemble RUSBoosted tree Gray level co-occurrence matrix (GLCM) 158 33 109 9 7 Fundus Internal
Haleem et al56 TW-SVM Regional image features model (RIFM) 189 17 43 4 2 OCT Internal
Issac et al58 SVM Adaptive thresholding 67 32 32 3 0 Fundus Internal
Kirar et al65 LS-SVM Discrete wavelet transforms (DWTs) and empirical wavelet transforms (EWTs) 505 216 206 49 34 Fundus Internal
Kishore and Ananthamoorthy66 Classifier Fusion Intraclass and extra-class discriminative correlation analysis (IEDCA) 100 11 15 0 4 Fundus Internal
MacCormick et al77 Spatial algorithm Hierarchical probabilistic 650 35 63 22 4 Fundus External
Maheshwari et al78 SVM Bit-plane slicing (BPS) and local binary pattern (LBP) 1426 827 587 2 10 Fundus Internal
Mukherjee et al82 SVM Template matching technique 101 62 13 7 19 Fundus Internal
Mvoulana et al83 2-D feature-based measurement Template matching technique/texture-based method 101 32 17 1 0 Fundus Internal
Oh et al85 Hough transformation/knowledge-based rules 198 84 75 25 14 Fundus Internal
Raja et al90 SVM Hyper analytic wavelet transformation (HWT) 158 65 78 6 9 Fundus Internal
Rajan and Ramesh91 SVM Discrete wavelet transform (DWT) 200 30 29 3 3 OCT Internal
Rao et al93 Adaboost Genetic algorithm (GA) 510 131 334 26 19 Fundus Internal
Renukalatha et al94 Sim-MSVM (simplified multiclass SVM) Adaptive histogram threshold 100 46 48 2 4 Fundus Internal
Salam et al96 SVM PCA 150 24 73 1 2 Fundus Internal
Sathiya et al97 SVM Tetrolet transform (TT) 200 10 10 0 0 OCT Internal
Singh et al100 SVM PCA 63 32 28 2 1 Fundus Internal
Soorya et al101 Adaptive thresholding, geometrical features 364 30 217 8 2 Fundus Internal
External indicates external data used for validation accuracy; FN, false negative; FP, false positive; Fundus, retinal fundus image; Internal, internal data used for validation accuracy; K-NN, K-nearest neighbor; LS-SVM, least-squares support-vector machines; NA, not available; OCT, optical coherence tomography; PCA, principal component analysis; SVM, support vector machines; TN, true negative; Total data set, total number of images used for training and testing the model; TP, true positive; TW-SVM, twin support vector machine.

TABLE 3 - Characteristics of Studies: Deep Learning in Glaucoma Diagnosis
References Architecture Transfer Learning Total Data Set TP TN FP FN Imaging Modality Validation
Al-Aswad et al34 CNN ResNet-50 129695 42 44 6 8 Fundus External
Al-Bander et al35 CNN Alexnet (23 layers) 455 NA NA NA NA Fundus Internal
Asaoka et al36 CNN DL-Transform model (authors used their own pretrained model) 374 94 77 5 20 OCT External
Bajwa et al37 RCNN (Regions with Convolutional Neural Network) VGG16 780 391 48 21 91 Fundus Internal
Chakrabarty et al40 Cascade-forward Neural Network 2240 121 104 41 48 Fundus Internal
Chang et al42 CNN ResNet50 6016 100 270 30 0 Fundus Internal
Christopher et al44 CNN ResNet50 14822 NA NA NA NA Fundus External
Christopher et al45 CNN ResNet50 14,822 52 47 2 9 Fundus Internal
Civit-Masot et al46 CNN MobileNet v2 250 97 101 22 12 Fundus Internal
Diaz-Pinto et al47 CNN Xception 1707 154 209 52 40 Fundus Internal
Diaz-Pinto et al48 GAN-SS-DCGAN (Deep Convolutional Generative Adversarial Network) 86926 238 335 85 49 Fundus Internal
Elakkiya et al49 CNN and RNN Inception-v3 3802 NA NA NA NA Fundus Internal
Ferreira et al51 Encoder-Decoder -U-net 380 189 81 0 0 Fundus Internal
Fu et al52 Encoder-Decoder -U-net 1200 39 1366 264 7 Fundus Internal
Fu et al53 CNN-Disc-aware Ensemble Network (DENet)_U-Net ResNet-50 650 34 307 53 6 Fundus External
Gaddipati et al54 CNN-Capsule Network (3-D) 253 16 31 1 2 OCT Internal
Gómez-Valverde et al55 CNN VGG-19 2313 108 329 41 16 Fundus Internal
Hemelings et al57 CNN ResNet-50 7038 1011 346 25 42 Fundus Internal
Jiang et al59 CNN-JointRCNN 650 33 1467 163 13 Fundus External
Karkuzhali and Manimegalai60 Neural Networks 101 13 13 0 0 Fundus Internal
Kausu et al61 CNN 86 34 50 1 1 Fundus Internal
Kim et al62 CNN ResNet-152-M 2123 289 102 2 32 Fundus Internal
Kim et al63 CNN VGG-19 8988 157 55 0 8 OCT Internal
Kim et al64 Fully convolutional networks (FCN)-U-Net 699 NA NA NA NA Fundus Internal
Ko et al67 CNN VGGNet-16 944 86 86 5 4 Fundus Internal
Lee et al68 CNN NASNet- A 161 NA NA NA NA OCT Internal
Lee et al69 CNN NASNet-A 657 NA NA NA NA OCT Internal
Lee et al70 CNN NASNet-A 200 NA NA NA NA Fundus Internal
Li et al71 AG-CNN (Based on Residual networks) 11760 334 466 16 16 Fundus Internal
Li et al72 CNN Inception-v3 39745 1880 5550 483 87 Fundus Internal
Li et al73 CNN ResNet101 26,585 1405 1410 114 37 Fundus Internal
Liu et al74 CNN ResNet50 4364 13 13 2 2 Fundus External
Liu et al75 Generative Adversarial Nets (GANs)-(semi-supervised) 400 7 86 4 3 Fundus Internal
Liu et al76 CNN ResNet-(GD-CNN) 274413 2786 25085 588 110 Fundus Internal
Martins et al79 CNN-GFI-C MobileNetV2 (GFI-C) 2482 600 1563 213 106 Fundus Internal
Medeiros et al80 CNN ResNet34 32,820 12868 13227 3307 4064 Fundus Internal
Medeiros et al81 CNN ResNet50 86123 1450 4165 219 458 Fundus Internal
Norouzifard et al84 CNN InceptionResNet-V2 477 14 14 1 1 Fundus External
Patil et al86 Stacked Auto-Encoder (SAE)-GlaucoNet (Glaucoma Detection and Classification) Binary cross-entropy (BCE) 210 70 29 2 0 Fundus Internal
Phene et al87 CNN Inception-v3 88126 NA NA NA NA Fundus External
Abbas88 CNN-Glaucoma-Deep/softmax 1200 NA NA NA NA Fundus Internal
Raghavendra et al89 Sparse Autoencoder (Unsupervised learning) 1426 239 169 8 12 Fundus Internal
Ran et al92 CNN ResNet-34 4877 2604 1873 78 322 OCT Internal
Rogers et al95 CNN ResNet-50 129695 39 40 6 9 Fundus External
Serener et al98 CNN GoogLeNet 1542 2 116 2 10 Fundus Internal
Sharma et al99 CNN 4-layers 541 20 18 2 0 Fundus Internal
Thompson et al102 CNN ResNet34 20806 2611 2321 122 1286 OCT Internal
Ting et al103 CNN VGGNet 197 085 NA NA NA NA Fundus Internal
Touahri et al104 CNN 3-layer CNN 300 NA NA NA NA Fundus Internal
Wang et al105 CNN ResNet-18 221 NA NA NA NA OCT Internal
Yang et al106 CNN ResNet-50 3815 98 2495 5 1 Fundus Internal
Zapata et al107 CNN ResNet-50 (modified) 4,720 368 397 76 104 Fundus Internal
Zheng et al108 CNN Inception V3—159 layers 1603 49 51 1 1 OCT Internal
Zilly et al109 CNN Ensemble learning 159 13 16 1 1 Fundus Internal
CNN indicates convolutional neural network; External, external data used for validation accuracy; FN, false negative; FP, false positive; Fundus, retinal fundus image; Internal, internal data used for validation accuracy; NA, not available; OCT, optical coherence tomography; TN, true negative; TP, true positive.

Risk of Bias and Applicability of Included Studies

We performed a quality assessment of included studies (n=66) using the QUADAS-2 tool (Supplementary Table 5, Supplemental Digital Content 1, https://links.lww.com/IJG/A602). The graphical presentation of the risk of bias and applicability concerns are displayed in Figure 2.

FIGURE 2: Graphical display of Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) results.

Publication Bias

Deeks’ funnel plot test was used to determine publication bias and illustrated asymmetry in the funnel plot (Supplementary Fig. 1, Supplemental Digital Content 1, https://links.lww.com/IJG/A602) with borderline significance (P-value=0.081, df=64, t=1.78). Publication bias is considered significant if P-value <0.10.110

Overall Meta-Analysis of the Diagnostic Performance of AI for Glaucoma Detection Through the Imaging Modalities

In total, 66 studies were analyzed to evaluate the AI models’ diagnostic performance through both fundus and OCT images. The overall pooled AUC, sensitivity, specificity, and DOR were 96.3%, 92.0% (95% CI: 89.0–94.0), 94.0% (95% CI: 92.0–95.0), and 108.88 (95% CI: 71.54–165.72), respectively. However, heterogeneity between studies was apparent in the SROC curve (a combined relationship between sensitivity and specificity at different thresholds) and the forest plot of DOR in Figures 3 and 4.
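
As a back-of-the-envelope consistency check (our arithmetic, not an output of the model), the DOR implied by the pooled sensitivity and specificity alone would be

$$\text{DOR}=\frac{0.92\times 0.94}{(1-0.92)\times(1-0.94)}=\frac{0.8648}{0.0048}\approx 180,$$

which differs from the pooled DOR of 108.88 because the DOR was pooled per-study on the log scale (DerSimonian-Laird) rather than derived from the pooled sensitivity and specificity; under marked between-study heterogeneity, the 2 summaries need not coincide.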

FIGURE 3: SROC curve for overall diagnostic performance of AI. AI indicates artificial intelligence; CI, confidence interval; SROC, summary receiver operating characteristic.
FIGURE 4: Forest plot for imaging modalities: DOR. CI indicates confidence interval; DOR, diagnostic odds ratio.

Subgroup Analysis

Diagnostic Performance on Imaging Modalities: Fundus and OCT images

To date, AI models have primarily been trained on fundus photographs (85% of studies) rather than OCT images. However, the overall pooled AUC on fundus and OCT images was similar at 96.2% and 96.0%, respectively. Moreover, when comparing monoscopic and stereoscopic fundus images, the AUC was 96.6% and 95.6%, respectively. The AUC dropped to 91.4% when using mixed fundus images (Table 4). No studies evaluated AI models trained on fundus photographs and OCT combined. The SROC curve and forest plots for differing imaging modalities are provided in Supplementary Figure 2.2 and Forest Plots 2.0 (Supplemental Digital Content 1, https://links.lww.com/IJG/A602).

TABLE 4 - Summary Estimates for Overall AI Performance In Glaucoma Detection and Subgroups Analysis
Glaucoma Detection Studies (N) Pooled AUC (95% CI) Pooled Sensitivity (95% CI) Pooled Specificity (95% CI) Pooled DOR (95% CI)
1.0. Overall DA of AI-models 66 96.3 92.0 [89.0–94.0] 94.0 [92.0–95.0] 108.88 [71.54–165.72]
Subgroup analysis
2.0. All fundus images 56 96.2 92.0 [89.0–94.0] 93.0 [91.0–95.0] 83.73 [55.18–127.05]
 2.1. Mono fundus images 39 96.6 93.0 [90.0–95.0] 94.0 [91.0–96.0] 100.21 [54.22–185.21]
 2.2. Stereo fundus images 7 95.6 90.0 [80.0–96.0] 95.0 [94.0–96.0] 136.67 [113.16–165.06]
 2.3. Mixed fundus images (mono + stereo) 10 91.4 87.0 [73.0–95.0] 91.0 [84.0–95.0] 19.80 [14.76–26.55]
3.0. OCT images 10 96.0 90.0 [83.0–94.0] 95.0 [95.0–96.0] 382.31 [184.28–793.18]
4.0. Applied techniques
 4.1. ML with segmentation 13 95.3 92.0 [85.0–96.0] 93.0 [88.0–97.0] 91.07 [35.58–233.14]
 4.2. ML with nonsegmentation 11 96.9 94.0 [89.0–96.0] 94.0 [89.0–97.0] 178.02 [57.06–555.33]
 4.3. DL with segmentation 12 95.4 91.0 [82.0–96.0] 93.0 [86.0–97.0] 30.85 [10.05–94.65]
 4.4. DL with nonsegmentation 30 96.5 91.0 [87.0–94.0] 94.0 [91.0–96.0] 155.99 [86.36–281.77]
5.0. Region of interest (ROI)
 5.1. Full ONH images 30 97.3 93.0 [90.0–96.0] 95.0 [92.0–96.0] 178.71 [103.77–307.77]
 5.2. OD 10 96.6 94.0 [86.0–97.0] 95.0 [86.0–98.0] 149.23 [24.32–915.71]
 5.3. OD and OC 18 94.0 91.0 [85.0–95.0] 91.0 [87.0–95.0] 36.16 [17.60–74.29]
 5.4. RNFL 8 91.3 81.0 [75.0–86.0] 93.0 [87.0–96.0] 147.93 [37.22–587.90]
6.0. Validation accuracy
 6.1. Internal data 57 96.7 93.0 [90.0–95.0] 94.0 [92.0–96.0] 137.27 [89.50–210.53]
 6.2. External data 9 89.4 83.0 [79.0–86.0] 88.0 [83.0–92.0] 24.29 [5.84–101.11]
7.0. Electronic databases
 7.1. Publicly available 22 94.5 93.0 [85.0–97.0] 92.0 [91.0–94.0] 42.48 [20.16–89.50]
 7.2. Private data 41 97.0 92.0 [89.0–94.0] 94.0 [92.0–96.0] 183.30 [108.51–309.62]
 7.3. Mixed data (private + public data) 3 88.6 82.0 [79.0–85.0] 83.0 [70.0–91.0] 44.02 [11.38–170.26]
AI indicates artificial intelligence; AUC, area under receiver operating characteristic curve; CI, confidence interval; DA, diagnostic accuracy; DL, Deep Learning; DOR, diagnostic odds ratio; Mixed, monoscopic and stereoscopic images; ML, Machine learning; Mono, monoscopic image; N, numbers of studies; OC, optic cup; OCT, optical coherence tomography; OD, optic disc; ONH, optic nerve head; RNFL, retinal nerve fiber layer; Stereo, stereoscopic image.

Diagnostic Performance on Applied Techniques: Segmentation and Non–segmentation-based

Nonsegmentation techniques with DL or ML approaches had superior results, with an overall pooled AUC of ≥96.5% (Table 4). However, DL with segmentation-based techniques yielded the lowest diagnostic performance, with AUC, sensitivity, and specificity of 95.4%, 91.0% (95% CI: 82.0–96.0), and 93.0% (95% CI: 86.0–97.0), respectively. The SROC curve and forest plots for the applied techniques are shown in Supplementary Figure 2.3 and Forest Plots 3.0 (Supplemental Digital Content 1, https://links.lww.com/IJG/A602).

Diagnostic Performance on the ROI Selection

We determined that AI algorithms’ highest diagnostic performance was on full ONH images, with AUC, sensitivity, and specificity of 97.3%, 93.0% (95% CI: 90.0–96.0), and 95.0% (95% CI: 92.0–96.0), respectively. The lowest performance was on RNFL selection, with AUC, sensitivity, and specificity of 91.3%, 81.0% (95% CI: 75.0–86.0), and 93.0% (95% CI: 87.0–96.0), respectively. The SROC curve and forest plots for the ROI selections (full ONH image, OD, OD and OC, and RNFL) are displayed in Supplementary Figure 2.4 and Forest Plots 4.0 (Supplemental Digital Content 1, https://links.lww.com/IJG/A602).

Validation Accuracy on Internal and External Data

The overall AI diagnostic performance was higher when using internal data for validation: AUC 96.7%, sensitivity 93.0% (95% CI: 90.0–95.0), and specificity 94.0% (95% CI: 92.0–96.0). However, validation on external data revealed a lower AUC of 89.4%, sensitivity of 83.0% (95% CI: 79.0–86.0), and specificity of 88.0% (95% CI: 83.0–92.0). The SROC curve and forest plots for internal and external data validation accuracy are outlined in Supplementary Figure 2.5 and Forest Plots 5.0 (Supplemental Digital Content 1, https://links.lww.com/IJG/A602).

Diagnostic Performance of AI Models on Data Sets Availability

Most of the studies (62%) utilized private data sets, with only 3 studies using a combination of private and publicly available data sets. We found that AI models performed most accurately on private data sets: AUC 97.0%, sensitivity 92.0% (95% CI: 89.0–94.0), and specificity 94.0% (95% CI: 92.0–96.0). However, the AI models’ diagnostic performance dropped with mixed data sets, producing an AUC of 88.6% (Table 4). The SROC curve and forest plots for the data sets are shown in Supplementary Figure 2.6 and Forest Plots 6.0 (Supplemental Digital Content 1, https://links.lww.com/IJG/A602).

DISCUSSION

We conducted a systematic review and meta-analysis of studies that developed various AI algorithms for glaucoma detection. The use of imaging technology and the application of AI in ophthalmology has increased over the last decade, with impressive diagnostic accomplishments using both OCT and fundus photography. However, these models are not yet clinically implemented for glaucoma detection. We examined and ascertained the factors affecting the diagnostic performance of AI models to highlight how to expand this growing technology into glaucoma screening and clinical practice.

This analysis demonstrates that AI algorithms perform well for diagnosing glaucoma, with both fundus and OCT imaging modalities having similar accuracy (96.2% and 96.0% AUC, respectively). Murtagh and colleagues reported similar results in a previous systematic review, but with a lower AUC (92.3%) on OCT images.111 In addition, we evaluated performance on monoscopic and stereoscopic fundus images and found no significant difference in diagnostic performance, as measured by the variation in AUC. As expected, AI models performed better on homogeneous data sets than on mixed modalities (eg, monoscopic and stereoscopic images). In clinical practice, multiple instruments are used for fundus photography. Therefore, AI models should be trained using mixed modality data sets to improve generalizability before being applied in a clinical setting.

ONH assessment is important in detecting early glaucoma.112 Two independent techniques were applied for automated quantification of the cup-to-disc ratio (CDR): segmentation-based and non–segmentation-based techniques. Our meta-analysis showed superior diagnostic performance using non–segmentation-based approaches. Second, we found that conventional ML approaches with non–segmentation-based techniques had the highest overall diagnostic performance, and DL with segmentation techniques generally had the lowest performance (Table 4). Accordingly, we recommend using nonsegmentation techniques for image-based glaucoma diagnosis, and selecting the full ONH image as the model input to enhance the accuracy of glaucoma detection.

External validation is one of the most crucial factors in producing robust AI models and is critical to the translation of AI into clinical practice. Most authors validated their AI models on internal data and reported unrealistically excellent diagnostic performance (Tables 2, 3); such models may be unsuitable for clinical implementation. We recommend evaluating AI models on unseen heterogeneous data (external data validation). If a model achieves acceptable levels of diagnostic performance under those conditions, it can potentially be deployed into clinical practice.

The data set ground truth (healthy and glaucomatous) was most often established using a combination of CDR, OCT, and Humphrey visual field testing. However, a potential source of bias between these studies exists in that the protocols for establishing the ground truth for glaucoma diagnosis varied between data sets. For example, Salam and colleagues graded an image as glaucomatous if the CDR was >0.5, whereas Li and colleagues labeled confirmed glaucoma if the CDR was ≥0.9, the disc damage likelihood scale was ≤0.05, or an RNFL defect corresponded to narrowing of the rim or localized notches.73,96 Some researchers classified healthy versus glaucoma based on CDR alone.72,73,76,96,113 Furthermore, ~12% of manuscripts did not describe the labeling criteria.31–33,41,58,91,97,100 As such, a consistent and established protocol for labeling data sets is needed to ensure that an appropriate ground truth (gold standard) is maintained before training algorithms for clinical implementation.
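
As a toy illustration of why this matters, the R snippet below applies 2 of the grading rules reported above to the same hypothetical eyes; the CDR values are invented for illustration, and the Li rule is reduced to its CDR component only:

```r
# Hypothetical illustration: two published CDR grading rules disagree on the
# same eyes (CDR values are invented; thresholds are those reported above).
label_salam <- function(cdr) ifelse(cdr > 0.5, "glaucoma", "healthy")   # Salam et al96
label_li    <- function(cdr) ifelse(cdr >= 0.9, "glaucoma", "healthy")  # Li et al73 (CDR rule only)
cdr <- c(0.45, 0.60, 0.85, 0.95)
data.frame(cdr, salam = label_salam(cdr), li = label_li(cdr))
# Eyes with CDR between 0.5 and 0.9 receive opposite ground-truth labels, so
# models trained on the 2 data sets learn different decision boundaries.
```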

The input image dimensions varied considerably across studies because numerous imaging instruments and models are installed in ophthalmic clinics; the machines used are displayed in Table 1. Images were generally downsized before training, most frequently to a final dimension of 224×224 pixels across 3 color channels (Table 1). Training heavily parameterized models on such resized images would probably lead to overestimated performance.

We identified a total of 79 studies for qualitative synthesis, comprising private data sets (49), publicly available data sets (26), and mixed (public and private) data sets (4). The data were predominantly contributed from India (32%) and Spain (24%), followed by the United States (13%), Korea (11%), and others (Table 1). For most publicly available data sets, it was unclear what protocols were used to grade the images as either healthy or glaucomatous. Many of these publicly available data sets were used in multiple studies. For instance, 2 studies used the same data set (EODAT), which contributed approximately a third of all fundus images in our meta-analysis,34,95 introducing some bias. We also observed that the ratio of glaucomatous to normal images ranged from 1:10 to 2:1. This variation also introduces bias, because model performance varies with the relative balance of the training and validation data sets [eg, overperformance for the majority class (healthy) and underperformance for the minority class (glaucomatous)].114
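
The effect of class imbalance on headline metrics is easy to demonstrate with hypothetical counts at the 1:10 extreme noted above; a degenerate classifier that labels every image healthy still posts high accuracy:

```r
# Hypothetical: 1:10 glaucoma-to-healthy ratio; classifier predicts "healthy"
# for every image. Accuracy looks respectable while sensitivity is zero.
n_glaucoma <- 100; n_healthy <- 1000
tp <- 0; fn <- n_glaucoma; fp <- 0; tn <- n_healthy
accuracy    <- (tp + tn) / (n_glaucoma + n_healthy)  # ~0.91
sensitivity <- tp / (tp + fn)                        # 0
c(accuracy = accuracy, sensitivity = sensitivity)
```

Balanced measures such as the AUC and paired sensitivity/specificity are therefore more informative than raw accuracy on imbalanced data sets.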

The most commonly used traditional ML and DL classifiers were support vector machines and Convolutional Neural Networks, respectively. Most authors trained the traditional ML algorithms using K-fold cross-validation techniques to reduce the risk of overfitting and overcome data insufficiency issues. The Convolutional Neural Network–based architectures were pretrained on a large data set (over a million images with 1000 categories) of the ImageNet challenge and were then fine-tuned using transfer learning.115–118 Such architectures include VGG, Inception, and ResNet, all of which have achieved groundbreaking performance on the ImageNet challenge. The ResNet-152 achieved a 3.57% error rate on ImageNet test data119 and was the most frequently used DL architecture for transfer learning with high diagnostic performance in glaucoma.34,42,44,45,53,57,62,73,74,76,80,81,92,95,102,105,106
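
For concreteness, the snippet below is a minimal fine-tuning sketch of this recipe in R using the keras package; the 224×224×3 input mirrors the most common dimension in Table 1, while the optimizer, head design, and freezing strategy are illustrative assumptions rather than any included study's code:

```r
library(keras)

# Load ResNet-50 pretrained on ImageNet, without its classification head
base <- application_resnet50(weights = "imagenet", include_top = FALSE,
                             input_shape = c(224, 224, 3))
freeze_weights(base)  # keep ImageNet features fixed during initial training

# Attach a binary head: healthy vs glaucomatous
out <- base$output %>%
  layer_global_average_pooling_2d() %>%
  layer_dense(units = 1, activation = "sigmoid")
model <- keras_model(inputs = base$input, outputs = out)

model %>% compile(optimizer = "adam",
                  loss = "binary_crossentropy",
                  metrics = "accuracy")
# model %>% fit(...) on labeled fundus images would follow, ideally with
# final evaluation on an external cohort from different sites and cameras.
```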

An important limitation of our work relates to the fact that ~10% of studies showed a high risk of bias in patient selection and selection of reference standards. Aside from the stereoscopic fundus image analysis, we also identified a high degree of heterogeneity in our subgroup analyses. We excluded studies that reported model performance in nonstandard metrics (eg, the Jaccard index), and as such, it is possible that some important models were missed.

POAG represents a continuum of disease, ranging from “clinical suspects” through early manifest disease to advanced glaucoma. Rates of disease progression vary between patients, and ongoing efforts could apply AI to the prediction of likely disease trajectory. The stage of disease progression is an important issue in POAG assessment and presents a key challenge in applying AI to POAG. Future studies should focus on the optimal imaging modality and highest yield features for subclassifying patients into glaucoma stages and disease trajectories to optimize future management.

Although AI has the potential to revolutionize glaucoma practice through imaging modalities, this meta-analysis highlights that a number of issues need to be addressed before such algorithms can be incorporated into clinical care. Amid substantial heterogeneity across studies, many factors were found to affect diagnostic performance, including the reference standard, the instrument used for imaging, data set selection, image dimensions, ML classifier, and use of transfer learning. We recommend implementing a standard diagnostic protocol for grading, performing external data validation, and analyzing performance across different ethnic groups.

REFERENCES

1. Barkana Y, Dorairaj S. Re: Tham et al: Global prevalence of glaucoma and projections of glaucoma burden through 2040: a systematic review and meta-analysis (Ophthalmology 2014;121:2081–90). Ophthalmology. 2015;122:e40–e41.
2. Kapetanakis VV, Chan MPY, Foster PJ, et al. Global variations and time trends in the prevalence of primary open angle glaucoma (POAG): a systematic review and meta-analysis. Br J Ophthalmol. 2016;100:86.
3. Quigley HA, Broman AT. The number of people with glaucoma worldwide in 2010 and 2020. Br J Ophthalmol. 2006;90:262–267.
4. World Population Prospects—Population Division—United Nations. 2020. Available at: https://population.un.org/wpp/Download/Probabilistic/Population/. Accessed September 16, 2020.
5. Sun J, Zhou X, Kang Y, et al. Prevalence and risk factors for primary open-angle glaucoma in a rural northeast China population: a population-based survey in Bin County, Harbin. Eye. 2012;26:283–291.
6. Yamamoto S, Sawaguchi S, Iwase A, et al. Primary open-angle glaucoma in a population associated with high prevalence of primary angle-closure glaucoma: the Kumejima Study. Ophthalmology. 2014;121:1558–1565.
7. Lee PP, Walt JG, Doyle JJ, et al. A multicenter, retrospective pilot study of resource use and costs associated with severity of disease in glaucoma. Arch Ophthalmol. 2006;124:12–19.
8. Rein DB, Zhang P, Wirth KE, et al. The economic burden of major adult visual disorders in the United States. Arch Ophthalmol. 2006;124:1754–1760.
9. Cases and costs of glaucoma projected to soar. 2021. Available at: https://preventblindness.org/cases-and-costs-of-glaucoma-projected-to-soar-2/. Accessed March 2, 2021.
10. Myers JS, Fudemberg SJ, Lee D. Evolution of optic nerve photography for glaucoma screening: a review. Clin Experiment Ophthalmol. 2018;46:169–176.
11. Zhang X, Dastiridou A, Francis BA, et al. Comparison of glaucoma progression detection by optical coherence tomography and visual field. Am J Ophthalmol. 2017;184:63–74.
12. Katie A, Lucy GW. Structural and functional evaluations for the early detection of glaucoma. Expert Rev Ophthalmol. 2016;11:367.
13. Burr JM, Mowatt G, Hernández R, et al. The clinical effectiveness and cost-effectiveness of screening for open angle glaucoma: a systematic review and economic evaluation. Health Technol Assess. 2007;11:41.
14. Burr J, Hernández R, Ramsay C, et al. Is it worthwhile to conduct a randomized controlled trial of glaucoma screening in the United Kingdom? J Health Serv Res Policy. 2014;19:42–51.
15. Ahmad SS. Glaucoma suspects: a practical approach. Taiwan J Ophthalmol. 2018;8:74.
16. Thompson AC, Jammal AA, Medeiros FA. A review of deep learning for screening, diagnosis, and detection of glaucoma progression. Transl Vis Sci Technol. 2020;9:42.
17. McInnes MDF, Moher D, Thombs BD, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: the PRISMA-DTA Statement. JAMA. 2018;319:388–396.
18. RevMan Calculator. 2020. Available at: https://training.cochrane.org/resource/revman-calculator. Accessed November 26, 2020.
19. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529–536.
20. University of Bristol. QUADAS. 2010. Available at: http://www.bristol.ac.uk/population-health-sciences/projects/quadas/. Accessed November 26, 2020.
21. The Comprehensive R Archive Network. 2020. Available at: https://cran.r-project.org/. Accessed November 27, 2020.
22. Mada: Meta-Analysis of Diagnostic Accuracy. 2020. Available at: https://CRAN.R-project.org/package=mada. Accessed November 27, 2020.
23. Download the R-4.0.4 Patched build for Windows. The R-project for statistical computing. 2021. Available at: https://cran.rstudio.com/bin/windows/base/rtest.html. Accessed February 23, 2021.
24. Welcome. 2020. Available at: https://methods.cochrane.org/sdt/. Accessed November 27, 2020.
25. Trikalinos TA, Balion CM, Coleman CI, et al. Chapter 8: meta-analysis of test performance when there is a “Gold Standard”. J Gen Intern Med. 2012;27(suppl 1):56.
26. Lee J, Kim KW, Choi SH, et al. Systematic review and meta-analysis of studies evaluating diagnostic test accuracy: a practical review for clinical researchers-part II. Statistical methods of meta-analysis. Korean J Radiol. 2015;16:1188.
27. Glas AS, Lijmer JG, Prins MH, et al. The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol. 2003;56:1129–1135.
28. Harbord RH, Whiting P, Sterne JAC, et al. An empirical comparison of methods for meta-analysis of diagnostic accuracy showed hierarchical models are necessary. J Clin Epidemiol. 2008;61:1095–1103.
29. 9.5.2 Identifying and measuring heterogeneity. 2011. Available at: https://handbook-5-1.cochrane.org/chapter_9/9_5_2_identifying_and_measuring_heterogeneity.htm. Accessed February 23, 2021.
30. van Enst WA, Ochodo E, Scholten RJ, et al. Investigation of publication bias in meta-analyses of diagnostic test accuracy: a meta-epidemiological study. BMC Med Res Methodol. 2014;14:70.
31. Abidi SSR, Roy PC, Shah MS, et al. A data mining framework for glaucoma decision support based on optic nerve image analysis using machine learning methods. J Healthc Inform Res. 2018;2:370–401.
32. Acharya UR, Bhat S, Koh JEW, et al. A novel algorithm to detect glaucoma risk using texton and local configuration pattern features extracted from fundus images. Comput Biol Med. 2017;88:72–83.
33. Acharya UR, Ng EYK, JieEugene LW, et al. Decision support system for the glaucoma using Gabor transformation. Biomed Signal Process Control. 2015;15:18–26.
34. Al-Aswad LA, Kapoor R, Chu CK, et al. Evaluation of a deep learning system for identifying glaucomatous optic neuropathy based on color fundus photographs. J Glaucoma. 2019;28:1029–1034.
35. Al-Bander B, Al-Nuaimy W, Al-Taee MA, et al. Automated glaucoma diagnosis using deep learning approach—IEEE Conference Publication. 2017. Available at: https://ieeexplore.ieee.org/abstract/document/8166974. Accessed January 6, 2021.
36. Asaoka R, Murata H, Hirasawa K, et al. Using deep learning and transfer learning to accurately diagnose early-onset glaucoma from macular optical coherence tomography images. Am J Ophthalmol. 2019;198:136–145.
37. Bajwa MN, Malik MI, Siddiqui SA, et al. Two-stage framework for optic disc localization and glaucoma classification in retinal fundus images using deep learning. BMC Med Inform Decis Mak. 2019;19:1–16.
38. Balasubramanian K, Ananthamoorthy NP, Gayathridevi K. Automatic diagnosis and classification of glaucoma using hybrid features and k-nearest neighbor. J Med Imaging Health Inform. 2018;8:8.
39. Bock R, Meier J, Nyúl LG, et al. Glaucoma risk index: automated glaucoma detection from color fundus images. Med Image Anal. 2010;14:3.
40. Chakrabarty L, Joshi GD, Chakravarty A, et al. Automated detection of glaucoma from topographic features of the optic nerve head in color fundus photographs. J Glaucoma. 2016;25:590–597.
41. Chan YM, Ng EYK, Jahmunah V, et al. Automated detection of glaucoma using optical coherence tomography angiogram images. Comput Biol Med. 2019;115:103483.
42. Chang J, Lee J, Ha A, et al. Explaining the rationale of deep learning glaucoma decisions with adversarial examples. Ophthalmology. 2021;128:78–88.
43. Cheriguene S, Azizi N, Djellali H, et al. New computer aided diagnosis system for glaucoma disease based on twin support vector machine—IEEE Conference Publication. 2018. Available at: https://ieeexplore.ieee.org/document/8284039. Accessed January 6, 2021.
44. Christopher M, Nakahara K, Bowd C, et al. Effects of study population, labeling and training on glaucoma detection using deep learning algorithms. Transl Vis Sci Technol. 2020;9:27.
45. Christopher M, Belghith A, Bowd C, et al. Performance of deep learning architectures and transfer learning for detecting glaucomatous optic neuropathy in fundus photographs. Sci Rep. 2018;8:16685.
46. Civit-Masot J, Dominguez-Morales J, Vicente-Diaz S, et al. Dual machine-learning system to aid glaucoma diagnosis using disc and cup feature extraction. 2020. Available at: https://ieeexplore.ieee.org/abstract/document/9138371. Accessed January 6, 2021.
47. Diaz-Pinto A, Morales S, Naranjo V, et al. CNNs for automatic glaucoma assessment using fundus images: an extensive validation. Biomed Eng Online. 2019;18:29.
48. Diaz-Pinto A, Colomer A, Naranjo V, et al. Retinal image synthesis and semi-supervised learning for glaucoma assessment—IEEE Journals & Magazine. 2019. Available at: https://ieeexplore.ieee.org/document/8662628. Accessed January 6, 2021.
49. Elakkiya B, Saraniya O. A comparative analysis of pretrained and transfer-learning model for automatic diagnosis of glaucoma—IEEE Conference Publication. 2020. Available at: https://ieeexplore.ieee.org/document/9087297. Accessed January 6, 2021.
50. Elseid G, Hamza AA, Alnazier O, et al. 2019. Available at: https://insights.ovid.com/clinical-engineering/jceng/2019/10/000/glaucoma-detection-using-retinal-nerve-fiber-layer/15/00004669. Accessed January 6, 2021.
51. Ferreira MVdS, Filho AOdC, Sousa ADd, et al. Convolutional neural network and texture descriptor-based automatic detection and diagnosis of glaucoma. Expert Syst Appl. 2018;110:250–263.
52. Fu H, Li F, Xu Y, et al. A retrospective comparison of deep learning to manual annotations for optic disc and optic cup segmentation in fundus photographs. Transl Vis Sci Technol. 2020;9:33.
53. Fu H, Cheng J, Xu Y, et al. Disc-aware ensemble network for glaucoma screening from fundus image. IEEE Trans Med Imaging. 2018;37:2493–2501.
54. Gaddipati DJ, Desai A, Sivaswamy J, et al. Glaucoma assessment from OCT images using capsule network. Conf Proc IEEE Eng Med Biol Soc. 2019:5581–5584.
55. Gómez-Valverde JJ, Antón A, Fatti G, et al. Automatic glaucoma classification using color fundus images based on convolutional neural networks and transfer learning. Biomed Opt Express. 2019;10:892–913.
56. Haleem MS, Han L, Jv H, et al. Regional image features model for automatic classification between normal and glaucoma in fundus and scanning laser ophthalmoscopy (SLO) images. J Med Syst. 2016;40:132.
57. Hemelings R, Elen B, Barbosa-Breda J, et al. Accurate prediction of glaucoma from colour fundus images with a convolutional neural network that relies on active and transfer learning. Acta Ophthalmol. 2019;98:e94–e100.
58. Issac A, Sarathi MP, Dutta MK. An adaptive threshold based image processing technique for improved glaucoma detection and classification. Comput Methods Programs Biomed. 2015;122:229–244.
59. Jiang Y, Duan L, Cheng J, et al. JointRCNN: a region-based convolutional neural network for optic disc and cup segmentation. IEEE Trans Biomed Eng. 2020;67:335–343.
60. Karkuzhali S, Manimegalai D. Computational intelligence-based decision support system for glaucoma detection. Biomed Res. 2017;28:11.
61. Kausu TR, Gopi VP, Wahid K, et al. Combination of clinical and multiresolution features for glaucoma detection and its classification using fundus images. Biocybern Biomed Eng. 2018;38:329–341.
62. Kim M, Han JC, Hyun SH, et al. Medinoid: computer-aided diagnosis and localization of glaucoma using deep learning. NATO Adv Sci Inst Ser E Appl Sci. 2019;9:3064.
63. Kim KE, Kim JM, Song JE, et al. Development and validation of a deep learning system for diagnosing glaucoma using optical coherence tomography. J Clin Med Res. 2020;9:7.
64. Kim J, Tran L, Chew EY, et al. Optic disc and cup segmentation for glaucoma characterization using deep learning—IEEE Conference Publication. 2019. Available at: https://ieeexplore.ieee.org/document/8787505. Accessed January 6, 2021.
65. IEEE Xplore. 2016. Available at: https://ieeexplore.ieee.org/document/8574084. Accessed January 6, 2021.
66. Kishore B, Ananthamoorthy NP. Glaucoma classification based on intra-class and extra-class discriminative correlation and consensus ensemble classifier. Genomics. 2020;112:5.
67. Ko Y-C, Wey S-Y, Chen W-T, et al. Deep learning assisted detection of glaucomatous optic neuropathy and potential designs for a generalizable model. PLoS One. 2020;15:5.
68. Lee J, Kim JS, Lee HJ, et al. Discriminating glaucomatous and compressive optic neuropathy on spectral-domain optical coherence tomography with deep learning classifier. Br J Ophthalmol. 2020;104:12.
69. Lee J, Kim YK, Park KH, et al. Diagnosing glaucoma with spectral-domain optical coherence tomography using deep learning classifier. J Glaucoma. 2020;29:4.
70. Lee J, Kim Y, Kim JH, et al. Screening glaucoma with red-free fundus photography using deep learning classifier and polar transformation. J Glaucoma. 2019;28:3.
71. Li L, Xu M, Liu H, et al. A large-scale database and a CNN model for attention-based glaucoma detection—IEEE Journals & Magazine. 2019. Available at: https://ieeexplore.ieee.org/abstract/document/8756196. Accessed January 6, 2021.
72. Li Z, He Y, Keel S, et al. Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology. 2018;125:1199–1206.
73. Li F, Yan L, Wang Y, et al. Deep learning-based automated detection of glaucomatous optic neuropathy on color fundus photographs. Graefes Arch Clin Exp Ophthalmol. 2020;258:4.
74. Liu S, Graham SL, Schulz A, et al. A deep learning-based algorithm identifies glaucomatous discs using monoscopic fundus photographs. Ophthalmol Glaucoma. 2018;1:1.
75. Liu S, Hong J, Lu X, et al. Joint optic disc and cup segmentation using semi-supervised conditional GANs. Comput Biol Med. 2019;115:103485.
76. Liu H, Li L, Wormstone IM, et al. Development and validation of a deep learning system to detect glaucomatous optic neuropathy using fundus photographs. JAMA Ophthalmol. 2019;137:1353–1360.
77. MacCormick IJC, Williams BM, Zheng Y, et al. Accurate, fast, data efficient and interpretable glaucoma diagnosis with automated spatial analysis of the whole cup to disc profile. PLoS One. 2019;14:1.
78. Maheshwari S, Kanhangad V, Pachori RB, et al. Automated glaucoma diagnosis using bit-plane slicing and local binary pattern techniques. Comput Biol Med. 2019;105:72–80.
79. Martins J, Cardoso JS, Soares F. Offline computer-aided diagnosis for glaucoma detection using fundus images targeted at mobile devices. Comput Methods Programs Biomed. 2020;192:105341.
80. Medeiros FA, Jammal AA, Thompson AC. From machine to machine: an OCT-trained deep learning algorithm for objective quantification of glaucomatous damage in fundus photographs. Ophthalmology. 2019;126:513–521.
81. Medeiros FA, Jammal AA, Mariottoni EB, et al. Detection of progressive glaucomatous optic nerve damage on fundus photographs with deep learning. Ophthalmology. 2020;128:383–392.
82. Mukherjee R, Kundu S, Dutta K, et al. Predictive diagnosis of glaucoma based on analysis of focal notching along the neuro-retinal rim using machine learning. Pattern Recognit Image Anal. 2019;29:523–532.
83. Mvoulana A, Kachouri R, Akil M. Fully automated method for glaucoma screening using robust optic nerve head detection and unsupervised segmentation based cup-to-disc ratio computation in retinal fundus images. Comput Med Imaging Graph. 2019;77:101643.
84. Norouzifard M, Nemati A, Hosseini HG, et al. Automated glaucoma diagnosis using deep and transfer learning: proposal of a system for clinical testing. 2019. Available at: https://ieeexplore.ieee.org/document/8634671. Accessed January 6, 2021.
85. Oh JE, Yang HK, Kim KG, et al. Automatic computer-aided diagnosis of retinal nerve fiber layer defects using fundus photographs in optic neuropathy. Invest Ophthalmol Vis Sci. 2015;56:5.
86. Patil N, Rao PV. 2019. Available at: https://www.ijeat.org/wp-content/uploads/papers/v9i1/A2960109119.pdf. Accessed January 6, 2021.
87. Phene S, Dunn RC, Hammel N, et al. Deep learning and glaucoma specialists: the relative importance of optic disc features to predict glaucoma referral in fundus photographs. Ophthalmology. 2019;126:1627–1639.
88. Abbas Q. Glaucoma-deep: detection of glaucoma eye disease on retinal fundus images using deep learning. Int J Adv Comput Sci Appl (IJACSA). 2017;8:6.
89. Raghavendra U, Gudigar A, Bhandary SV, et al. A two layer sparse autoencoder for glaucoma identification with fundus images. J Med Syst. 2019;43:1–9.
90. Raja C, Gangatharan N. A hybrid swarm algorithm for optimizing glaucoma diagnosis. Comput Biol Med. 2015;63:196–207.
91. Rajan A, Ramesh GP. Automated early detection of glaucoma in wavelet domain using optical coherence tomography images. Biomed Pharmacol J. 2015;8:641–649.
92. Ran AR, Cheung CY, Wang X, et al. Detection of glaucomatous optic neuropathy with spectral-domain optical coherence tomography: a retrospective training and validation deep-learning analysis. Lancet Digital Health. 2019;1:e172–e182.
93. Rao MN, Rao MVG. 2016. Available at: https://www.iioab.org/articles/IIOABJ_7.9_812-824.pdf. Accessed January 6, 2021.
94. Renukalatha S, Suresh KV. 2019. Available at: https://www.worldscientific.com/doi/abs/10.4015/S101623721950039X. Accessed January 6, 2021.
95. Rogers TW, Jaccard N, Carbonaro F, et al. Evaluation of an AI system for the automated detection of glaucoma from stereoscopic optic disc photographs: the European Optic Disc Assessment Study. Eye. 2019;33:1791–1797.
96. Salam AA, Khalil T, Akram MU, et al. Automated detection of glaucoma using structural and non structural features. Springerplus. 2016;5:1519.
97. Sathiya KG, Srinivasan S, Sivakumaran TS. Decision support system for glaucoma diagnosis using optical coherence tomography images. Res J Pharm Technol. 2018;11:1860–1866.
98. Serener A, Serte S. Transfer learning for early and advanced glaucoma detection with convolutional neural networks. 2019. Available at: https://ieeexplore.ieee.org/abstract/document/8894965. Accessed January 6, 2021.
99. Sharma A, Aggarwal M, Roy SD, et al. Automatic glaucoma diagnosis in digital fundus images using convolutional neural network. 2020. Available at: https://ieeexplore.ieee.org/document/8988512. Accessed January 6, 2021.
100. Singh A, Dutta MK, Sarathi MP, et al. Image processing based automatic diagnosis of glaucoma using wavelet features of segmented optic disc from fundus image. Comput Methods Programs Biomed. 2016;124:108–120.
101. Soorya M, Issac A, Dutta MK. Automated framework for screening of glaucoma through cloud computing. J Med Syst. 2019;43:5.
102. Thompson AC, Jammal AA, Berchuck SI, et al. Assessment of a segmentation-free deep learning algorithm for diagnosing glaucoma from optical coherence tomography scans. JAMA Ophthalmol. 2020;138:333–339.
103. Ting DSW, Cheung CY, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017;318:22.
104. Touhari R, Azizi N, Benzebouchi NE, et al. A comparative study of convolutional neural network and twin SVM for automatic glaucoma diagnosis. 2019. Available at: https://ieeexplore.ieee.org/document/8661076. Accessed January 6, 2021.
105. Wang P, Shen J, Chang R, et al. Machine learning models for diagnosing glaucoma from retinal nerve fiber layer thickness maps. Ophthalmol Glaucoma. 2019;2:6.
106. Yang HK, Kim YJ, Sung JY, et al. Efficacy for differentiating nonglaucomatous versus glaucomatous optic neuropathy using deep learning systems. Am J Ophthalmol. 2020;216:140–146.
107. Zapata MA, Royo-Fibla D, Font O, et al. Artificial intelligence to identify retinal fundus images, quality validation, laterality evaluation, macular degeneration, and suspected glaucoma. Clin Ophthalmol. 2020;14:419.
108. Zheng C, Xie X, Huang L, et al. Detecting glaucoma based on spectral domain optical coherence tomography imaging of peripapillary retinal nerve fiber layer: a comparison study between hand-crafted features and deep learning model. Graefes Arch Clin Exp Ophthalmol. 2019;258:577–585.
109. Zilly J, Buhmann JM, Mahapatra D. Glaucoma detection using entropy sampling and ensemble learning for automatic optic cup and disc segmentation. Comput Med Imaging Graph. 2017;55:28–41.
110. Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol. 2005;58:9.
111. Murtagh P, Greene G, O’Brien C. Current applications of machine learning in the screening and diagnosis of glaucoma: a systematic review and meta-analysis. Int J Ophthalmol. 2020;13:149.
112. Nicolela MT, Vianna JR. Optic nerve: clinical examination. In: Pearls of Glaucoma Management. Berlin, Heidelberg: Springer; 2016:17–26.
113. Norouzifard M, Nemati A, Hosseini HG, et al. Automated glaucoma diagnosis using deep and transfer learning: proposal of a system for clinical testing. 2019. Available at: https://ieeexplore.ieee.org/document/8634671. Accessed January 16, 2021.
114. Johnson JM, Khoshgoftaar TM. Survey on deep learning with class imbalance. J Big Data. 2019;6:1–54.
115. He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. 2015. Available at: https://arxiv.org/abs/1512.03385. Accessed July 16, 2020.
116. ImageNet. Available at: http://image-net.org/. Accessed July 16, 2020.
117. Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions. 2014. Available at: http://arxiv.org/abs/1409.4842. Accessed July 16, 2020.
118. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014. Available at: http://arxiv.org/abs/1409.1556. Accessed July 16, 2020.
119. Zhao B, Feng J, Wu X, et al. A survey on deep learning-based fine-grained object classification and semantic segmentation. Int J Autom Comput. 2017;14:119–135.
Keywords: artificial intelligence; deep learning; machine learning; glaucoma; meta-analysis


Copyright © 2022 The Author(s). Published by Wolters Kluwer Health, Inc.