Glaucoma, a leading cause of irreversible blindness worldwide, necessitates accurate and timely detection for efficacious treatment.1–3 Deep learning (DL) models have recently become powerful tools in glaucoma research, using intricate algorithms to extract valuable information from medical images.4 The application of DL models in glaucoma screening, diagnosis, and classification has yielded promising results.5 Nevertheless, these models present a key challenge: their opacity, or the “black box” phenomenon, which obscures the processes underlying final predictions. This opacity raises substantial concerns about the models’ interpretability and potential embedded biases, which may affect the reliability of the outcomes and, in turn, the quality of patient care. Given these challenges, visualization techniques have emerged as an influential tool to augment the interpretability of these intricate DL models. These techniques provide a visual representation of the model’s decision-making process, enabling users to comprehend how inputs are converted into outputs. By visually mapping these transformations, it is possible to discern the factors that the model deems most significant in generating predictions, potentially aiding in bias identification.
This issue of the Asia-Pacific Journal of Ophthalmology features a narrative review by Gu et al6 that presents a comprehensive overview of the types of data and visualization techniques used to illustrate predictions made by DL models. The article delves into novel strategies to improve the interpretation of predictions made by artificial intelligence (AI) models, approaches that align model findings with the viewpoints of clinical practitioners. This alignment is critical, as understanding AI model predictions from a practitioner’s perspective ensures that these predictions are applicable and beneficial in real-world clinical contexts. By incorporating the practical insights of clinicians, these strategies foster a more effective integration of AI technology into health care practice.
During model development, visualization techniques for imaging data fall into 3 primary categories: gradient-based, perturbation-based, and attention-based techniques. Of these, gradient-based techniques, such as saliency maps, gradient-weighted class activation mapping (Grad-CAM), integrated gradients, and layer-wise relevance propagation, are most frequently employed in ophthalmological studies for detecting and screening diabetic retinopathy and glaucoma.7,8 Among the methods frequently used to scrutinize the visual cues exploited by deep neural networks, Grad-CAM and its derivatives have undergone successive enhancements, yielding increasingly high-resolution and efficient output maps. These advancements have improved detection and classification accuracy, thereby strengthening their ability to precisely delineate the extent of lesions.
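To make the gradient-based idea concrete, the sketch below outlines a minimal Grad-CAM computation in PyTorch: the gradients of the predicted class score with respect to the last convolutional feature maps are pooled into per-channel weights, which then re-weight those maps into a coarse heatmap. The ResNet-50 backbone, hooked layer, and random input tensor are illustrative assumptions, not the models or data used in the studies cited above.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Minimal Grad-CAM sketch for an image classifier (placeholder backbone).
model = models.resnet50(weights=None)
model.eval()

activations, gradients = {}, {}

def forward_hook(module, inp, out):
    activations["value"] = out.detach()

def backward_hook(module, grad_in, grad_out):
    gradients["value"] = grad_out[0].detach()

# Hook the last convolutional block, whose spatial maps Grad-CAM re-weights.
target_layer = model.layer4
target_layer.register_forward_hook(forward_hook)
target_layer.register_full_backward_hook(backward_hook)

image = torch.randn(1, 3, 224, 224)      # stand-in for a preprocessed fundus photo
logits = model(image)
class_idx = logits.argmax(dim=1)
model.zero_grad()
logits[0, class_idx].backward()           # gradient of the predicted class score

# Global-average-pool the gradients into channel weights, then take a
# weighted sum of the activation maps followed by ReLU.
weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1] for overlay
```

The normalized map can be overlaid on the original image to highlight the regions that most influenced the prediction, which is how such heatmaps are typically presented to clinicians.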
By contrast, perturbation-based methods modify or eliminate sections of the image, using a blank sliding window or random masks, to observe the influence on the algorithm.9 The most discriminative regions are then identified from the impact that perturbing each region has on overall accuracy; a minimal sketch of this occlusion approach follows below. However, this approach tends to be computationally intensive and has garnered scant attention in glaucoma research. Attention-based methodologies produce attention maps by suppressing irrelevant features during the training phase; Playout et al implemented this technique in the attribution map for the classification of retinal images.10 Attention-based strategies have been effective in recognizing relevant regions while minimizing irrelevant or redundant features that could degrade the model’s accuracy. Progress in visualization techniques now allows these methods to detect even subtle variations in the features contributing to disease severity or progression. Such techniques play an indispensable role in interpreting the results of glaucoma diagnosis and in helping clinicians understand the pathophysiology of the disease.
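The following sketch illustrates the occlusion-sensitivity form of perturbation-based attribution: a blank window is slid across the image, and the drop in the predicted class probability at each position is recorded as a heatmap. The `model` and `image` are assumed to be the classifier and preprocessed input from the previous sketch; the window size, stride, and fill value are illustrative choices.

```python
import torch
import torch.nn.functional as F

def occlusion_map(model, image, class_idx, window=32, stride=16, fill=0.0):
    """Slide a blank window over the image and record the probability drop."""
    model.eval()
    with torch.no_grad():
        baseline = F.softmax(model(image), dim=1)[0, class_idx].item()
    _, _, h, w = image.shape
    heatmap = torch.zeros((h - window) // stride + 1, (w - window) // stride + 1)
    for i, y in enumerate(range(0, h - window + 1, stride)):
        for j, x in enumerate(range(0, w - window + 1, stride)):
            occluded = image.clone()
            occluded[:, :, y:y + window, x:x + window] = fill   # blank sliding window
            with torch.no_grad():
                prob = F.softmax(model(occluded), dim=1)[0, class_idx].item()
            heatmap[i, j] = baseline - prob   # large drop = discriminative region
    return heatmap
```

Because the model must be re-run once per window position, the cost grows quickly with image size and stride, which is the computational burden noted above.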
Tabular data, such as data from electronic health records, are also frequently used in developing DL models. Gu et al also discussed strategies to improve the interpretability of clinical features from tabular data used to train explainable AI models.6 Among these, Local Interpretable Model-Agnostic Explanations (LIME)11 visualize the key features behind a model’s glaucoma classification, increasing medical professionals’ trust, and the Submodular Pick Local Interpretable Model-Agnostic Explanation (SP-LIME)12 explicates predictive results and glaucoma risk factors, facilitating clearer decision-making. The Shapley value, from cooperative game theory, quantifies each feature’s contribution to a prediction and underlies another form of explainable AI: applied to the XGBoost algorithm through SHapley Additive exPlanations (SHAP),13 it elucidates the reasoning behind DL output (a brief sketch is given below). However, given the scarcity of explainable AI research within the glaucoma domain, this field warrants additional exploration.
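As a brief illustration of the SHAP-plus-XGBoost pairing mentioned above, the sketch below fits a gradient-boosted classifier on synthetic tabular data and ranks features by their Shapley contributions. The feature names and generated values are placeholders, not the cohorts, variables, or models used in the cited studies.

```python
import numpy as np
import pandas as pd
import shap
import xgboost as xgb

# Synthetic stand-in for tabular clinical features (illustrative only).
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "iop_mmHg": rng.normal(18, 4, 500),            # intraocular pressure
    "cup_disc_ratio": rng.uniform(0.2, 0.9, 500),
    "rnfl_thickness_um": rng.normal(90, 15, 500),  # retinal nerve fiber layer
    "age_years": rng.normal(60, 10, 500),
})
y = (X["cup_disc_ratio"] + rng.normal(0, 0.1, 500) > 0.6).astype(int)

model = xgb.XGBClassifier(n_estimators=100, max_depth=3, eval_metric="logloss")
model.fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Summary plot ranks features by mean absolute contribution to predictions.
shap.summary_plot(shap_values, X)
```

Plots of this kind show, for each feature, how strongly and in which direction it pushed individual predictions, which is the form of reasoning the cited studies present to clinicians.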
During clinical deployment, a user-friendly interface can further enhance the explainability of AI models and reduce the cognitive burden on users. Gu et al further examined several published dashboards and interfaces designed to present DL predictions and facilitate end-user engagement, such as the GLANCE interface.14 The associated study showed that AI models that explain their automated decisions can enhance clinicians’ understanding and align their trust in DL-based measurements during clinical decision-making. Specifically, clinicians altered their initial management choice and their confidence in their predictions in 31% of cases after reviewing the DL model’s results with a visual heatmap explanation, whereas without the aid of a heatmap, only 11% of cases resulted in a change of opinion.14 The development and implementation of standards for various data types will be crucial for accelerating AI techniques in ophthalmology.
To summarize, DL holds significant promise to revolutionize glaucoma diagnosis and management through medical image analysis. However, deciphering the inner mechanisms of DL models remains a prerequisite for clinical adoption and incorporation into existing workflows. This issue underscores key visualization strategies under investigation that tackle critical issues of trust, reliability, and explainability. The ultimate success of DL in glaucoma care hinges on striking a suitable balance between performance and transparency. As research and technology advance, DL-assisted diagnostics and monitoring backed by interpretable DL models may play a pivotal role in individualizing glaucoma management. The insights from the review by Gu et al6 can potentially guide the development of future AI tools that are not only powerful in their predictive capacities but also intuitive and user-friendly for the clinicians who will be utilizing them.
REFERENCES
1. Leshno A, Liebmann J. The glaucoma suspect problem: ways forward. Asia Pac J Ophthalmol (Phila). 2022;11:503–504.
2. Huang OS, Chew ACY, Finkelstein EA, et al. Outcomes of an asynchronous virtual glaucoma clinic in monitoring patients at low risk of glaucoma progression in Singapore. Asia Pac J Ophthalmol (Phila). 2021;10:328–334.
3. Yuan Y, Hu W, Zhang X, et al. Daily patterns of accelerometer-measured movement behaviors in glaucoma patients: insights from UK Biobank participants. Asia Pac J Ophthalmol (Phila). 2022;11:521–528.
4. Lee EB, Wang SY, Chang RT. Interpreting deep learning studies in glaucoma: unresolved challenges. Asia Pac J Ophthalmol (Phila). 2021;10:261–267.
5. Ting DSW, Peng L, Varadarajan AV, et al. Deep learning in ophthalmology: the technical and clinical considerations. Prog Retin Eye Res. 2019;72:100759.
6. Gu B, Sidhu S, Weinreb RN, et al. Review of visualization approaches in deep learning models of glaucoma. Asia Pac J Ophthalmol (Phila). 2023;12:392–401.
7. Phene S, Dunn RC, Hammel N, et al. Deep learning and glaucoma specialists: the relative importance of optic disc features to predict glaucoma referral in fundus photographs. Ophthalmology. 2019;126:1627–1639.
8. Thakoor KA, Li X, Tsamis E, et al. Enhancing the accuracy of glaucoma detection from OCT probability maps using convolutional neural networks. Annu Int Conf IEEE Eng Med Biol Soc. 2019;2019:2036–2040.
9. Mohamed E, Sirlantzis K, Howells G. A review of visualisation-as-explanation techniques for convolutional neural networks and their evaluation. Displays. 2022;73:102239.
10. Playout C, Duval R, Boucher MC, et al. Focused Attention in Transformers for interpretable classification of retinal images. Med Image Anal. 2022;82:102608.
11. Chayan TI, Islam A, Rahman E, et al. Explainable AI based glaucoma detection using transfer learning and LIME. 2022 IEEE Asia-Pacific Conference on Computer Science and Data Engineering. IEEE; 2022:1–6.
12. Kamal MS, Dey N, Chowdhury L, et al. Explainable AI for glaucoma prediction analysis to understand risk factors in treatment planning. IEEE Trans Instrum Meas. 2022;71:1–9.
13. Oh S, Park Y, Cho KJ, et al. Explainable machine learning model for glaucoma diagnosis and its interpretation. Diagnostics. 2021;11:510.
14. van den Brandt A, Christopher M, Zangwill LM, et al. GLANCE: visual analytics for monitoring glaucoma progression. VCBM. 2020:85–96.