The aim of this study was to compare the diagnostic performance of a deep learning algorithm with that of radiologists in diagnosing maxillary sinusitis on Waters’ view radiographs.
Among 80,475 Waters’ view radiographs examined between May 2003 and February 2017, 9000 randomly selected cases were classified as normal or maxillary sinusitis based on radiographic findings and divided into training (n = 8000) and validation (n = 1000) sets to develop a deep learning algorithm. Two test sets composed of Waters’ view radiographs with concurrent paranasal sinus computed tomography were labeled based on computed tomography findings: one with temporal separation (n = 140) and the other with geographic separation (n = 200) from the training set. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of the algorithm and of 5 radiologists were assessed. Interobserver agreement between the algorithm and the majority decision of the radiologists was measured. The correlation coefficient between the predicted probability of the algorithm and the average confidence level of the radiologists was determined.
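As an illustration of the performance measures named above, the following minimal Python sketch computes an AUC, sensitivity, and specificity from reference labels and predicted probabilities. It is not the study's analysis code: the function names, the 0.5 operating threshold, and the toy data are assumptions made for this example, and the AUC is computed with the rank-based (Mann-Whitney) formulation.

```python
from typing import Sequence, Tuple


def auc_mann_whitney(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive.
    The default threshold of 0.5 is an assumption of this sketch."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = sinusitis on the reference standard, 0 = normal.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

toy_auc = auc_mann_whitney(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)
print(f"AUC={toy_auc:.3f} sensitivity={toy_sens:.3f} specificity={toy_spec:.3f}")
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for test sets of a few hundred radiographs; a sort-based rank computation would be preferable for much larger sets.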
The AUCs of the deep learning algorithm were 0.93 and 0.88 for the temporal and geographic external test sets, respectively. The AUCs of the radiologists ranged from 0.83 to 0.89 for the temporal and from 0.75 to 0.84 for the geographic external test sets. The deep learning algorithm showed a statistically significantly higher AUC than the radiologists in both test sets. In terms of sensitivity and specificity, the deep learning algorithm was comparable to the radiologists. A strong interobserver agreement was noted between the algorithm and the radiologists (Cohen κ coefficient, 0.82). The correlation coefficient between the predicted probability of the algorithm and the confidence level of the radiologists was 0.89 and 0.84 for the 2 test sets, respectively.
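For readers unfamiliar with the agreement statistic, the sketch below shows one way a Cohen κ between an algorithm's binary calls and the majority decision of several readers could be computed. It is an illustrative example only: the function names and the toy reader calls are hypothetical, and the abstract does not state how the algorithm's output was binarized.

```python
from typing import List, Sequence


def majority_vote(reader_calls: Sequence[Sequence[int]]) -> List[int]:
    """reader_calls[r][i] is reader r's binary call for case i.
    A case is positive when more than half of the readers call it positive;
    an odd number of readers (such as 5) rules out ties."""
    n_readers = len(reader_calls)
    return [1 if 2 * sum(calls) > n_readers else 0 for calls in zip(*reader_calls)]


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen kappa for two binary raters: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    a_pos = sum(a) / n
    b_pos = sum(b) / n
    # Chance agreement: both call positive or both call negative.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical calls from 5 readers on 6 cases (1 = sinusitis, 0 = normal).
toy_readers = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
]
toy_algorithm = [1, 1, 0, 1, 1, 0]  # hypothetical binarized algorithm output

toy_majority = majority_vote(toy_readers)
toy_kappa = cohen_kappa(toy_algorithm, toy_majority)
print(f"majority={toy_majority} kappa={toy_kappa:.3f}")
```

In this toy example the algorithm and the majority decision agree on 5 of 6 cases, and chance agreement is 0.5, so κ is well below the raw agreement rate.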
The deep learning algorithm could diagnose maxillary sinusitis on Waters’ view radiographs with superior AUC and comparable sensitivity and specificity to those of radiologists.
From the *Department of Radiology, Seoul National University College of Medicine, Seoul;
†Department of Radiology, Seoul National University Bundang Hospital, Seongnam;
‡Department of Radiology, Hallym University Sacred Heart Hospital, Anyang; and
§Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea.
Received for publication May 10, 2018; and accepted for publication, after revision, June 15, 2018.
Youngjune Kim and Kyong Joon Lee contributed equally to this work.
Conflicts of interest and sources of funding: This study was supported by grants from the National Research Foundation of Korea (NRF-2015R1C1A1A02037475 and NRF-2018R1C1B6007917) and by grants from the SNUBH Research Fund (no. 02-2017-029 and no. 18-2018-016). This study was also supported by the Technology Innovation Program funded by the Ministry of Trade, Industry, and Energy of Korea [10049785, Development of “medical equipment using (ionizing or non-ionizing) radiation”–dedicated R&D platform and medical device technology].
Correspondence to: Leonard Sunwoo, MD, PhD, Department of Radiology, Seoul National University Bundang Hospital, 82 Gumi-ro 173 Beon-Gil, Seongnam, Gyeonggi-do, 13620, Republic of Korea. E-mail: firstname.lastname@example.org.