Training and Validating a Deep Convolutional Neural Network for Computer-Aided Detection and Classification of Abnormalities on Frontal Chest Radiographs

Cicero, Mark MD, BESc; Bilbily, Alexander MD, BHSc; Colak, Errol MD, FRCPC, HBSc; Dowdell, Tim MD, CCFP, FRCPC; Gray, Bruce MD, FRCPC, BSc; Perampaladas, Kuhan MSc, BSc; Barfett, Joseph MD, FRCPC, MSc, BESc

doi: 10.1097/RLI.0000000000000341
Original Articles

Objectives: Convolutional neural networks (CNNs) are a subtype of artificial neural network that has shown strong performance in computer vision tasks, including image classification. To date, there has been limited application of CNNs to chest radiographs, the most frequently performed medical imaging study. We hypothesize that CNNs can learn to classify frontal chest radiographs according to common findings from a sufficiently large data set.

Materials and Methods: Our institution's research ethics board approved a single-center retrospective review of 35,038 adult posteroanterior chest radiographs and their final reports, performed between 2005 and 2015 (56% men, average age of 56 years, patient type: 24% inpatient, 39% outpatient, 37% emergency department), with a waiver of informed consent. The GoogLeNet CNN was trained using 3 graphics processing units to automatically classify radiographs as normal (n = 11,702) or into 1 or more of cardiomegaly (n = 9240), consolidation (n = 6788), pleural effusion (n = 7786), pulmonary edema (n = 1286), or pneumothorax (n = 1299). The network's performance was evaluated using receiver operating characteristic curve analysis on a test set of 2443 radiographs, with the criterion standard being board-certified radiologist interpretation.
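
The labeling scheme described above (normal, or 1 or more of 5 findings) can be illustrated with a minimal sketch. The abstract does not state how labels were encoded for training, so the multi-hot encoding, the class ordering, and the names `FINDINGS`, `CLASSES`, and `encode_labels` are assumptions made for illustration only.

```python
# Hypothetical label names; the class ordering is an assumption, not taken from the study.
FINDINGS = [
    "cardiomegaly",
    "consolidation",
    "pleural_effusion",
    "pulmonary_edema",
    "pneumothorax",
]
CLASSES = ["normal"] + FINDINGS


def encode_labels(findings):
    """Return a multi-hot vector over CLASSES for one radiograph.

    A study with no findings is encoded as normal; otherwise one or more
    abnormal categories are set and the normal entry stays 0.
    """
    found = set(findings)
    unknown = found - set(FINDINGS)
    if unknown:
        raise ValueError(f"unknown findings: {sorted(unknown)}")
    if not found:
        return [1] + [0] * len(FINDINGS)
    return [0] + [1 if name in found else 0 for name in FINDINGS]
```

For example, `encode_labels(["pneumothorax", "cardiomegaly"])` sets two abnormal entries at once, which is what separates this multi-label setup from a single-label classifier.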

Results: Using 256 × 256-pixel images as input, the network achieved an overall sensitivity and specificity of 91% with an area under the curve of 0.964 for classifying a study as normal (n = 1203). For the abnormal categories, the sensitivity, specificity, and area under the curve, respectively, were 91%, 91%, and 0.962 for pleural effusion (n = 782); 82%, 82%, and 0.868 for pulmonary edema (n = 356); 74%, 75%, and 0.850 for consolidation (n = 214); 81%, 80%, and 0.875 for cardiomegaly (n = 482); and 78%, 78%, and 0.861 for pneumothorax (n = 167).
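
The three metrics reported for each category can be computed from per-study scores as in the sketch below. It is not the authors' analysis code. The near-equal sensitivity and specificity values suggest an operating point where the two balance, but the abstract does not say how thresholds were chosen, so `balanced_operating_point` is an assumption, and all function names are hypothetical.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a random positive study scores higher than
    a random negative study, counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_operating_point(labels, scores):
    """Assumed rule: pick the threshold where sensitivity and specificity are closest."""
    def gap(threshold):
        sens, spec = sensitivity_specificity(labels, scores, threshold)
        return abs(sens - spec)

    best = min(sorted(set(scores)), key=gap)
    sens, spec = sensitivity_specificity(labels, scores, best)
    return best, sens, spec
```

The quadratic loop in `auc` is adequate for a test set of a few thousand studies; a sort-based rank computation would be preferable for much larger sets.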

Conclusions: Current deep CNN architectures can be trained with modest-sized medical data sets to achieve clinically useful performance at detecting and excluding common pathology on chest radiographs.

From the *Department of Medical Imaging, St Michael's Hospital, and †Department of Pharmaceutical Sciences, University of Toronto, Toronto, Ontario, Canada.

Received for publication September 17, 2016; and accepted for publication, after revision, October 24, 2016.

Correspondence to: Mark Cicero, MD, BESc, Department of Medical Imaging, St Michael's Hospital, 30 Bond St, Toronto, Ontario, Canada M5B 1W8. E-mail:

The authors have no conflicts of interest.

Copyright © 2017 Wolters Kluwer Health, Inc. All rights reserved.