Objectives
The aims of this study were to present a deep learning approach for the automated classification of multiple sclerosis and its mimics and to compare the model's performance with that of 2 expert neuroradiologists.
Materials and Methods
A total of 268 T2-weighted and T1-weighted brain magnetic resonance imaging scans were retrospectively collected from patients with migraine (n = 56), multiple sclerosis (n = 70), neuromyelitis optica spectrum disorders (n = 91), and central nervous system vasculitis (n = 51). The neural network architecture, trained on 178 scans, was based on a cascade of 4 three-dimensional convolutional layers for feature extraction, followed by a fully connected (dense) layer. The ability of the final algorithm to correctly classify the diseases in an independent test set of 90 scans was compared with that of the neuroradiologists.
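The architecture described above can be sketched roughly as follows in PyTorch. This is only an illustrative sketch, not the authors' implementation: the class name `Cascade3DCNN`, the channel widths, kernel size, activation, pooling, and the choice to stack the T2-weighted and T1-weighted volumes as 2 input channels are all assumptions that the text does not specify. Only the cascade of 4 three-dimensional convolutional layers, the fully connected layer, and the 4 diagnostic classes come from the description.

```python
import torch
from torch import nn


class Cascade3DCNN(nn.Module):
    """Hypothetical 4-layer 3D CNN followed by one fully connected layer."""

    def __init__(self, in_channels=2, num_classes=4, widths=(8, 16, 32, 64)):
        super().__init__()
        layers = []
        prev = in_channels
        for width in widths:  # four convolutional stages (widths are assumed)
            layers += [
                nn.Conv3d(prev, width, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool3d(2),  # halves each spatial dimension
            ]
            prev = width
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool3d(1)  # collapse to one value per channel
        self.classifier = nn.Linear(prev, num_classes)  # fully connected layer

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)  # shape: (batch, widths[-1])
        return self.classifier(x)  # one logit per diagnostic class


# Usage on a random volume: batch of 1, 2 channels (assumed T2w + T1w), 32^3 voxels.
model = Cascade3DCNN()
logits = model(torch.randn(1, 2, 32, 32, 32))
```

The four output logits would correspond to migraine, multiple sclerosis, neuromyelitis optica spectrum disorders, and central nervous system vasculitis; the ordering of the classes is likewise an assumption.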
Results
Interrater agreement was 84.9% (Cohen κ = 0.78, P < 0.001). In the test set, both deep learning and the expert raters reached their highest diagnostic accuracy for multiple sclerosis (98.8% vs 72.8%, P < 0.001, for rater 1; and 81.8%, P < 0.001, for rater 2) and their lowest for neuromyelitis optica spectrum disorders (88.6% vs 4.4%, P < 0.001, for both raters), with intermediate values for migraine (92.2% vs 53%, P = 0.03, for rater 1; and 64.8%, P = 0.01, for rater 2) and vasculitis (92.1% vs 54.6%, P = 0.3, for rater 1; and 45.5%, P = 0.2, for rater 2). The overall performance of the automated method exceeded that of the expert raters, with the poorest discrimination between neuromyelitis optica spectrum disorders and vasculitis or migraine.
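For readers unfamiliar with the agreement statistic reported above, Cohen κ corrects the observed agreement between 2 raters for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a minimal sketch in plain Python; the function name `cohen_kappa` and the toy labels are hypothetical and are not data from this study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(labels_a)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example with made-up ratings (not study data).
rater_1 = ["MS", "MS", "NMOSD", "migraine", "vasculitis", "MS", "migraine", "NMOSD"]
rater_2 = ["MS", "MS", "vasculitis", "migraine", "vasculitis", "MS", "migraine", "migraine"]
kappa = cohen_kappa(rater_1, rater_2)
```

In the toy example the raters agree on 6 of 8 cases (75%), whereas the chance-corrected value is lower, which illustrates why the raw percentage agreement and κ are reported together.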
Conclusions
The neural network classified white matter disorders from magnetic resonance imaging more accurately than the expert raters and may aid in their diagnostic work-up.