Machine Learning Applications in Orthopaedic Imaging : JAAOS - Journal of the American Academy of Orthopaedic Surgeons

On the Horizon From the ORS

Machine Learning Applications in Orthopaedic Imaging

Wang, Vincent M. PhD; Cheung, Carrie A. BS; Kozar, Albert J. DO; Huang, Bert PhD

Journal of the American Academy of Orthopaedic Surgeons 28(10):p e415-e417, May 15, 2020. | DOI: 10.5435/JAAOS-D-19-00688

Machine learning (ML) is a form of artificial intelligence in which computers learn from data without being explicitly programmed to complete a task.1 One form of ML is deep learning, which trains artificial neural networks made up of multiple layers (Table 1). These layers “learn” features that optimally represent the input data for the problem posed, transforming the input data to an output.2 ML methods have garnered interest within health care, with applications in bioinformatics, medical imaging, pervasive sensing for mental health, medical informatics, and public health. In recent years, there has been considerable growth of ML applications for the analysis of medical images (radiograph, MRI, and ultrasonography images)1 and this methodology appears to be a promising tool for the healthcare industry.

Table 1
Term | Definition
Artificial intelligence (AI) | A field of computer science that designs systems to do tasks that typically require human intelligence.1
Machine learning (ML) | A field of AI in which computers learn from data without being explicitly programmed.1
Deep learning | A form of ML that uses multiple layers that process data for tasks such as feature extraction and classification.1
Neural network | Algorithms that mimic the neural networks in the brain, including an input of raw data, an output of the results after the data are processed, and hidden layers between the input and output that extract patterns within the data. Each layer is made up of weighted nodes that receive input from some of the nodes in the previous layer.1
Convolutional neural network (CNN) | A neural network that transforms the input data using spatial filters, that is, convolutional operations.2
Fully convolutional neural network (FCN) | A CNN built only from convolutional-type layers, with no fully connected layers (layers in which each node receives input from all the nodes in the previous layer), so that it can produce a spatially dense output, such as a per-pixel map.2
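
As a rough illustration of the "spatial filters" mentioned in the CNN entry of Table 1, the following minimal Python sketch slides a small filter over a toy image and applies a rectifying nonlinearity. It is not taken from any of the cited studies: the function names (conv2d_valid, relu), the toy image, and the hand-picked edge filter are assumptions made purely for illustration. In a real CNN the filter weights would be learned from data rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    As in most deep learning libraries, this is technically a
    cross-correlation: the kernel is not flipped before sliding.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive filter responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical 4x4 "image": dark left half (0), bright right half (1).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked vertical-edge filter (an assumption for illustration only).
vertical_edge_filter = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = relu(conv2d_valid(toy_image, vertical_edge_filter))
print(feature_map)  # [[3, 3], [3, 3]]: every window overlaps the dark-to-bright edge
```

Applied to a uniform image, the same filter responds with zero everywhere, which is the sense in which a filter "detects" a particular local pattern.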

In orthopaedics, as in other specialties, image analysis and clinical interpretation can be subjective and highly dependent on reviewer expertise. The appeal of ML lies principally in its objective nature, its ability to analyze large data sets, and its reproducibility.1,3,4 ML methods have been developed to facilitate clinical decision making, with one study reporting an increase in specificity in detecting anterior cruciate ligament tears from MRI when experienced physicians were provided with the ML model's prediction during image interpretation.3 A reliable, robust ML tool can potentially contribute to more efficient clinical protocols, reduce healthcare expenses, and ultimately improve patients' quality of life. Although ML studies in orthopaedic imaging are only now emerging,5 such approaches have already provided encouraging results elsewhere, including the detection and classification of breast lesions into those requiring or not requiring surgery6 and the accurate classification of diabetes from infrared images of the iris.7

ML can perform multiple tasks that aid in image analysis, the most common being detection, classification, and segmentation. In image detection, one or more objects (such as an anatomical landmark) are localized spatially or temporally.2 Yang et al8 used a detection algorithm to localize anatomic landmarks on 3D MRI of the distal femur. The authors' approach circumvents the limitations of manual landmark identification, which can be inaccurate and time consuming. In their method, global shape and local surface curvature from 3D images are used to detect landmarks, offering an effective and efficient detection method.8 Localization of specific anatomic landmarks could facilitate surgical planning, and the algorithm can also be integrated with other ML tasks, for example, as a preprocessing step for classification or segmentation.2

Image classification (in ML) assigns a category to each image.2 For example, if the input is a medical examination, the output may be whether the examination is positive or negative for a particular disease. Antony et al9 developed a fully convolutional neural network (Table 1) to automatically extract knee joint images from radiographs and subsequently trained convolutional neural networks (CNNs, Table 1) to predict the severity of knee osteoarthritis (OA) using Kellgren and Lawrence (KL) grades. Traditionally, when classifying the severity of knee OA using KL grading of radiographs, clinicians identify key pathological features such as joint space narrowing and osteophyte formation.9 These features served as the basis of early attempts at classification by computer-aided analysis. However, supplying these features to ML algorithms requires a high level of domain knowledge and can exclude other potentially useful cues that the algorithm could identify on its own.9 Owing to these limitations, newer ML approaches have focused on methods that “learn” features from image data instead of being supplied with them.9,10 Classification tasks have proven successful in distinguishing anatomic muscle types (biceps brachii, tibialis anterior, gastrocnemius medialis, and rectus femoris) within an ultrasonography image,5 in recognizing sex from textural descriptors of muscle in ultrasonography images,5 and in predicting walking ability after traumatic spinal cord injury.4

In image segmentation, the input is typically one or more images, and the output identifies the set of voxels (points defined in three-dimensional space) or pixels (points defined in two-dimensional space) that make up either the contour or the interior of one or more objects of interest.2 Prasoon et al11 studied patients with tibial osteoarthritis by segmenting articular cartilage within MRI scans. In contrast to a “slice-by-slice” manual segmentation approach, the authors implemented a voxel classification method using triplanar 2D CNNs that “learn” the image features autonomously. Segmentation of the cartilage is necessary to quantify its volume, thickness, surface area, and curvature.12 This morphologic analysis, in turn, can be used clinically, for example, to match native joint anatomy during osteochondral grafting procedures. Results demonstrated a lower computational cost and improved segmentation performance, with better sensitivity, specificity, and accuracy compared not only with manual segmentation but also with other modern segmentation methods.11 Image segmentation also offers orthopaedic surgeons a method to potentially facilitate surgical planning by outlining areas of risk (eg, femoral cartilage regions that can be unintentionally damaged during knee arthroscopy).13
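
The idea of a triplanar input can be sketched in a few lines of Python. The example below is only an illustration of the general concept, not the implementation used by Prasoon et al: for one voxel of a 3D volume, it extracts the three orthogonal 2D patches centered on that voxel, which could then be passed to three 2D networks whose outputs are combined to classify the voxel. The function name triplanar_patches, the volume[z][y][x] indexing convention, and the toy volume are assumptions.

```python
def triplanar_patches(volume, z, y, x, half_size):
    """Return three orthogonal 2D patches centered on voxel (z, y, x).

    `volume` is assumed to be a nested list indexed as volume[z][y][x].
    Each patch has side length 2 * half_size + 1.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    inside = (
        half_size <= z < depth - half_size
        and half_size <= y < height - half_size
        and half_size <= x < width - half_size
    )
    if not inside:
        raise ValueError("patch would extend outside the volume")

    z_range = range(z - half_size, z + half_size + 1)
    y_range = range(y - half_size, y + half_size + 1)
    x_range = range(x - half_size, x + half_size + 1)

    patch_xy = [[volume[z][j][i] for i in x_range] for j in y_range]  # z fixed
    patch_xz = [[volume[k][y][i] for i in x_range] for k in z_range]  # y fixed
    patch_yz = [[volume[k][j][x] for j in y_range] for k in z_range]  # x fixed
    return patch_xy, patch_xz, patch_yz


# Hypothetical 5x5x5 volume whose intensity encodes its own coordinates
# (100*z + 10*y + x), which makes the extracted patches easy to check by eye.
toy_volume = [
    [[100 * k + 10 * j + i for i in range(5)] for j in range(5)]
    for k in range(5)
]

patch_xy, patch_xz, patch_yz = triplanar_patches(toy_volume, 2, 2, 2, half_size=1)
print(patch_xy)  # [[211, 212, 213], [221, 222, 223], [231, 232, 233]]
```

Because each voxel is described by three small 2D patches rather than a full 3D neighborhood, the amount of data processed per voxel grows with the square, not the cube, of the patch size, which is one plausible reason such an approach can be computationally cheaper than working in 3D.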

To train and test a supervised ML model, image data can be divided into three data sets: a training set, a validation set, and a testing set. The training and validation sets are both labeled (eg, “normal” or pathologic tissue), and these labels are available to the learning model, whereas the testing set labels are hidden from the model. The model is trained using the training set, from which it identifies important features (eg, corners, blobs, and edges) that it will subsequently use to analyze new data. During model training, the validation set is used to evaluate how well the model fits the data and to tune hyperparameters so that they best fit the characteristics of the target task. The performance of classification models is commonly quantified in confusion matrices that distinguish false-positive and false-negative errors from true-positive and true-negative predictions. High accuracy on both the training and validation sets indicates that the model hyperparameters are appropriate for the target task and that the learned model is likely to generalize to examples it has not previously seen, that is, to reliably classify the test set and future examples once the model is deployed.
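
The data split and the confusion matrix described above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not a recommended pipeline: the 70/15/15 split, the function names (split_dataset, confusion_counts, summarize), and the toy labels and predictions are all hypothetical. In practice, images from the same patient would usually be kept within a single subset, a detail this sketch ignores.

```python
import random


def split_dataset(items, train_pct=70, val_pct=15, seed=0):
    """Shuffle `items` and split them into training, validation, and testing sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out for testing
    return train, val, test


def confusion_counts(true_labels, predicted_labels, positive=1):
    """Count the four cells of a binary confusion matrix."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summarize(counts):
    """Derive sensitivity, specificity, and accuracy from confusion counts."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    tp, fp, tn, fn = counts["tp"], counts["fp"], counts["tn"], counts["fn"]
    return {
        "sensitivity": ratio(tp, tp + fn),  # fraction of true positives found
        "specificity": ratio(tn, tn + fp),  # fraction of true negatives found
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical image identifiers split 70/15/15.
train_ids, val_ids, test_ids = split_dataset(range(100))

# Hypothetical labels (1 = pathologic, 0 = normal) and model predictions.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
counts = confusion_counts(true_labels, predictions)
metrics = summarize(counts)
print(counts)   # {'tp': 3, 'fp': 2, 'tn': 4, 'fn': 1}
print(metrics)  # sensitivity 0.75, specificity about 0.67, accuracy 0.7
```

In this toy example the model misses one of four pathologic cases and flags two of six normal cases, so sensitivity and specificity differ even though the overall accuracy is a single number; this is why the confusion matrix is often more informative than accuracy alone.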

Although ML applications in orthopaedic imaging are only beginning to emerge, several advantages of ML over current image analysis strategies warrant consideration. Perhaps foremost among these, ML can identify features that may not be detectable to the human eye.1 Similarly, ML can efficiently detect features that may appear to offer limited value to the reviewer but offer strong predictive value for classifying the input data. For example, Carballido-Gamio et al14 reported that incorporating an ML algorithm (using features extracted from statistical multiparametric modeling) into a standard clinical hip fracture prediction model based on dual-energy x-ray absorptiometry (DEXA) measures of bone mineral density improved hip fracture discrimination relative to the clinical standard. In addition, if adequately robust, an ML algorithm offers an automated method for image analysis that can potentially accommodate variations in images resulting from different operators or machines. However, robust algorithms can be difficult to construct, particularly in studies with a small number of patients or images, because overfitting can occur, in which the model fits the peculiarities of its training data rather than generalizable patterns, thereby reducing the reliability and applicability of the algorithm when applied on a larger scale.1 Interest in ML applications in orthopaedic imaging is expected to grow rapidly as ML methods continue to be refined and larger sets of data become available in the form of public medical image databases that can be used in retrospective studies.

Balancing the intrigue and novelty of implementing ML in clinical practice are concerns regarding potential liability for physicians.15 Like human beings, even well-trained models can make errors, the consequences of which may include patient injury. Hence, as ML becomes more commonly incorporated into clinical practice, liability laws will need to evolve,15 and as with any new technology, thorough testing and validation are required before clearance for its use.


The authors acknowledge funding from NIH Grant AR 63144 (V.M.W.) and Virginia Tech's Data & Decisions Destination Area and Institute for Critical Technology and Applied Science (B.H., A.J.K., and V.M.W.).


1. Pesapane F, Codari M, Sardanelli F: Artificial intelligence in medical imaging: Threat or opportunity? Radiologists again at the forefront of innovation in medicine. Eur Radiol Exp 2018;2:35.
2. Litjens G, Kooi T, Bejnordi BE, et al.: A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60-88.
3. Bien N, Rajpurkar P, Ball RL, et al.: Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet. PLoS Med 2018;15:e1002699.
4. DeVries Z, Hoda M, Rivers C, et al.: Development of an unsupervised machine learning algorithm for the prognostication of walking ability in spinal cord injury patients. Spine J 2019. doi: 10.1016/j.spinee.2019.09.007.
5. Katakis S, Barotsis N, Kastaniotis D, et al.: Muscle type and gender recognition utilising high-level textural representation in musculoskeletal ultrasonography. Ultrasound Med Biol 2019;45:1562-1573.
6. Bahl M, Barzilay R, Yedidia AB, Locascio NJ, Yu L, Lehman CD: High-risk breast lesions: A machine learning model to predict pathologic upgrade and reduce unnecessary surgical excision. Radiology 2018;286:810-818.
7. Samant P, Agarwal R: Machine learning techniques for medical diagnosis of diabetes using iris images. Comput Methods Programs Biomed 2018;157:121-128.
8. Yang D, Zhang S, Yan Z, Tan C, Li K, Metaxas D: Automated anatomical landmark detection on distal femur surface using convolutional neural network. IEEE 12th International Symposium on Biomedical Imaging (ISBI), 2015. doi: 10.1109/ISBI.2015.7163806.
9. Antony J, McGuinness K, Moran K, O'Connor NE: Automatic detection of knee joints and quantification of knee osteoarthritis severity using convolutional neural networks. International Conference on Machine Learning and Data Mining in Pattern Recognition, 2017, pp 376-390.
10. Hammett E, Iliff G, Rezvani S, Huang B, Kozar A, Wang VM: The detection of patellar tendinopathy using machine learning analysis of ultrasound images. Trans Orthop Res Soc 2018;43:2061.
11. Prasoon A, Petersen K, Igel C, Lauze F, Dam E, Nielsen M: Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network. Med Image Comput Comput Assist Interv 2013;16:246-253.
12. Fripp J, Crozier S, Warfield SK, Ourselin S: Automatic segmentation and quantitative analysis of the articular cartilages from magnetic resonance images of the knee. IEEE Trans Med Imaging 2010;29:55-64.
13. Antico M, Sasazawa F, Dunnhofer M, et al.: Deep learning-based femoral cartilage automatic segmentation in ultrasound imaging for guidance in robotic knee arthroscopy. Ultrasound Med Biol 2020;46:422-435.
14. Carballido-Gamio J, Yu A, Wang L, et al.: Hip fracture discrimination based on statistical multi-parametric modeling (SMPM). Ann Biomed Eng 2019;47:2199-2212.
15. Price WN, Gerke S, Cohen IG: Potential liability for physicians using artificial intelligence. JAMA 2019;322:1765-1766.
Copyright 2020 by the American Academy of Orthopaedic Surgeons.