To the Editor: Lung cancer causes more cancer-associated deaths worldwide than any other malignancy. Lymph node metastasis is the main route by which lung cancer spreads, and it bears on both the treatment strategy and the prognosis of patients. According to the guidelines of the National Comprehensive Cancer Network, once mediastinal lymph nodes are involved, patients are considered to have reached an advanced stage of lung cancer, and the treatment strategy should differ accordingly. Computed tomography (CT) is a common method for the preoperative prediction of lymph node metastasis in lung cancer patients. However, not all abnormal lymph nodes can be recognized preoperatively with CT. Thus, more accurate prediction of lymph node metastasis is important for lung cancer treatment.
With the development of artificial intelligence (AI) in recent years, many researchers have focused on applying deep learning to disease diagnosis. The faster region-based convolutional neural network (faster R-CNN) is a popular deep learning model. It integrates feature extraction, candidate region generation, regional image classification, and location refinement into a unified deep convolutional network [Supplementary Figure 1, https://links.lww.com/CM9/B163]. To help doctors diagnose mediastinal lymph nodes more efficiently on chest CT of lung cancer patients, we built an AI model based on the faster R-CNN algorithm to predict abnormal mediastinal lymph nodes. Such a model is relevant to tumor staging and thereby to individualized treatment planning for preoperative patients. The application of the faster R-CNN algorithm to lymph node diagnosis in lung cancer has rarely been reported to date.
In this study, a large number of CT images of mediastinal lymph nodes were collected from lung cancer patients to build a CT database. The study was approved by the institutional Ethics Review Board of the Affiliated Hospital of Qingdao University (No. QYFYKY 2018-10-11-2). Samples and clinical information were collected in accordance with the informed consent and ethical approval documents. The CT database (including training and validation datasets) was established with chest CT images of patients from five medical institutions in China (Affiliated Hospital of Qingdao University, Xuanwu Hospital Affiliated to Capital Medical University, Tangdu Hospital of the Fourth Military Medical University, Second Affiliated Hospital of Harbin Medical University, and Tai’an City Central Hospital) from January 2015 to September 2018. A total of 635 patients were included. The CT images were diagnosed and annotated by four radiologists: three radiologists with more than 10 years of working experience and one radiology director with 30 years of working experience [Figure 1A]. When the diagnoses of the three radiologists were inconsistent, the final decision was made by the radiology director. CT images were collected from the first to the last layer of each target lymph node. The database comprised 16,260 CT images of lymph nodes (8030 images of abnormal lymph nodes and 8230 images of normal lymph nodes). By random sampling, 13,000 images (80.0%) were assigned to the training dataset and the remaining 3260 (20.0%) to the validation dataset [Supplementary Figure 2A, https://links.lww.com/CM9/B163]. Finally, we performed an assessment to compare the efficiency of our model in diagnosing mediastinal lymph nodes with that of radiologists.
We established an assessment cohort of another 50 lung cancer patients from the Affiliated Hospital of Qingdao University who underwent lung surgery and systematic lymph node resection between January 2020 and September 2021. Informed consent was obtained from each patient for the use of their data without violating privacy. The flow chart of the assessment cohort is shown in Supplementary Figure 2B, https://links.lww.com/CM9/B163.
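The random split of the CT database into training and validation datasets can be illustrated with a minimal sketch. The image IDs, the function name `split_images`, and the fixed seed are hypothetical; the text does not state how the sampling was implemented or whether it was done per image or per patient, so the per-image split here is an assumption.

```python
import random

def split_images(image_ids, n_train, seed=0):
    """Shuffle image IDs reproducibly and split them into training and validation lists."""
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical IDs standing in for the 16,260 CT images in the database.
all_ids = [f"img_{i:05d}" for i in range(16260)]
train_ids, val_ids = split_images(all_ids, n_train=13000)
print(len(train_ids), len(val_ids))
```

A per-patient split would keep all images of one lymph node in the same dataset; which of the two was used is not stated in the text.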
Using the training database, we constructed a lymph node detection model based on faster R-CNN. The objective function is . Details of the model training are given in the Supplementary Text, https://links.lww.com/CM9/B163. The training parameters are listed in Supplementary Table 1, https://links.lww.com/CM9/B163. Sample outputs of our model are shown in Figure 1B. According to the receiver operating characteristic (ROC) curve, the sensitivity and specificity of our algorithm reached 82.5% (95% CI, 75.5–89.6%) and 95.1% (95% CI, 88.0–100.0%), respectively, in the validation dataset [Figure 1C]. The area under the curve (AUC) of this algorithm was 0.920 (95% CI, 0.880–0.970). The precision-recall curve (PRC) also indicated that the model had learned well [Figure 1D]. The precision on the PRC was 81.1% (95% CI, 75.9–88.3%) at a recall of 98.1% (95% CI, 93.0–100%). The corresponding AUC was 0.915 (95% CI, 0.855–0.966). According to the loss function of model training, the sensitivity increased markedly with the number of training iterations [Supplementary Figure 3, https://links.lww.com/CM9/B163].
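For reference, the multi-task objective defined in the original faster R-CNN paper [3] has the form below. That the model described here used exactly this form, and with which weighting, is an assumption; the settings actually used are those in the Supplementary Text and Supplementary Table 1.

```latex
L(\{p_i\}, \{t_i\}) =
  \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^{*})
  + \lambda \, \frac{1}{N_{reg}} \sum_i p_i^{*} \, L_{reg}(t_i, t_i^{*})
```

Here $i$ indexes the anchors in a mini-batch, $p_i$ is the predicted probability that anchor $i$ contains an object, $p_i^{*}$ is its ground-truth label (1 for a positive anchor, 0 otherwise), $t_i$ and $t_i^{*}$ are the predicted and ground-truth parameterized box coordinates, $L_{cls}$ is a log loss, $L_{reg}$ is a smooth $L_1$ loss, $N_{cls}$ and $N_{reg}$ are normalization terms, and $\lambda$ balances the two parts.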
Subgroup analyses were conducted based on two characteristics: the short diameter and the location of the lymph nodes [Supplementary Table 2, https://links.lww.com/CM9/B163]. The efficiency of our diagnostic model was not uniform across mediastinal lymph node stations. The AUC was largest for station 7 (0.949) and smallest for stations 3P and 8 (0.751 and 0.816, respectively). The model also showed high diagnostic efficiency for stations 5 and 6, with AUCs of 0.919 and 0.917, respectively [Figure 1E]. Lymph nodes were divided into two subgroups according to the short diameter: ≤10 mm and >10 mm. No statistically significant differences in sensitivity or specificity were detected between these two subgroups (P = 0.351) [Figure 1F]. In the ≤10 mm group, the sensitivity and specificity were 85.3% and 82.1%, respectively, and the AUC was 0.905. In the >10 mm group, the sensitivity and specificity were 72.9% and 98.3%, respectively, and the AUC was 0.901.
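The short-diameter subgroup analysis can be sketched as follows. The records are invented toy values, and the function names are hypothetical; only the 10 mm threshold and the definitions of sensitivity and specificity come from the text and standard usage.

```python
from collections import defaultdict

def diameter_group(short_diameter_mm):
    """Assign a lymph node to the <=10 mm or >10 mm subgroup by short diameter."""
    return "<=10 mm" if short_diameter_mm <= 10 else ">10 mm"

def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels, where 1 = abnormal."""
    pairs = list(zip(labels, predictions))
    tp = sum(y == 1 and p == 1 for y, p in pairs)
    fn = sum(y == 1 and p == 0 for y, p in pairs)
    tn = sum(y == 0 and p == 0 for y, p in pairs)
    fp = sum(y == 0 and p == 1 for y, p in pairs)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec

def score_by_subgroup(records):
    """Score (short_diameter_mm, label, prediction) records per diameter subgroup."""
    groups = defaultdict(lambda: ([], []))
    for diameter, label, pred in records:
        labels, preds = groups[diameter_group(diameter)]
        labels.append(label)
        preds.append(pred)
    return {g: sensitivity_specificity(ys, ps) for g, (ys, ps) in groups.items()}

# Toy records (hypothetical): short diameter in mm, reference label, model prediction.
records = [
    (6.0, 1, 1), (8.5, 1, 0), (9.0, 0, 0), (10.0, 0, 0),
    (12.0, 1, 1), (15.5, 1, 1), (11.0, 0, 1), (18.0, 0, 0),
]
results = score_by_subgroup(records)
print(results)
```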
Finally, we established an assessment cohort of 50 lung cancer patients who underwent surgery and had a pathological diagnosis. Among the 50 patients, 20 were male, and the mean age was 60.52 ± 9.27 years. Pathological examination showed lymph node metastasis in 9 cases. This cohort was used to compare the efficiency of four senior radiologists with that of our proposed algorithm, with the pathological results as the reference standard. The radiologists achieved an AUC of 0.812, with a sensitivity of 0.891 and a specificity of 0.798. The AI model achieved an AUC of 0.823, with a sensitivity of 0.908 and a specificity of 0.811 [Figure 1G]. The model, however, needed only 20–40 s/case for diagnosis, much less than the 300–800 s/case that the radiologists took [Supplementary Table 3, https://links.lww.com/CM9/B163].
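An AUC comparison of this kind can be sketched with the rank-based definition of the AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The scores below are invented for illustration and are not the study data; how the AUCs in the text were actually computed is not stated.

```python
def auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: pathology as the reference (1 = metastasis), scores in [0, 1].
pathology = [1, 0, 0, 1, 0, 0]
model_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.3]
reader_scores = [0.8, 0.3, 0.9, 0.6, 0.2, 0.4]

model_auc = auc(pathology, model_scores)
reader_auc = auc(pathology, reader_scores)
print(model_auc, reader_auc)
```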
Our work demonstrates the feasibility of using a deep learning model in clinical practice to diagnose lymph node metastasis in patients with lung malignancy. Our faster R-CNN model was considerably faster than radiologists (20–40 vs. 300–800 s/case) and achieved a slightly higher AUC (0.823 vs. 0.812) in identifying tumor-involved mediastinal lymph nodes.
Deep learning algorithms for nodule diagnosis have been reported for various diseases, such as lung, breast, cervical, and prostate tumors and glioma. To improve training efficiency, we used faster R-CNN, which was proposed by Ren et al.[3–5] The algorithm integrates feature extraction, region proposal generation, bounding box regression, and classification in a unified network, which greatly improves overall performance, especially detection speed. Compared with traditional CNN models, this algorithm is easier to train, learns deeper features, generalizes better, and is better suited to complex scenes. When reading CT, doctors go through the images layer by layer to detect and classify lymph nodes and then determine the locations of abnormal ones. This process is similar to that of faster R-CNN.
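The bounding box regression step can be illustrated with the box parameterization defined in the faster R-CNN paper [3], in which a box is expressed as offsets relative to an anchor. This is a minimal sketch; the function names and the example coordinates are hypothetical, and it says nothing about the anchor settings used for the model in this letter.

```python
import math

def encode_box(box, anchor):
    """Encode a box relative to an anchor; both are (center_x, center_y, width, height)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))

def decode_box(offsets, anchor):
    """Invert encode_box: turn regression offsets back into a box."""
    tx, ty, tw, th = offsets
    xa, ya, wa, ha = anchor
    return (xa + tx * wa, ya + ty * ha, wa * math.exp(tw), ha * math.exp(th))

# Hypothetical anchor and annotated lymph node box, in pixels.
anchor = (100.0, 100.0, 32.0, 32.0)
lymph_node_box = (108.0, 96.0, 24.0, 40.0)

offsets = encode_box(lymph_node_box, anchor)
restored = decode_box(offsets, anchor)
print(offsets)
print(restored)
```

The network is trained to predict the offsets, and decoding them against the anchor yields the refined location of the detected lymph node.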
Compared with other studies, our deep learning model shows higher efficiency. In addition, our model was built with a larger sample collected from five medical centers and applies to all stages of lung cancer, with higher accuracy and lower false-positive and false-negative rates. To improve the efficiency of learning, we used a binary classification method in model training, which differs from previous studies. During training, we provided labels for both abnormal and normal samples. When a lymph node was detected, the model estimated the probability that it was abnormal and the probability that it was normal, and then output the class with the higher probability (for example, if a lymph node was assessed as having a 55% probability of being abnormal and a 45% probability of being normal, the model reported it as abnormal with a probability of 0.55). The binary classification method improved the training efficiency and accuracy of the model.
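The output rule in the example above can be sketched as follows. The use of a softmax over two class scores is an assumption about how the probabilities are produced, and the names `CLASSES` and `classify` are hypothetical.

```python
import math

CLASSES = ("normal", "abnormal")

def softmax(logits):
    """Convert raw class scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Return the more probable class and its probability."""
    probs = softmax(logits)
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    return CLASSES[best], probs[best]

# Scores chosen so that the abnormal probability is 0.55, as in the example in the text.
label, prob = classify([0.0, math.log(0.55 / 0.45)])
print(label, round(prob, 2))
```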
Unlike other studies, we performed subgroup analyses. Our model showed different detection efficiency for each lymph node station, which is most likely related to the different sample sizes of lymph nodes in each group. In addition, high detection efficiency was observed both in small (≤10 mm) lymph nodes and in larger ones (>10 mm). These results indicate that the efficiency of our model needs further improvement.
Our research has shortcomings. First, the gold standard, pathology, was not used to diagnose metastatic lymph nodes in the training and validation datasets. Instead, we built a uniform imaging diagnostic criterion for the training database because we aimed to compare the diagnostic efficiency of our faster R-CNN model with that of radiologists. The other drawback comes from the mechanism of faster R-CNN itself. In contrast to the logical way in which humans learn, the learning strategy of a CNN is to connect the final result directly to image characteristics, which requires training on a large sample to achieve high efficiency and accuracy. To improve the learning effect, we labeled the lymph nodes as normal or abnormal and classified them by station and diameter. Despite all this, our model is inefficient for lesions without distinct characteristics.
We thank International Science Editing (http://www.internationalscienceediting.com) for editing this manuscript.
This work was supported by a grant from the Natural Science Foundation of Shandong Province (No. ZR2020HM234).
Conflicts of interest
1. Zhong W, Yang X, Bai J, Yang J, Manegold C, Wu Y. Complete mediastinal lymphadenectomy: the core component of the multidisciplinary therapy in resectable non-small cell lung cancer. Eur J Cardiothorac Surg 2008; 34:187–195. doi: 10.1016/j.ejcts.2008.03.060.
2. Takano N, Ariyasu R, Koyama J, Sonoda T, Saiki M, Kawashima Y, et al. Improvement in the survival of patients with stage IV non-small-cell lung cancer: experience in a single institutional 1995–2017. Lung Cancer 2019; 131:69–77. doi: 10.1016/j.lungcan.2019.03.008.
3. Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 2017; 39:1137–1149. doi: 10.1109/TPAMI.2016.2577031.
4. Christiansen P, Nielsen LN, Steen KA, Jorgensen RN, Karstoft H. DeepAnomaly: combining background subtraction and deep learning for detecting obstacles and anomalies in an agricultural field. Sensors (Basel) 2016; 16:1904. doi: 10.3390/s16111904.
5. Spasov S, Passamonti L, Duggento A, Lio P, Toschi N; Alzheimer's Disease Neuroimaging Initiative. A parameter-efficient deep learning approach to predict conversion from mild cognitive impairment to Alzheimer's disease. Neuroimage 2019; 189:276–287. doi: 10.1016/j.neuroimage.2019.01.031.