Thoracic Imaging

Integrative Predictive Models of Computed Tomography Texture Parameters and Hematological Parameters for Lymph Node Metastasis in Lung Adenocarcinomas

Chen, Wenping MD; Xu, Mengying MS; Sun, Yiwen MD; Ji, Changfeng MD; Chen, Ling MD; Liu, Song MD; Zhou, Kefeng MD; Zhou, Zhengyang MD, PhD

Journal of Computer Assisted Tomography: March/April 2022 - Volume 46 - Issue 2 - p 315-324
doi: 10.1097/RCT.0000000000001264

Abstract

Lung carcinoma is the second most commonly diagnosed cancer and the leading cause of cancer-related deaths worldwide.1 The most common path of metastasis in patients with nonsmall cell lung cancer (NSCLC) is through the lymph node (LN). The National Comprehensive Cancer Network guidelines (2020) state that the presence of mediastinal LN metastasis has a profound impact on prognosis and treatment decisions.2 For medically operable disease, resection is the preferred local treatment modality. Thorough dissection of metastatic mediastinal LNs during surgery plays a key role in improving disease-free survival and overall survival rates.3 Therefore, accurate preoperative evaluation of LN metastasis in NSCLC is necessary.

Currently, there are many methods to preoperatively assess the LN status in NSCLC. However, invasive methods, including endobronchial ultrasonography-guided transbronchial needle aspiration and thoracoscopy, are not routinely performed.4,5 Noninvasive methods include computed tomography (CT), positron emission tomography–CT, and magnetic resonance imaging. Positron emission tomography–CT has high misdiagnosis and false-negative rates for LN metastasis relative to the final pathological staging after complete nodal dissection (the criterion standard).6 Magnetic resonance imaging requires a long scan time. On CT, LNs greater than 1 cm in short-axis diameter are considered metastatic. However, the accuracy of preoperative CT scanning in distinguishing LN status is too low for adequate preoperative staging.7,8 Lymph node metastasis can be missed on CT because metastatic N2 nodes may be normal in size.

Most studies have reported a significant association between LN metastasis and the radiological features of the primary tumor. Zhao et al9 reported that tumor size greater than 2.65 cm was an independent predictor of LN metastasis. Moreover, several studies using texture analysis describe a correlation between primary tumor and LN metastasis. Bayanati et al10 confirmed the potential of CT texture analysis for accurately differentiating malignant from benign mediastinal nodes in lung cancer. In addition, several studies reported that hematological inflammatory biomarkers could be used to predict the tumor–node–metastasis stage of lung cancer.11 Xu et al12 showed that the neutrophil-to-lymphocyte ratio (NLR) in lung cancer may be an independent predictive marker for the N stage. However, only a few studies have established an integrative model based on radiological features, texture, and hematological parameters to predict LN metastasis.

Recently, with the widespread application of regression models and the development of machine learning algorithms, multivariate model evaluation methods have also matured. Therefore, our study aimed to incorporate the radiological features, texture, and hematological parameters to establish predictive models for LN metastasis for lung adenocarcinomas.

MATERIAL AND METHODS

Patients

This retrospective study was approved by the local ethics committee, and the requirement for informed consent was waived. Patients treated at our hospital between February 2017 and April 2019 who met the following criteria were retrospectively collected and analyzed. The inclusion criteria were as follows: (1) patients who underwent radical resection of lung cancer with systematic LN dissection; (2) postoperative pathology confirmed as lung adenocarcinoma; (3) complete preoperative information on clinical data and CT images; and (4) single lesion. The exclusion criteria were as follows: (1) image quality was poor and could not be used for analysis; (2) patients who had received radiotherapy and/or chemotherapy before surgery; (3) patients who had a history of malignancy at other sites; and (4) patients who underwent CT examination more than 1 month before the surgery (Fig. 1).

FIGURE 1:
Flowchart shows the patient selection process.

Finally, 207 patients (91 men and 116 women; age range, 36–85 years; mean age, 60.5 years) were enrolled in this study. The patients were divided into 3 cohorts (1 training cohort and 2 validation cohorts) in a ratio of 3:1:1.
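The 3:1:1 allocation can be sketched as follows. The source does not say how patients were assigned to cohorts, so the per-class (stratified) random split, the function name `split_3_1_1`, and the seed are assumptions made only for illustration; the resulting cohort sizes need not match those reported in Table 1.

```python
import random

def split_3_1_1(labels, seed=0):
    """Return (train, val1, val2) index lists in roughly a 3:1:1 ratio.

    Each class is shuffled and split separately so that LN-positive cases
    are spread across all three cohorts (an assumption, not the paper's
    documented procedure).
    """
    rng = random.Random(seed)
    cohorts = ([], [], [])
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(len(idx) * 3 / 5)
        n_val1 = round(len(idx) * 1 / 5)
        cohorts[0].extend(idx[:n_train])
        cohorts[1].extend(idx[n_train:n_train + n_val1])
        cohorts[2].extend(idx[n_train + n_val1:])
    return cohorts

# 50 LN-positive and 157 LN-negative cases, the totals reported in the Results.
labels = [1] * 50 + [0] * 157
train, val1, val2 = split_3_1_1(labels)
```

Splitting within each class keeps the proportion of LN-positive cases similar in every cohort, which matters when only about a quarter of cases are positive.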

Hematological Test

Hematological parameters, including white blood cell (WBC) count, lymphocyte count, neutrophil count, monocyte count (MONO), red blood cell count, platelet count, hemoglobin, red blood cell distribution width (RDW), C-reactive protein, albumin (ALB), and globulin, were recorded within 2 weeks before the surgery. Based on the previously mentioned parameters, the hemoglobin/RDW ratio, NLR, lymphocyte-to-monocyte ratio (LMR), platelet-to-lymphocyte ratio, C-reactive protein-to-albumin ratio (CAR), and platelet-to-monocyte ratio (PMR) were calculated.
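As a minimal sketch, the derived ratios listed above can be computed directly from the recorded counts. The function name, argument names, and example values are hypothetical, and the units noted in the comments are assumptions, since the source states units only for some parameters.

```python
def hematological_ratios(neutrophils, lymphocytes, monocytes, platelets,
                         hemoglobin, rdw, crp, albumin):
    """Derived ratios named in the text.

    Cell and platelet counts are assumed to be in 10^9/L; the units of
    hemoglobin, RDW, C-reactive protein, and albumin are assumptions.
    """
    return {
        "NLR": neutrophils / lymphocytes,    # neutrophil-to-lymphocyte ratio
        "LMR": lymphocytes / monocytes,      # lymphocyte-to-monocyte ratio
        "PLR": platelets / lymphocytes,      # platelet-to-lymphocyte ratio
        "PMR": platelets / monocytes,        # platelet-to-monocyte ratio
        "CAR": crp / albumin,                # C-reactive protein-to-albumin ratio
        "Hb/RDW": hemoglobin / rdw,          # hemoglobin/RDW ratio
    }

# Hypothetical values for one patient, for illustration only.
example = hematological_ratios(neutrophils=3.2, lymphocytes=1.6, monocytes=0.4,
                               platelets=220.0, hemoglobin=135.0, rdw=13.0,
                               crp=3.0, albumin=42.0)
```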

Computed Tomography Examination

Chest CT images were acquired using 16- or 64-row multidetector spiral CT (VCT 64 or Discovery HD 750, GE Healthcare; iCT 256 or Ingenuity Flex 16, Philips Healthcare; or uCT780, United Imaging, China). The CT scan parameters were as follows: tube voltage, 120 kV; tube current, automatic; rotation time, 0.7 seconds; and matrix, 512 × 512. The CT images were scanned at 5-mm section thickness and reconstructed with a 1.25-mm section thickness. The flow diagram of our study is shown in Figure 2.

FIGURE 2:
Workflow of key steps in our study. Polygonal regions of interest on the axial CT section are manually drawn. Hematological features are collected. Computed tomography characteristics and texture parameters are extracted from the defined tumor regions of CT images. The logistic regression and machine learning algorithms are used to construct predictive models. The prediction models are established incorporating CT characteristics, texture parameters, and hematological parameters. The performance of the multivariate models is evaluated using the ROC curve. NEU, neutrophil count; LYM, lymphocyte count; RBC, red blood cell; PLT, platelet count; CRP, C-reactive protein; LASSO, least absolute shrinkage and selection operator. Figure 2 can be viewed online in color at www.jcat.org.

Imaging Analysis

Computed Tomography Morphological Characteristics

Readers 1 and 2 (both with 5 years of experience in chest CT diagnosis) evaluated each lesion on the CT images. Any disagreements were resolved by consensus after discussion. All the CT images were reviewed using the lung and mediastinal window settings in the image processing software. Computed tomography morphological characteristics included: (1) border; (2) attenuation; (3) lobulation; (4) spiculation; (5) calcification; (6) vascular convergence sign; (7) air bronchogram sign; (8) vacuole sign; (9) nodule/mass type; (10) adjacent pleural thickening; (11) pleural indentation; (12) obstructive pulmonary emphysema; (13) peripheral fibrosis; (14) pleural effusion; (15) single enlarged LN; (16) multiple enlarged LNs; and (17) calcified LNs.

Quantitative CT Value Parameters

Quantitative CT values were measured by reader 1 on the maximal section of each lesion, avoiding calcification, vacuoles, cavities, and bronchial shadows. The mean, maximum, and minimum CT attenuations in the nonenhanced phase were recorded as CTmean, CTmax, and CTmin, respectively. The corresponding standard deviation (SD) value was recorded as SD1. In addition, the long and short diameters of the lesions were measured and recorded. To determine interobserver reproducibility, reader 2 repeated the same procedure.

Computed Tomography Texture Parameters

Polygonal regions of interest in the nonenhanced CT images were manually drawn by reader 1 along the margin of the lesion on the largest cross-section, avoiding calcification, vacuoles, cavities, and bronchial shadows. Texture parameters were as follows: (1) the first-order features included the mean, SD2, max frequency, mode, minimum, maximum, cumulative percentiles (5th, 10th, 25th, 50th, 75th, and 90th percentiles), skewness, kurtosis, entropy, and histogram width; (2) the second-order features were from the gray-level co-occurrence matrix (GLCM) and included entropy GLCM, energy GLCM, inertia GLCM, and variance GLCM. To confirm interobserver reproducibility, reader 2 repeated the same procedure.
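To make the second-order features concrete, the following sketch builds a GLCM for one pixel offset and derives entropy, energy, inertia, and variance from it using common textbook definitions. The source does not name the texture software, its gray-level quantization, or the offsets behind the labels GLCM 10 to 13, so the horizontal offset, the symmetric counting, and the tiny example images are all assumptions.

```python
import math

def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized co-occurrence matrix as a dict {(i, j): probability}.

    `image` is a 2-D list of already-quantized gray levels; (dx, dy) is the
    pixel offset defining which neighbour is paired with each pixel.
    """
    counts = {}
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pair = (image[r][c], image[r2][c2])
                counts[pair] = counts.get(pair, 0) + 1
                if symmetric:
                    rev = (pair[1], pair[0])
                    counts[rev] = counts.get(rev, 0) + 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def glcm_features(p):
    """Entropy, energy, inertia (contrast), and variance of a normalized GLCM."""
    entropy = -sum(v * math.log2(v) for v in p.values() if v > 0)
    energy = sum(v * v for v in p.values())
    inertia = sum((i - j) ** 2 * v for (i, j), v in p.items())
    mean_i = sum(i * v for (i, _), v in p.items())
    variance = sum((i - mean_i) ** 2 * v for (i, _), v in p.items())
    return {"entropy": entropy, "energy": energy,
            "inertia": inertia, "variance": variance}

# Two hypothetical 3 x 3 regions: one perfectly uniform, one heterogeneous.
uniform = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
mixed = [[0, 3, 1], [2, 0, 3], [1, 2, 0]]
f_uniform = glcm_features(glcm(uniform))
f_mixed = glcm_features(glcm(mixed))
```

The uniform region gives the maximum energy of 1 and zero entropy, whereas the heterogeneous region gives lower energy and higher entropy, which is the direction of difference reported between the LN-negative and LN-positive groups.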

Development, Performance, and Testing of Multivariate Models

First, in the training cohort, variables with significant differences (P < 0.05) in the univariate analysis were entered into multivariate binomial logistic regression. The Hosmer-Lemeshow test was used to assess goodness of fit. The resulting multivariate model was then applied to the 2 validation cohorts.

Next, variables that were significant (P < 0.05) in the univariate analysis of the training cohort underwent least absolute shrinkage and selection operator analysis for dimension reduction. The retained features were input into our in-house software programmed using the Python Scikit-learn package (Python version 3.8, Scikit-learn version 0.22.2, http://scikit-learn.org/). Support vector machine (SVM), Naive Bayes (NB), and random forest (RF) classifiers were used to generate multivariate models. The ratio of cases in the training cohort and the 2 validation cohorts was 3:1:1. In the training phase, the Synthetic Minority Oversampling Technique, a widely used data preprocessing method in machine learning, was applied to address the class imbalance problem. The models were evaluated by repeated stratified (K = 5) cross-validation. The multivariate models based on machine learning classifiers were then applied to the 2 validation cohorts. The performance of the multivariate models was evaluated using receiver operating characteristic (ROC) curves, and the area under the curve (AUC), diagnostic sensitivity, specificity, and accuracy were determined. Furthermore, to evaluate the clinical usefulness of the multivariate model, a decision curve analysis was performed by calculating the net benefits for a range of threshold probabilities in the 2 validation cohorts.
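The idea behind the Synthetic Minority Oversampling Technique can be illustrated with a short, dependency-free sketch: each synthetic case is placed at a random point on the line segment between a minority-class case and one of its nearest minority-class neighbours. This is not the authors' in-house code; the function name `smote`, the default `k=5`, and the toy feature vectors are assumptions.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from a list of minority feature vectors.

    Requires at least 2 minority samples. Each synthetic sample interpolates
    between a randomly chosen base sample and one of its k nearest
    minority-class neighbours (squared Euclidean distance).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class with two hypothetical features per case.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_cases = smote(minority, n_new=6, k=2)
```

Oversampling of this kind is normally applied only to the training data (or to the training folds inside cross-validation), so that the validation cohorts keep their original class balance.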

Statistical Analysis

Statistical analyses were performed using SPSS (version 22.0, Microsoft Windows x64; SPSS) and MedCalc Statistical Software (version 11.4.2.0, MedCalc Software bvba; http://www.medcalc.org; 2011), and a 2-tailed P value less than 0.05 was defined as statistically significant. The χ² test or Fisher exact test (n < 5) was used for categorical variables. Continuous variables, including hematological parameters, CT value parameters, texture parameters, and the long and short diameters of the lesion, were tested for their normality using the Shapiro-Wilk test, and accordingly, the Mann-Whitney U test was used for nonnormally distributed variables. The interobserver agreement of CT values and texture parameters was estimated using the intraclass correlation coefficient (ICC; 0.000–0.200: poor; 0.201–0.400: fair; 0.401–0.600: moderate; 0.601–0.800: good; and 0.801–1.000: excellent).
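The ICC grading scale quoted above can be written as a small lookup, shown here as a sketch. The function name is hypothetical, and treating values below 0 as poor is an assumption, because the stated scale starts at 0.000.

```python
def icc_category(icc):
    """Map an ICC value to the agreement grade used in this study."""
    if icc > 1.0:
        raise ValueError("ICC cannot exceed 1")
    if icc <= 0.200:
        return "poor"       # includes negative ICCs (assumption)
    if icc <= 0.400:
        return "fair"
    if icc <= 0.600:
        return "moderate"
    if icc <= 0.800:
        return "good"
    return "excellent"

# ICC values taken from Table 8: CTmean, CTmax, SD1, and skewness.
grades = [icc_category(v) for v in (0.891, 0.254, 0.697, 0.537)]
```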

RESULTS

Patient Characteristics

Among the 207 lung adenocarcinoma cases, 50 (24.2%) had LN metastasis, while 157 (75.8%) did not. As shown in Table 1, a statistically significant difference was found in sex between patients with and without LN metastasis in the training cohort (P < 0.05). No significant differences in age were found in the training cohort (P > 0.05). No significant differences in sex or age were found between patients with and without LN metastasis in either validation cohort (all P > 0.05).

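TABLE 1 - Demographic and CT Morphological Characteristics of Patients With Lung Adenocarcinomas

| Variable | Training Cohort N− | Training Cohort N+ | Training Cohort P | Validation Cohort 1 N− | Validation Cohort 1 N+ | Validation Cohort 1 P | Validation Cohort 2 N− | Validation Cohort 2 N+ | Validation Cohort 2 P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age, y | | | 0.626 | | | 0.987 | | | 0.761 |
| <60 | 44 | 14 | | 18 | 4 | | 13 | 3 | |
| ≥60 | 44 | 21 | | 17 | 3 | | 21 | 5 | |
| Sex | | | 0.021* | | | 0.413 | | | 0.433 |
| Female | 60 | 16 | | 13 | 4 | | 20 | 3 | |
| Male | 28 | 19 | | 22 | 3 | | 14 | 5 | |
| Pleural indentation | | | 0.048* | | | 0.244 | | | 1.000 |
| Absent | 28 | 5 | | 10 | 1 | | 6 | 1 | |
| Present | 60 | 30 | | 25 | 6 | | 28 | 7 | |
| Pleural thickening | | | 0.003* | | | 0.631 | | | 0.237 |
| Absent | 69 | 18 | | 27 | 4 | | 31 | 6 | |
| Present | 19 | 17 | | 8 | 3 | | 3 | 2 | |
| Air bronchogram | | | 0.016* | | | 0.353 | | | 0.238 |
| Absent | 25 | 18 | | 7 | 2 | | 15 | 6 | |
| Present | 63 | 17 | | 28 | 5 | | 19 | 2 | |
| Attenuation | | | 0.003* | | | 0.654 | | | 0.173 |
| GGO | 11 | 1 | | 4 | 0 | | 2 | 0 | |
| Part solid | 30 | 4 | | 15 | 1 | | 15 | 1 | |
| Solid | 47 | 30 | | 16 | 6 | | 17 | 7 | |

*P < 0.05 was considered statistically significant.
GGO indicates ground-glass opacity.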

Univariate Analyses

Among the CT qualitative parameters, attenuation, pleural indentation, pleural thickening, and air bronchogram were significantly different between patients with and without LN metastasis in the training cohort (all P < 0.05; Table 1, Fig. 3).

FIGURE 3:
Typical morphological features of lung adenocarcinomas on CT images. A, Ground-glass nodule of the right upper lobe without solid component (white arrow). B, Part solid nodule of the right lower lobe (white arrow). C, Solid mass of the right lower lobe with 2 linear pleural tags (black arrows) and spiculation sign (white arrow). D, Spiculated homogeneous solid mass of right lower lobe with vacuoles (white arrow). E, Irregular mass of the left upper lobe with large cavitation (white arrow). F, Irregular mass of the right upper lobe with lobulated margins and air bronchiolograms (white arrow).

Among the CT quantitative parameters, there were significant differences in the mean CT attenuation, minimum CT attenuation, SD1, long diameter, and short diameter between the different LN statuses in the training cohort (all P < 0.05).

Among the texture parameters, 24 of 35 were significantly different between patients with and without LN metastasis in the training cohort (all P < 0.05; Table 2). There were significant differences in values of MONO, RDW, NLR, LMR, and PMR between patients with and without LN metastasis in the training cohort (Table 3).

TABLE 2 - Univariate Analysis of Quantitative CT and Texture Parameters in the Training Cohort

| Variable | N− | N+ | P |
| --- | --- | --- | --- |
| CTmean, HU | −16.85 (−202.40 to 30.88) | 31.01 (8.76 to 49.51) | <0.001* |
| CTmin, HU | −811.00 (−1024.00 to −198.25) | −163.00 (−330.00 to −100.00) | <0.001* |
| SD1 | 185.73 (64.83 to 273.49) | 68.35 (47.69 to 96.16) | <0.001* |
| Long diameter, cm | 2.26 (1.60 to 3.00) | 2.76 (2.00 to 3.75) | 0.018* |
| Short diameter, cm | 1.60 (1.30 to 2.28) | 2.49 (1.50 to 3.03) | 0.003* |
| Mean, HU | 20.89 (−146.96 to 37.49) | 35.09 (28.10 to 57.51) | 0.002* |
| SD2 | 86.77 (62.36 to 192.88) | 60.15 (51.06 to 79.43) | <0.001* |
| Max frequency | 5.00 (3.00 to 8.00) | 11.00 (6.00 to 16.00) | <0.001* |
| Mode, HU | 7.5 (−175.75 to 44.50) | 37.00 (9.00 to 64.00) | 0.010* |
| Min, HU | −306.50 (−724.50 to −164.25) | −177.00 (−256.00 to −125.00) | 0.001* |
| Percentile 5th, HU | −115.00 (−516.50 to −57.50) | −56.00 (−98.00 to −37.00) | <0.001* |
| Percentile 10th, HU | −72.00 (−433.25 to −36.25) | −33.00 (−66.00 to −19.00) | 0.001* |
| Percentile 25th, HU | −18.50 (−317.25 to 0.75) | −1.00 (−19.00 to 18.00) | 0.002* |
| Percentile 50th, HU | 23.50 (−115.25 to 40.00) | 34.00 (27.00 to 59.00) | 0.008* |
| Area, mm² | 154.02 (84.78 to 260.24) | 318.39 (168.56 to 580.39) | <0.001* |
| Max diameter, mm | 18.26 (13.51 to 26.00) | 25.27 (16.82 to 35.22) | 0.004* |
| Histogram width, HU | 217.00 (148.00 to 495.50) | 153.00 (122.00 to 215.00) | <0.001* |
| Entropy GLCM 10 | 7.42 (5.55 to 8.23) | 8.63 (7.63 to 9.26) | <0.001* |
| Entropy GLCM 11 | 7.60 (5.90 to 8.29) | 8.62 (7.77 to 9.33) | <0.001* |
| Entropy GLCM 12 | 7.46 (5.80 to 8.24) | 8.57 (7.69 to 9.18) | <0.001* |
| Entropy GLCM 13 | 7.61 (6.03 to 8.42) | 8.72 (7.88 to 9.45) | <0.001* |
| Energy GLCM 10† | 6.33 (3.54 to 21.68) | 2.87 (1.81 to 5.19) | <0.001* |
| Energy GLCM 11† | 5.51 (3.48 to 17.47) | 2.91 (1.82 to 4.68) | <0.001* |
| Energy GLCM 12† | 6.00 (3.56 to 18.25) | 3.05 (1.93 to 5.10) | <0.001* |
| Energy GLCM 13† | 5.41 (3.21 to 15.95) | 2.60 (1.61 to 4.66) | <0.001* |
| Inertia GLCM 10 | 270.55 (204.91 to 515.96) | 198.34 (150.62 to 336.28) | 0.013* |
| Variance GLCM 10 | 150.90 (110.97 to 237.46) | 124.04 (87.15 to 165.90) | 0.008* |
| Variance GLCM 11 | 167.42 (111.73 to 237.04) | 123.72 (87.04 to 200.13) | 0.027* |
| Variance GLCM 13 | 170.53 (118.31 to 260.85) | 132.15 (95.72 to 195.09) | 0.030* |

The data are presented as median (1st quartile to 3rd quartile).
*P < 0.05 was considered statistically significant.
†×10⁻³.
SD1 indicates standard deviation in quantitative CT parameters; SD2, standard deviation in texture parameters.

TABLE 3 - Univariate Analysis of Hematological Parameters in the Training Cohort

| Variable | N− | N+ | P |
| --- | --- | --- | --- |
| MONO count, 10⁹/L | 0.30 (0.30 to 0.48) | 0.40 (0.30 to 0.60) | 0.009* |
| RDW, % | 12.85 (12.50 to 13.30) | 13.20 (12.80 to 13.80) | 0.020* |
| NLR | 1.63 (1.25 to 2.19) | 1.94 (1.43 to 2.79) | 0.042* |
| LMR | 5.00 (4.04 to 6.92) | 3.83 (2.60 to 6.00) | 0.008* |
| PMR | 533.33 (436.25 to 789.72) | 434.00 (263.75 to 650.00) | 0.008* |

The data are presented as median (1st quartile to 3rd quartile).
*P < 0.05 was considered statistically significant.

Among the quantitative CT parameters in the training cohort, CTmean had the highest AUC value (0.719), with a sensitivity of 80.0% and a specificity of 61.4%. Among the texture parameters, max frequency had the highest AUC for predicting LN metastasis (0.736), with a sensitivity of 60.0% and a specificity of 77.3% in the training cohort (Table 4). Among the 5 significant hematological parameters, platelet-to-monocyte ratio had the highest AUC value (0.852), with a sensitivity of 57.1% and a specificity of 72.7% in the training cohort (Table 5, Fig. 4).
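The single-parameter metrics reported here can be sketched in a few lines. The AUC is computed as the probability that a randomly chosen LN-positive case scores higher than a randomly chosen LN-negative case, which is equivalent to the area under the ROC curve. The function names and example scores are hypothetical, and whether the study's cutoffs are inclusive is not stated, so the strict inequality below is an assumption. For parameters that are lower in LN-positive patients (for example LMR), the comparison direction would be reversed.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def sens_spec(scores_pos, scores_neg, cutoff):
    """Sensitivity and specificity when scores strictly above the cutoff are called positive."""
    tp = sum(s > cutoff for s in scores_pos)
    tn = sum(s <= cutoff for s in scores_neg)
    return tp / len(scores_pos), tn / len(scores_neg)

def accuracy_from(sens, spec, n_pos, n_neg):
    """Overall accuracy implied by sensitivity, specificity, and group sizes."""
    return (sens * n_pos + spec * n_neg) / (n_pos + n_neg)

# Hypothetical scores for 5 LN-positive and 5 LN-negative cases.
pos = [11, 16, 6, 12, 9]
neg = [5, 3, 8, 4, 9]
auc = roc_auc(pos, neg)
sens, spec = sens_spec(pos, neg, cutoff=8)

# Consistency check against Table 4: CTmean in the training cohort
# (35 LN-positive, 88 LN-negative; sensitivity 80.0%, specificity 61.4%).
ctmean_accuracy = accuracy_from(0.800, 0.614, n_pos=35, n_neg=88)
```

With the training cohort sizes from Table 1, the CTmean sensitivity and specificity imply an accuracy of about 66.7%, matching the value listed in Table 4.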

TABLE 4 - The Diagnostic Performance of Quantitative CT and Texture Parameters in the Training Cohort

| Variable | Cutoff | Sensitivity | Specificity | AUC | Accuracy | P |
| --- | --- | --- | --- | --- | --- | --- |
| CTmean, HU | 2.90 | 80.0% | 61.4% | 0.719 | 66.7% | <0.001* |
| CTmin, HU | −344.00 | 77.1% | 67.0% | 0.716 | 69.9% | <0.001* |
| SD1 | 130.80 | 82.9% | 61.4% | 0.717 | 67.5% | <0.001* |
| Long diameter, cm | 2.32 | 71.4% | 53.4% | 0.638 | 58.5% | 0.018* |
| Short diameter, cm | 2.40 | 51.4% | 81.8% | 0.674 | 73.1% | 0.003* |
| Mean, HU | 22.08 | 85.7% | 51.1% | 0.676 | 60.9% | 0.002* |
| SD2 | 115.83 | 94.3% | 43.2% | 0.716 | 57.7% | <0.001* |
| Max frequency | 8.00 | 60.0% | 77.3% | 0.736 | 72.4% | <0.001* |
| Mode, HU | 23.00 | 68.6% | 61.4% | 0.649 | 63.4% | 0.010* |
| Min, HU | −257.00 | 77.1% | 58.0% | 0.699 | 63.4% | 0.001* |
| Percentile 5th, HU | −163.00 | 91.4% | 45.5% | 0.709 | 58.6% | <0.001* |
| Percentile 10th, HU | −151.00 | 94.3% | 40.9% | 0.698 | 56.1% | 0.001* |
| Percentile 25th, HU | −51.00 | 91.4% | 39.8% | 0.675 | 54.5% | 0.002* |
| Percentile 50th, HU | 24.00 | 85.7% | 51.1% | 0.654 | 60.9% | 0.008* |
| Area, mm² | 260.54 | 60.0% | 76.1% | 0.702 | 71.5% | <0.001* |
| Max diameter, mm | 20.42 | 71.4% | 58.0% | 0.667 | 61.8% | 0.004* |
| Histogram width, HU | 246.00 | 91.4% | 46.6% | 0.706 | 59.3% | <0.001* |
| Entropy GLCM 10 | 8.56 | 57.1% | 84.1% | 0.721 | 76.4% | <0.001* |
| Entropy GLCM 11 | 8.54 | 57.1% | 84.1% | 0.722 | 76.4% | <0.001* |
| Entropy GLCM 12 | 8.29 | 60.0% | 79.5% | 0.723 | 74.0% | <0.001* |
| Entropy GLCM 13 | 8.56 | 60.0% | 83.0% | 0.723 | 76.5% | <0.001* |
| Energy GLCM 10† | 3.20 | 60.0% | 80.7% | 0.714 | 74.8% | <0.001* |
| Energy GLCM 11† | 3.00 | 60.0% | 83.0% | 0.718 | 76.5% | <0.001* |
| Energy GLCM 12† | 5.30 | 82.9% | 56.8% | 0.722 | 64.2% | <0.001* |
| Energy GLCM 13† | 2.80 | 60.0% | 83.0% | 0.720 | 76.5% | <0.001* |
| Inertia GLCM 10 | 198.33 | 51.4% | 78.4% | 0.644 | 70.7% | 0.013* |
| Variance GLCM 10 | 96.83 | 40.0% | 88.6% | 0.654 | 74.8% | 0.008* |
| Variance GLCM 11 | 104.89 | 40.0% | 85.2% | 0.628 | 72.3% | 0.027* |
| Variance GLCM 13 | 107.43 | 40.0% | 81.8% | 0.625 | 69.9% | 0.030* |

*P < 0.05 was considered statistically significant.
†×10⁻³.
SD1 indicates standard deviation in quantitative CT parameters; SD2, standard deviation in texture parameters.

TABLE 5 - The Diagnostic Performance of Hematological Parameters in the Training Cohort

| Variable | Cutoff | Sensitivity | Specificity | AUC | Accuracy | P |
| --- | --- | --- | --- | --- | --- | --- |
| MONO count, 10⁹/L | 0.30 | 65.7% | 54.5% | 0.648 | 57.7% | 0.009* |
| RDW, % | 13.30 | 48.6% | 78.4% | 0.634 | 69.9% | 0.020* |
| NLR | 2.52 | 42.9% | 84.1% | 0.618 | 72.4% | 0.042* |
| LMR | 4.00 | 57.1% | 75.0% | 0.653 | 69.9% | 0.008* |
| PMR | 457.50 | 57.1% | 72.7% | 0.852 | 68.3% | 0.008* |

*P < 0.05 was considered statistically significant.

FIGURE 4:
The histogram shows hematological parameters by LN status. *P < 0.05 was considered statistically significant. CAR, C-reactive protein-to-albumin ratio; PLR, platelet-to-lymphocyte ratio.

Multivariate Analyses

Variables with significant differences (P < 0.05) in the univariate analysis were subjected to binary logistic regression analysis in the training cohort. The results demonstrated that pleural thickening (P = 0.013), percentile 25th (P = 0.033), entropy GLCM 10 (P = 0.019), RDW (P = 0.012), and LMR (P = 0.049) were independent risk factors associated with LN metastasis (Table 6). These 5 independent risk factors were chosen to establish the predictive model. The ROC curve results showed that the AUC of the predictive model was 0.929 (Fig. 5), which was higher than that of any single parameter. The model was tested in the 2 validation cohorts, and the AUCs were 0.886 and 0.871, respectively (Table 7, Supplementary Table 1, https://links.lww.com/RCT/A136). Decision curve analysis results for the multivariate models in the 2 validation cohorts are plotted in Figure 6.

TABLE 6 - Binomial Logistic Regression Results for Prediction of LN Metastasis in Lung Adenocarcinomas

| Variable | B | SE | Wald | df | P |
| --- | --- | --- | --- | --- | --- |
| Pleural thickening | 2.234 | 0.900 | 6.165 | 1 | 0.013* |
| Percentile 25th, HU | 0.275 | 0.129 | 4.533 | 1 | 0.033* |
| Entropy GLCM 10 | −16.428 | 7.024 | 5.470 | 1 | 0.019* |
| RDW, % | 1.278 | 0.506 | 6.375 | 1 | 0.012* |
| LMR | −0.580 | 0.294 | 3.889 | 1 | 0.049* |
| Predictive model | −7.659 | 10.685 | 0.514 | 1 | 0.473 |

*P < 0.05 was considered statistically significant.
B indicates the estimated value of the regression coefficient given by the statistical software; df, degree of freedom; SE, standard error.
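For orientation, a binary logistic regression model of this kind has the general form below, where p is the predicted probability of LN metastasis, x_1 to x_5 are the 5 retained predictors, and the beta values are the coefficients reported as B. The source does not state how each predictor was coded or scaled, nor whether the row labeled "Predictive model" is the constant term, so the expression is given symbolically rather than with the tabulated numbers.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{k=1}^{5} \beta_k x_k,
\qquad
p = \frac{1}{1 + \exp\!\left(-\beta_0 - \sum_{k=1}^{5} \beta_k x_k\right)}
```

A positive coefficient increases the predicted probability as the predictor increases, and a negative coefficient decreases it.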

FIGURE 5:
Receiver operating characteristic analysis to predict LN metastasis in lung adenocarcinomas. The values of AUCs for pleural thickening, percentile 25th, entropy GLCM 10, RDW, LMR, and predictive model were 0.635, 0.675, 0.721, 0.634, 0.653, and 0.929, respectively. The predictive model performed better in predicting LN metastasis than the univariate parameters. PRE, prediction probability. Figure 5 can be viewed online in color at www.jcat.org.

TABLE 7 - The Diagnostic Performance (AUC) of the Models in the Training and 2 Validation Cohorts

| Cohort | Logistic Regression | SVM | Naive Bayes | Random Forest |
| --- | --- | --- | --- | --- |
| Training cohort | 0.929 | 0.767 | 0.777 | 0.734 |
| Validation cohort 1 | 0.886 | 0.747 | 0.710 | 0.714 |
| Validation cohort 2 | 0.871 | 0.879 | 0.702 | 0.842 |

FIGURE 6:
Decision curve analysis for the multivariate models based on regression analysis in validation cohort 1 (A) and validation cohort 2 (B). The y-axis indicates the net benefit, and the x-axis indicates threshold probability. Compared with the simple strategies of assuming that all patients with lung adenocarcinoma have LN metastasis (blue lines) or that none do (black lines), the multivariate models (red lines) had the highest net benefit across the majority of the range of reasonable threshold probabilities at which a patient would be diagnosed as having LN metastasis. Figure 6 can be viewed online in color at www.jcat.org.
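The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The sketch below applies it to one threshold. The function names and the model's true-positive and false-positive counts are hypothetical; only the cohort size (42) and the number of LN-positive patients (7) are taken from Table 1 for validation cohort 1.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a decision rule at a given threshold probability."""
    return tp / n - fp / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of calling every patient positive."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Validation cohort 1 has 42 patients, 7 of them with LN metastasis (Table 1).
n, n_pos = 42, 7
threshold = 0.20

# Hypothetical model result at this threshold: 6 true and 5 false positives.
nb_model = net_benefit(tp=6, fp=5, n=n, threshold=threshold)
nb_all = net_benefit_treat_all(n_pos / n, threshold)
nb_none = 0.0  # calling no patient positive yields no benefit and no harm
```

Repeating the calculation over a range of thresholds and plotting the three strategies produces curves of the kind shown in Figure 6.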

Machine Learning Algorithm

Table 7 lists the AUCs of the 3 models based on machine learning algorithms. In the training cohort, the highest AUC (0.777) was obtained using the NB algorithm (Supplementary Table 2, https://links.lww.com/RCT/A136).

Interobserver Agreement

Among the 41 continuous CT variables, interobserver agreement was good for 4 parameters (ICC, 0.643–0.796) and excellent for 29 (ICC, 0.803–0.982; Table 8).

TABLE 8 - Interobserver Agreement of Quantitative CT and Texture Parameters

| Variable | ICC (95% CI) | Variable | ICC (95% CI) |
| --- | --- | --- | --- |
| CTmean | 0.891 (0.859–0.916) | CTmax | 0.254 (0.122–0.377) |
| CTmin | 0.625 (0.534–0.701) | SD1 | 0.697 (0.619–0.761) |
| Long diameter | 0.889 (0.856–0.914) | Short diameter | 0.889 (0.857–0.915) |
| Mean | 0.952 (0.937–0.963) | Histogram width | 0.851 (0.809–0.885) |
| SD2 | 0.847 (0.804–0.882) | Entropy GLCM 10 | 0.979 (0.972–0.984) |
| Max frequency | 0.939 (0.921–0.954) | Entropy GLCM 11 | 0.980 (0.973–0.985) |
| Mode | 0.796 (0.740–0.841) | Entropy GLCM 12 | 0.981 (0.975–0.985) |
| Minimum | 0.803 (0.749–0.847) | Entropy GLCM 13 | 0.979 (0.973–0.984) |
| Maximum | 0.836 (0.790–0.873) | Energy GLCM 10 | 0.901 (0.871–0.924) |
| Percentile 5th | 0.875 (0.838–0.903) | Energy GLCM 11 | 0.924 (0.902–0.942) |
| Percentile 10th | 0.890 (0.858–0.915) | Energy GLCM 12 | 0.897 (0.867–0.921) |
| Percentile 25th | 0.927 (0.905–0.944) | Energy GLCM 13 | 0.820 (0.770–0.860) |
| Percentile 50th | 0.965 (0.955–0.973) | Inertia GLCM 10 | 0.263 (0.132–0.385) |
| Percentile 75th | 0.982 (0.976–0.986) | Inertia GLCM 11 | 0.643 (0.556–0.717) |
| Percentile 90th | 0.970 (0.961–0.977) | Inertia GLCM 12 | 0.288 (0.158–0.408) |
| Skewness | 0.537 (0.433–0.628) | Inertia GLCM 13 | 0.651 (0.565–0.723) |
| Kurtosis | 0.547 (0.443–0.635) | Variance GLCM 10 | 0.342 (0.215–0.456) |
| Entropy | 0.899 (0.870–0.923) | Variance GLCM 11 | 0.973 (0.965–0.980) |
| Area | 0.975 (0.968–0.981) | Variance GLCM 12 | 0.412 (0.293–0.519) |
| Max diameter | 0.937 (0.918–0.952) | Variance GLCM 13 | 0.848 (0.805–0.883) |
| SsD low | 0.851 (0.808–0.984) | | |

SD1 indicates standard deviation in CT value quantitative parameters; SD2, standard deviation in texture parameters.

DISCUSSION

In this study, 207 lung adenocarcinoma cases were divided into a training cohort and 2 validation cohorts. Qualitative CT, quantitative CT, texture, and hematological parameters were analyzed to predict LN metastasis. Parameters with significant differences (P < 0.05) in the univariate analysis were chosen as input parameters for the binary logistic regression analysis and the machine learning algorithms, and prediction models were established. The results showed that the AUC values of the binary logistic regression model were 0.929, 0.886, and 0.871 in the 3 cohorts, respectively. Among the machine learning models, the highest AUC in the training cohort was 0.777, obtained using the NB algorithm. In the 2 validation cohorts, the highest AUCs among the machine learning models were 0.747 and 0.879, respectively, both obtained using SVM.

Among the qualitative CT parameters, pleural indentation, pleural thickening, attenuation, and air bronchogram were significantly different in the training cohort. Malignant lesions close to the pleura tend to cause pleural thickening and indentation.13,14 Malignant tumors are prone to LN metastasis. The risk of LN metastasis is greater in lung adenocarcinomas that present as solid lesions. This is probably because the blood supply to ground-glass opacity lesions is not as rich as that to solid lesions.15 In the training cohort, LN metastasis was found in 21.3% of patients with an air bronchogram and 41.9% of those without. Hattori et al16 confirmed the presence of an air bronchogram in lung adenocarcinoma as a significant predictor of LN-negative status. However, Li et al13 reported that tumors with an air bronchogram were more common in the LN-positive group than in the LN-negative group. Further validation with a larger sample size is needed to confirm these results.

Five quantitative CT value parameters were found to be statistically significant in the training cohort. The mean CT attenuation and minimum CT attenuation were higher in the LN-positive group. This might be because lesions with more solid components have an abundant blood supply.15 The long and short diameters were also greater in the LN-positive group. The larger the tumor, the higher the risk of LN metastasis.17

Among the texture parameters, 24 were statistically significant and mainly included the percentile, second-order entropy, and second-order energy series. The lower percentiles (5th–25th) reflect the ground-glass component.18 The higher the value, the smaller the ground-glass component. Lesions with fewer ground-glass components are more likely to develop LN metastasis.19 This was consistent with our results of CT morphological assessment. In this study, the values of the second-order entropy GLCM 10–13 were higher in the LN-positive group than in the LN-negative group. Entropy quantifies the heterogeneity of the tumor CT values.20,21 The greater the heterogeneity, the more malignant the tumor and the higher the risk of LN metastasis. In contrast, the values of the second-order energy GLCM 10–13 were lower in the LN-positive group than in the LN-negative group. The energy features indicate the uniformity of gray-level voxel pairs.22 The more uniform the tumor, the lower its degree of malignancy and the lower the associated risk of LN metastasis.

Recently, it has become common practice to add clinical information in radiological studies. In this study, we also combined hematological factors with radiological parameters to predict LN metastasis. The results showed significant differences in the values of MONO, RDW, NLR, LMR, and PMR between patients with different LN statuses in the training cohort. The correlation between hematological factors and tumors needs further confirmation. Wang et al23 reported that a decreased LMR is associated with a worse prognosis, owing to the important roles of lymphocytes and monocytes in the initiation and development of cancers. Our results showed that the LMR values were significantly lower in the LN-positive group. Thus, the results of the 2 studies were consistent.

Statistically significant parameters, including 5 CT morphological characteristics, 5 CT value quantitative parameters, 24 texture parameters, and 5 hematological parameters, were subjected to binary logistic regression analysis in the training cohort. The results demonstrated that pleural thickening, percentile 25th, entropy GLCM 10, RDW, and LMR were independent risk factors associated with LN metastasis and were chosen further to establish a predictive model. Decision curve analysis indicated that multivariate models based on regression analysis were useful for predicting LN metastasis in lung adenocarcinomas, which suggested the net benefit of its clinical consequences according to the threshold probability.

The AUC of the predictive model was 0.929, which was higher than those in previous studies and raises the possibility of overfitting.18,24 Therefore, we also established models using machine learning algorithms to predict LN metastasis. Before model building, least absolute shrinkage and selection operator analysis was used for dimension reduction. The results showed that the highest AUC among the machine learning models in the training cohort was 0.777, obtained using the NB algorithm. Generally, machine learning algorithm models require a larger sample size. The larger the sample size, the higher the efficiency of the model. Therefore, further studies with larger sample sizes should be performed. In addition, we analyzed the consistency of the included parameters. Interobserver agreement was good for 4 parameters (ICC, 0.643–0.796) and excellent for 29 (ICC, 0.803–0.982).

However, our study had some limitations. First, the sample size was relatively small. This was a single-center study, and external validation is lacking. Thus, larger sample sizes and multicenter cooperation are needed in the future to validate these findings. Second, our study was retrospective in design, and patient inclusion bias was inevitable. Third, we did not evaluate the interobserver consistency in CT morphological characteristics. Finally, texture analysis was performed on 2-dimensional images using only the largest cross-section of each lesion. A single section contains limited information and may not reflect the features of the entire tumor.

CONCLUSIONS

Multivariate models incorporating CT morphological characteristics, CT value quantitative parameters, texture, and hematological parameters using logistic regression and machine learning algorithms could predict LN metastasis in lung adenocarcinomas. These findings may provide a reference for clinical decision making.

REFERENCES

1. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71:209–249.
2. National Comprehensive Cancer Network (NCCN) clinical practice guidelines in oncology, in: non-small cell lung cancer, Version 4. 2021. Available at: https://www.nccn.org/store/login/login.aspx?ReturnURL=https://www.nccn.org/professionals/physician_gls/pdf/nscl.pdf. Accessed March 3, 2021.
3. Boffa DJ, Kosinski AS, Paul S, et al. Lymph node evaluation by open or video-assisted approaches in 11,500 anatomic lung cancer resections. Ann Thorac Surg. 2012;94:347–353.
4. Cornwell LD, Bakaeen FG, Lan CK, et al. Endobronchial ultrasonography-guided transbronchial needle aspiration biopsy for preoperative nodal staging of lung cancer in a veteran population. JAMA Surg. 2013;148:1024–1029.
5. Zhong Y, Yuan M, Zhang T, et al. Radiomics approach to prediction of occult mediastinal lymph node metastasis of lung adenocarcinoma. AJR Am J Roentgenol. 2018;211:109–113.
6. Kanzaki R, Higashiyama M, Fujiwara A, et al. Occult mediastinal lymph node metastasis in NSCLC patients diagnosed as clinical N0-1 by preoperative integrated FDG-PET/CT and CT: risk factors, pattern, and histopathological study. Lung Cancer. 2011;71:333–337.
7. Prenzel KL, Mönig SP, Sinning JM, et al. Lymph node size and metastatic infiltration in non-small cell lung cancer. Chest. 2003;123:463–467.
8. Sioris T, Järvenpää R, Kuukasjärvi P, et al. Comparison of computed tomography and systematic lymph node dissection in determining TNM and stage in non-small cell lung cancer. Eur J Cardiothorac Surg. 2003;23:403–408.
9. Zhao F, Zhou Y, Ge PF, et al. A prediction model for lymph node metastases using pathologic features in patients intraoperatively diagnosed as stage I non-small cell lung cancer. BMC Cancer. 2017;17:267.
10. Bayanati H, Thornhill RE, Souza CA, et al. Quantitative CT texture and shape analysis: can it differentiate benign and malignant mediastinal lymph nodes in patients with primary lung cancer? Eur Radiol. 2015;25:480–487.
11. Goksel S, Ozcelik N, Telatar G, et al. The role of hematological inflammatory biomarkers in the diagnosis of lung cancer and in predicting TNM stage. Cancer Invest. 2021;39:514–520.
12. Xu F, Xu P, Cui W, et al. Neutrophil-to-lymphocyte and platelet-to-lymphocyte ratios may aid in identifying patients with non-small cell lung cancer and predicting tumor-node-metastasis stages. Oncol Lett. 2018;16:483–490.
13. Li Q, He XQ, Fan X, et al. Development and validation of a combined model for preoperative prediction of lymph node metastasis in peripheral lung adenocarcinoma. Front Oncol. 2021;11:675877.
14. Hsu JS, Han IT, Tsai TH, et al. Pleural tags on CT scans to predict visceral pleural invasion of non-small cell lung cancer that does not abut the pleura. Radiology. 2016;279:590–596.
15. Yanagawa M, Tsubamoto M, Satoh Y, et al. Lung adenocarcinoma at CT with 0.25-mm section thickness and a 2048 matrix: high-spatial-resolution imaging for predicting invasiveness. Radiology. 2020;297:462–471.
16. Hattori A, Suzuki K, Maeyashiki T, et al. The presence of air bronchogram is a novel predictor of negative nodal involvement in radiologically pure-solid lung cancer. Eur J Cardiothorac Surg. 2014;45:699–702.
17. Cong M, Feng H, Ren JL, et al. Development of a predictive radiomics model for lymph node metastases in pre-surgical CT-based stage IA non-small cell lung cancer. Lung Cancer. 2020;139:73–79.
18. Gu Y, She Y, Xie D, et al. A texture analysis-based prediction model for lymph node metastasis in stage IA lung adenocarcinoma. Ann Thorac Surg. 2018;106:214–220.
19. Wu G, Woodruff HC, Shen J, et al. Diagnosis of invasive lung adenocarcinoma based on chest CT radiomic features of part-solid pulmonary nodules: a multicenter study. Radiology. 2020;297:451–458.
20. Zhu H, Xu Y, Liang N, et al. Assessment of clinical stage IA lung adenocarcinoma with pN1/N2 metastasis using CT quantitative texture analysis. Cancer Manag Res. 2020;12:6421–6430.
21. Wang X, Wu K, Li X, et al. Additional value of PET/CT-based radiomics to metabolic parameters in diagnosing Lynch syndrome and predicting PD1 expression in endometrial carcinoma. Front Oncol. 2021;11:595430.
22. Song L, Yin J. Application of texture analysis based on sagittal fat-suppression and oblique axial T2-weighted magnetic resonance imaging to identify lymph node invasion status of rectal cancer. Front Oncol. 2020;10:1364.
23. Wang L, Dong T, Xin B, et al. Integrative nomogram of CT imaging, clinical, and hematological features for survival prediction of patients with locally advanced non-small cell lung cancer. Eur Radiol. 2019;29:2958–2967.
24. Andersen MB, Harders SW, Ganeshan B, et al. CT texture analysis can help differentiate between malignant and benign lymph nodes in the mediastinum in patients suspected for lung cancer. Acta Radiol. 2016;57:669–676.
Keywords:

lung adenocarcinoma; computed tomography; texture analysis; machine learning algorithm

Copyright © 2022 The Author(s). Published by Wolters Kluwer Health, Inc.