A Machine-Learning Approach Using PET-Based Radiomics to Predict the Histological Subtypes of Lung Cancer

Hyun, Seung Hyup MD, PhD*; Ahn, Mi Sun MD; Koh, Young Wha MD, PhD; Lee, Su Jin MD, PhD§

Clinical Nuclear Medicine: December 2019 - Volume 44 - Issue 12 - p 956–960
doi: 10.1097/RLU.0000000000002810
Original Articles

Purpose We sought to distinguish lung adenocarcinoma (ADC) from squamous cell carcinoma using a machine-learning algorithm with PET-based radiomic features.

Methods A total of 396 patients with 210 ADCs and 186 squamous cell carcinomas who underwent FDG PET/CT prior to treatment were retrospectively analyzed. Four clinical features (age, sex, tumor size, and smoking status) and 40 radiomic features were investigated in terms of lung ADC subtype prediction. Radiomic features were extracted from the PET images of segmented tumors using the LIFEx package. The clinical and radiomic features were ranked, and a subset of useful features was selected based on Gini coefficient scores in terms of associations with histological class. The areas under the receiver operating characteristic curves (AUCs) of classifications afforded by several machine-learning algorithms (random forest, neural network, naive Bayes, logistic regression, and a support vector machine) were compared and validated via random sampling.

Results We developed and validated a PET-based radiomic model predicting the histological subtypes of lung cancer. Sex, SUVmax, gray-level zone length nonuniformity, gray-level nonuniformity for zone, and total lesion glycolysis were the 5 best predictors of lung ADC. The logistic regression model outperformed all other classifiers (AUC = 0.859, accuracy = 0.769, F1 score = 0.774, precision = 0.804, recall = 0.746) followed by the neural network model (AUC = 0.854, accuracy = 0.772, F1 score = 0.777, precision = 0.807, recall = 0.750).

Conclusions A machine-learning approach successfully identified the histological subtypes of lung cancer. PET-based radiomic features may help clinicians improve the histopathologic diagnosis in a noninvasive manner.

From the *Department of Nuclear Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul; and Departments of Hematology-Oncology, Pathology, and §Nuclear Medicine, Ajou University School of Medicine, Suwon, Republic of Korea.

Received for publication June 5, 2019; revision accepted August 3, 2019.

Conflicts of interest and sources of funding: This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2016R1C1B2011583). No conflicts of interest were declared by any author.

Correspondence to: Su Jin Lee, MD, PhD, Department of Nuclear Medicine, Ajou University School of Medicine, 164, Worldcup-ro, Yeongtong-gu, Suwon 16499, Republic of Korea. E-mail: suesj202@ajou.ac.kr.

Online date: October 31, 2019

Non–small cell lung cancer (NSCLC) is a heterogeneous group of diseases that includes adenocarcinoma (ADC), squamous cell carcinoma (SCC), adenosquamous cell carcinoma, large cell carcinoma, and sarcomatoid carcinoma. The most common histological subtypes are ADC and SCC. Because the treatment-related outcomes and chemotherapeutic regimens for lung ADC and SCC differ, histopathologic diagnosis is essential prior to treatment initiation. CT-guided needle biopsy is the criterion standard for histological classification, but it has several limitations: it is invasive, cannot provide spatial information, typically cannot be repeated, does not allow for whole-body assessment, and can cause complications.1 Whole-tumor radiomics can characterize lesions noninvasively and repeatedly. This technique has been termed “virtual biopsy.”2 “Radiomics” refers to the extraction of many quantitative features such as pixel intensity, shape, and texture; these are used to convert standard clinical imaging data to higher-dimensional mineable data.3 High-throughput radiomics has recently emerged as a powerful approach for identifying imaging biomarkers that can be used to build decision-support systems for cancer. The recent explosion of medical imaging data has created a fertile environment for machine learning and medical informatics.4,5

18F-FDG PET/CT has been widely used for staging NSCLC. Several studies have reported that glucose metabolism differs between ADC and SCC.6–9 Also, radiomic approaches using FDG PET/CT data have identified differences in the textural features of ADC and SCC.1,10,11 Therefore, we hypothesized that a machine-learning approach could be used to differentiate ADC from SCC, and we explored whether a machine-learning algorithm using PET-based radiomic features could distinguish lung ADC from SCC.

PATIENTS AND METHODS

Patients

We retrospectively reviewed the pretreatment FDG PET/CT scans of 556 consecutive NSCLC patients taken between January 2013 and December 2016 at a single institution. We subsequently evaluated 509 patients after excluding 47 patients with histological subtypes other than ADC or SCC. An additional 113 patients (102 with ADC and 11 with SCC) were excluded because tumor size was too small for accurate texture analysis (see Measurement of PET Parameters for technical details). The final cohort comprised 396 patients. Histological subclassification was performed based on the 2015 World Health Organization Classification of Lung Tumors.12 This retrospective study was approved by the ethics committee of our institution; the requirement for informed consent was waived given the retrospective nature of the work.

FDG PET/CT Acquisition

FDG PET/CT was performed using a Discovery ST or STE PET/CT scanner (GE Healthcare, Milwaukee, Wis). All patients fasted for at least 6 hours before FDG PET/CT; their blood glucose levels at the time of FDG injection were less than 150 mg/dL. Unenhanced CT was performed 60 minutes after the injection of 5 MBq/kg of FDG using a 16-slice helical CT scanner (120 keV; 30–100 mA in the AutomA mode; section width 3.75 mm). Emission PET data were acquired from the thigh to the head for 3.0 minutes per frame in 3-dimensional mode. Attenuation-corrected PET images (CT data were used for correction) were reconstructed using an ordered-subset expectation maximization algorithm (20 subsets, 2 iterations).

Measurement of PET Parameters

Various PET parameters of primary lung lesions were measured. A fixed SUV of 2.5 was used to define the boundaries of volumes of interest (VOIs). The LIFEx package (version 4.00)13 was used to extract 40 radiomic features on PET images (Table 1). LIFEx calculates textural features only for VOIs of at least 64 voxels. The PET VOI did not attain the minimum number of 64 voxels in 102 patients with ADC and 11 patients with SCC because of poor image matrix resolution; finally, 396 patients were evaluated.
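The VOI rule described above can be sketched as follows. This is a minimal illustration and not the LIFEx implementation: the function name `summarize_voi`, the flat list of SUVs, the inclusive `>=` comparison, and the voxel volume are assumptions made for the example. A real segmentation would also restrict the threshold to the connected region around the primary lesion. TLG is computed with its usual definition, SUVmean multiplied by metabolic tumor volume.

```python
from statistics import mean

SUV_THRESHOLD = 2.5   # fixed SUV threshold from the text
MIN_VOXELS = 64       # minimum VOI size for texture features, from the text


def summarize_voi(suv_values, voxel_volume_ml):
    """Threshold SUVs and report basic PET parameters for the VOI.

    suv_values: flat iterable of SUVs around the lesion (hypothetical format).
    voxel_volume_ml: volume of one voxel in mL (assumed known from the image).
    """
    # Assumption: voxels at exactly the threshold are included.
    voi = [v for v in suv_values if v >= SUV_THRESHOLD]
    if not voi:
        return {"n_voxels": 0, "texture_eligible": False}
    mtv = len(voi) * voxel_volume_ml      # metabolic tumor volume, mL
    suv_mean = mean(voi)
    return {
        "n_voxels": len(voi),
        "texture_eligible": len(voi) >= MIN_VOXELS,
        "suv_max": max(voi),
        "suv_mean": suv_mean,
        "mtv_ml": mtv,
        "tlg": suv_mean * mtv,            # TLG = SUVmean x MTV
    }


# Toy lesions (made-up values): 40 hot voxels versus 64 hot voxels.
small = summarize_voi([3.0] * 40 + [1.0] * 100, 0.1)
large = summarize_voi([3.0] * 63 + [5.0] + [1.0] * 10, 0.1)
```

In this toy input, `small` falls below the 64-voxel minimum and would be excluded from texture analysis, whereas `large` just meets it.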

TABLE 1

Machine Learning Approach and Statistical Analyses

A total of 4 clinical and 40 radiomic features were used to predict tumor histological subtype employing machine-learning approaches. The classification target was the ADC histological subtype. The clinical features considered were age, sex, tumor size, and smoking status. A feature reduction procedure was needed to select a subset of features that increases predictive accuracy. A ranking-based feature selection method with the Gini coefficient14 was used to reduce feature dimensions. The Gini coefficient is a measure of the inequality of a distribution. It is defined as a ratio with values between 0 and 1, where 0 denotes that all elements belong to a single class (or that only 1 class exists) and 1 denotes that the elements are randomly distributed across various classes. Clinical and radiomic features were ranked based on the Gini coefficient score derived by evaluating their associations with histological class. To identify the optimal feature selection size, 9 feature subsets were evaluated; the subset size ranged from 5 to 40 in steps of 5, plus the full set of 44 features.
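The ranking step can be illustrated with a short sketch. It scores each feature by the decrease in Gini impurity after a split at the feature median, which is one common way to turn a Gini measure into a feature score. The median split, the function names, and the toy data are assumptions for illustration and may differ from the scorer used in the study software. Note that for 2 classes the impurity used here has a maximum of 0.5.

```python
from collections import Counter
from statistics import median


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2); 0 means all labels share one class."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_score(values, labels):
    """Decrease in Gini impurity after splitting a feature at its median.

    The median split is an assumption made for this sketch.
    """
    cut = median(values)
    low = [y for x, y in zip(values, labels) if x <= cut]
    high = [y for x, y in zip(values, labels) if x > cut]
    n = len(labels)
    weighted = (len(low) * gini_impurity(low)
                + len(high) * gini_impurity(high)) / n
    return gini_impurity(labels) - weighted


def rank_features(table, labels):
    """Return feature names sorted from highest to lowest Gini score."""
    scores = {name: gini_score(col, labels) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (made up): one feature separates the classes, the other does not.
toy_labels = ["ADC", "ADC", "ADC", "SCC", "SCC", "SCC"]
toy_table = {
    "noise": [1, 10, 2, 11, 3, 12],
    "informative": [1, 2, 3, 10, 11, 12],
}
ranking = rank_features(toy_table, toy_labels)

# Subset sizes evaluated in the text: 5 to 40 in steps of 5, then all 44.
subset_sizes = list(range(5, 45, 5)) + [44]
```

A subset of size k would then be the first k names in `ranking`.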

Five different machine-learning algorithms for binary classification were evaluated: a random forest, a neural network, a naive Bayes method, logistic regression, and a support vector machine. Model performance was internally validated via random sampling; data were randomly split into training (70%) and testing (30%) sets, and the entire procedure was repeated 100 times. To compare the predictive performances of the models and feature subsets, we drew receiver operating characteristic curves and measured the areas under the curve (AUCs). We computed the following performance measures: AUC, accuracy, F1 score, precision (also called positive predictive value), and recall (also known as sensitivity). The F1 score (also known as F score or F measure) is the harmonic mean of precision and recall.
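The validation scheme can be sketched in plain Python as repeated random 70/30 splits with the test AUC averaged over repeats. This is an illustrative outline and not the study's pipeline: the `fit` interface, the seed, the unstratified shuffling, and the toy model are assumptions. AUC is computed as the probability that a positive case receives a higher score than a negative case.

```python
import random


def auc(scores, labels):
    """AUC as P(positive outranks negative); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_random_sampling(X, y, fit, n_repeats=100, train_frac=0.7, seed=0):
    """Mean test AUC over repeated random train/test splits.

    fit(X_train, y_train) must return a function that maps one sample to a
    score (hypothetical interface chosen for this sketch).
    """
    rng = random.Random(seed)
    indices = list(range(len(y)))
    n_train = int(round(train_frac * len(y)))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        y_test = [y[i] for i in test]
        if len(set(y_test)) < 2:
            continue  # AUC is undefined without both classes in the test set
        predict = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([predict(X[i]) for i in test], y_test))
    return sum(aucs) / len(aucs)


# Toy check: a hypothetical "model" that scores by the first feature only.
def fit_first_feature(X_train, y_train):
    return lambda row: row[0]


X_toy = [[i] for i in range(20)]
y_toy = [0] * 10 + [1] * 10
mean_auc = repeated_random_sampling(X_toy, y_toy, fit_first_feature)
```

Because the toy feature separates the 2 classes perfectly, every split yields an AUC of 1.0; a real classifier would be passed in through `fit` in the same way.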

The machine-learning approach was performed using Orange version 3.16 software (Bioinformatics Laboratory at the University of Ljubljana, Slovenia), an open-source data-mining and visualization package.15

We show means ± SDs of continuous variables and percentages of categorical variables. Differences between the 2 groups were compared using the t test for continuous variables and the χ2 test for dichotomous variables. All tests were 2-sided. Confidence intervals are reported at the 95% level, and P < 0.05 was considered to reflect statistical significance.

RESULTS

Patient Characteristics

The clinical characteristics of the 396 patients are summarized in Table 2. The PET radiomics cohort consisted of 288 males and 108 females aged 67.3 ± 10.5 years (range, 23–89 years). Of these, 210 had ADC and 186 had SCC. A total of 205 patients had a smoking history. The tumor size measured by CT was 4.7 ± 2.2 cm (range, 1.2–14.2 cm). Patients with SCCs had larger tumors than patients with ADCs (5.3 ± 2.4 vs 4.2 ± 1.8 cm, P < 0.001).
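As a rough plausibility check of the tumor size comparison, the t statistic can be recomputed from the reported means, SDs, and group sizes. The sketch below assumes a pooled-variance (Student) t test and approximates the 2-sided P value with the normal distribution, which is reasonable at this sample size. Both choices are assumptions, since the exact test variant is not stated.

```python
import math


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Student t statistic and degrees of freedom from group summaries."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def two_sided_p_normal(t):
    """Two-sided P value from the normal approximation to the t distribution."""
    return math.erfc(abs(t) / math.sqrt(2))


# Tumor size (cm): SCC 5.3 +/- 2.4 (n = 186) vs ADC 4.2 +/- 1.8 (n = 210)
t_stat, df = pooled_t_from_summary(5.3, 2.4, 186, 4.2, 1.8, 210)
p_approx = two_sided_p_normal(t_stat)
```

Under these assumptions the statistic is about 5.2 on 394 degrees of freedom, which is consistent with the reported P < 0.001.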

TABLE 2

Radiomic Feature Selection and Receiver Operating Characteristic Curve Analysis

Clinical and radiomic features were ranked using the Gini coefficient (Table 3). Sex, SUVmax, gray-level zone length nonuniformity (GLZLM_ZLNU), gray-level nonuniformity for zone (GLZLM_GLNU), and total lesion glycolysis (TLG) were the 5 best predictors of tumor histological subtype. Figure 1 shows the differences in the radiomic features of ADC and SCC. SUVmax (14.2 ± 6.1 vs 9.6 ± 4.5, P < 0.001), TLG (604.4 ± 809.7 vs 274.1 ± 499.0, P < 0.001), GLZLM_ZLNU (51.2 ± 77.7 vs 29.6 ± 59.7, P = 0.002), and GLZLM_GLNU (13.5 ± 14.8 vs 6.9 ± 6.8, P < 0.001) were significantly higher for SCCs than for ADCs. The overall classification performances of the 5 machine-learning methods were compared by calculating the AUCs of 9 feature subsets with 5, 10, 15, 20, 25, 30, 35, 40, and 44 features (Table 4, Fig. 2). The logistic regression model outperformed the other classifiers when the 15-feature subset was used (AUC = 0.859, accuracy = 0.769, F1 score = 0.774, precision = 0.804, recall = 0.746), followed by the neural network model (AUC = 0.854, accuracy = 0.772, F1 score = 0.777, precision = 0.807, recall = 0.750).
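Because the F1 score is the harmonic mean of precision and recall, the reported values can be cross-checked directly. The short sketch below recomputes F1 from the precision and recall reported for the 2 best models; the dictionary layout and names are only for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision, recall, and F1 as reported in the text.
reported = {
    "logistic regression": {"precision": 0.804, "recall": 0.746, "f1": 0.774},
    "neural network": {"precision": 0.807, "recall": 0.750, "f1": 0.777},
}

recomputed = {
    name: round(f1_score(m["precision"], m["recall"]), 3)
    for name, m in reported.items()
}
```

Rounded to 3 decimals, the recomputed values match the reported F1 scores for both models.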

TABLE 3

FIGURE 1

TABLE 4

FIGURE 2

DISCUSSION

We developed and validated a PET-based radiomics model for the prediction of NSCLC histological subtype. The logistic regression model effectively differentiated ADC from SCC. Sex, SUVmax, gray-level zone length nonuniformity, gray-level nonuniformity for zone, and TLG were highly associated with tumor histological subtype.

Although ADC and SCC are categorized as NSCLC, their biological features, clinical characteristics, and treatment-related outcomes differ, allowing machine learning to distinguish the subtypes. Of the clinical characteristics, female sex best correlated with the ADC subtype. Sex differences in NSCLC have been widely reported.16–18 We previously showed that textural features differed by sex and histological subtype.10 We here confirm that sex is important when comparing ADC and SCC; more females than males had ADCs.

We found that the SUVmax and TLG were among the top 5 features correlating with histological subtypes of lung cancer. The glycolytic phenotypes of lung ADC and SCC differ. Schuurbiers et al8 suggested that ADCs engage in glycolysis under normoxic conditions, whereas SCCs experience diffusion-limited hypoxia, resulting in a very high anaerobic glycolytic rate. Our previous PET study showed that SCCs have considerably higher metabolic rates than ADCs.6

PET-based radiomics analysis assesses the textural features of entire tumors noninvasively. Several studies have shown that ADCs and SCCs differ in terms of PET textural features.1,10,11 However, no single feature adequately describes the pathological phenotype because tumors exhibit multiple tissue patterns. Thus, a combination of different textural features (the radiomics signature) is needed to describe the lesion. We used machine-learning approaches to select radiomics features distinguishing ADCs from SCCs; the diagnostic performances were promising.

Our study has several limitations. First, the study population was relatively small. Although we initially evaluated 509 patients, radiomic features were extracted for only 396. Many ADC cases exhibiting faint FDG uptake could not be subjected to textural analysis, and tumor lesions too small for texture analysis were also excluded. With the increased use of lung cancer screening, many small lesions are likely to be discovered at an early stage. Machine-learning tools that can accommodate smaller lesions would therefore be an important direction for future research. Second, the lack of external validation and the retrospective nature of the work limit the generalizability of our results. Although internal validation was performed, external validation using a larger cohort is necessary.

In conclusion, a machine-learning approach with PET-based radiomics successfully identified the histological subtypes of lung cancer. PET-based radiomic features may help clinicians improve the histopathologic diagnosis of lung cancer in a noninvasive manner.

REFERENCES

1. Kirienko M, Cozzi L, Rossi A, et al. Ability of FDG PET and CT radiomics features to differentiate between primary and metastatic lung lesions. Eur J Nucl Med Mol Imaging. 2018;45:1649–1660.
2. Lambin P, Leijenaar RTH, Deist TM, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017;14:749–762.
3. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278:563–577.
4. Choi H. Deep learning in nuclear medicine and molecular imaging: current perspectives and future directions. Nucl Med Mol Imaging. 2018;52:109–118.
5. Lee JG, Jun S, Cho YW, et al. Deep learning in medical imaging: general overview. Korean J Radiol. 2017;18:570–584.
6. Koh YW, Lee SJ, Park SY. Differential expression and prognostic significance of GLUT1 according to histologic type of non–small-cell lung cancer and its association with volume-dependent parameters. Lung Cancer. 2017;104:31–37.
7. Kim DH, Jung JH, Son SH, et al. Prognostic significance of intratumoral metabolic heterogeneity on 18F-FDG PET/CT in pathological N0 non–small cell lung cancer. Clin Nucl Med. 2015;40:708–714.
8. Schuurbiers OC, Meijer TW, Kaanders JH, et al. Glucose metabolism in NSCLC is histology-specific and diverges the prognostic potential of 18FDG-PET for adenocarcinoma and squamous cell carcinoma. J Thorac Oncol. 2014;9:1485–1493.
9. Meijer TW, Schuurbiers OC, Kaanders JH, et al. Differences in metabolism between adeno- and squamous cell non–small cell lung carcinomas: spatial distribution and prognostic value of GLUT1 and MCT4. Lung Cancer. 2012;76:316–323.
10. Koh YW, Lee D, Lee SJ. Intratumoral heterogeneity as measured using the tumor-stroma ratio and PET texture analyses in females with lung adenocarcinomas differs from that of males with lung adenocarcinomas or squamous cell carcinomas. Medicine (Baltimore). 2019;98:e14876.
11. Ha S, Choi H, Cheon GJ, et al. Autoclustering of non–small cell lung carcinoma subtypes on (18)F-FDG PET using texture analysis: a preliminary result. Nucl Med Mol Imaging. 2014;48:278–286.
12. Travis WD, Brambilla E, Nicholson AG, et al. The 2015 World Health Organization Classification of Lung Tumors: impact of genetic, clinical and radiologic advances since the 2004 classification. J Thorac Oncol. 2015;10:1243–1260.
13. Nioche C, Orlhac F, Boughdad S, et al. LIFEx: a freeware for radiomic feature calculation in multimodality imaging to accelerate advances in the characterization of tumor heterogeneity. Cancer Res. 2018;78:4786–4789.
14. Gini C. Concentration and dependency ratios. Riv Pol Econ. 1997;87:769–792.
15. Demšar J, Curk T, Erjavec A, et al. Orange: data mining toolbox in Python. J Mach Learn Res. 2013;14:2349–2353.
16. Ben Aissa A, Mach N. Is lung cancer in women different? [in French]. Rev Med Suisse. 2012;8:1108–1111.
17. Donington JS, Colson YL. Sex and gender differences in non–small cell lung cancer. Semin Thorac Cardiovasc Surg. 2011;23:137–145.
18. Paggi MG, Vona R, Abbruzzese C, et al. Gender-related disparities in non–small cell lung cancer. Cancer Lett. 2010;298:1–8.
Keywords:

adenocarcinoma; machine learning; non–small cell lung cancer; PET; texture analysis

Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved.