Development and internal validation of a novel model and markers to identify the candidates for lymph node metastasis in patients with prostate cancer

Cao, Hai-ming MSa; Wan, Zi MDb; Wu, Yu MSa; Wang, Hong-yang PhDc; Guan, Chao BSa,*

Section Editor(s): Severino, Patricia

doi: 10.1097/MD.0000000000016534
Research Article: Clinical Trial/Experimental Study
Open

Background High-grade prostate cancer (PCa) has a poor prognosis, and up to 15% of patients worldwide experience lymph node invasion (LNI). To further improve the prediction of LNI in PCa, we added a gene expression-based risk score to the predictors used in guideline-recommended nomograms.

Methods We analyzed clinical data from 320 PCa patients from the Cancer Genome Atlas database. Weighted gene coexpression network analysis was used to identify the genes that were significantly associated with LNI in PCa (n = 390). Analyses using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases were performed to identify the activated signaling pathways. Univariate and multivariate logistic regression analyses were performed to identify the independent risk factors for the presence of LNI.

Results We found that patients with actual LNI and predicted LNI had the worst survival outcomes. The 7 most significant genes (CTNNAL1, ENSA, MAP6D1, MBD4, PRCC, SF3B2, TREML1) were selected for further analysis. Pathways in the cell cycle, DNA replication, oocyte meiosis, and 9 other pathways were dramatically activated during LNI in PCa. Multivariate analyses identified that the risk score (odds ratio [OR] = 1.05 for 1% increase, 95% confidence interval [CI]: 1.04–1.07, P < .001), serum PSA level, clinical stage, primary biopsy Gleason grade (OR = 2.52 for a grade increase, 95% CI: 1.27–5.22, P = .096), and secondary biopsy Gleason grade were independent predictors of LNI. A nomogram built using these predictive variables showed good calibration and a net clinical benefit, with an area under the curve (AUC) value of 90.2%.

Conclusions In clinical practice, the application of our nomogram might contribute significantly to the selection of patients who are good candidates for surgery with extended pelvic lymph node dissection.

aDepartment of Urology, The Second Affiliated Hospital, Bengbu Medical College, Bengbu, Anhui

bDepartment of Urology, The First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, Guangdong

cDepartment of Urology, The First Affiliated Hospital, Qingdao University, Qingdao, Shandong, China.

Correspondence: Chao Guan, The Second Affiliated Hospital of Bengbu Medical College, Bengbu, Anhui, China (e-mail: guanchao139@foxmail.com)

Abbreviations: AUC = area under the curve, DEG = differentially expressed genes, DFS = disease free survival, ePLND = extended pelvic lymph node dissection, GO = gene ontology, KEGG = Kyoto Encyclopedia of Genes and Genomes, LASSO = Least absolute shrinkage and selection operator, LNI = lymph node invasion, OS = overall survival, PRAD = Prostate adenocarcinoma, PCa = prostate cancer, PSA = prostate specific antigen, qRT-PCR = quantitative real time-PCR, RS = risk score, TCGA = The Cancer Genome Atlas, WGCNA = weighted gene coexpression network analysis.

H-mC and ZW have contributed equally to this work.

This study was funded by the Natural Science Fund of Bengbu Medical College (BYKY17115).

The authors have no conflicts of interest to disclose.

Received March 4, 2019

Received in revised form June 10, 2019

Accepted June 27, 2019

This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. http://creativecommons.org/licenses/by-nc-nd/4.0

1 Introduction

High-grade prostate cancer (PCa) has a poor prognosis, and up to 15% of patients worldwide experience lymph node invasion (LNI).[1] Radical prostatectomy coupled with extended pelvic lymph node dissection (ePLND) remains the principal surgical procedure for these patients.[2] LNI is clinically important in PCa because it both predicts prognosis and plays a decisive role in surgical treatment. Predicting the occurrence of LNI is therefore of vital clinical significance. At present, several nomograms can be used to predict the occurrence of LNI in PCa patients,[3–5] and they all show good predictive accuracy. However, these models are based on traditional biopsies or imaging-based diagnoses, and clinical imaging techniques have limited sensitivity for detecting LNI.[6,7] Hence, to overcome these drawbacks, a non-invasive and simple method to accurately predict LNI is urgently needed.

Published studies[8–10] have suggested that molecular biomarker analysis is a good method for predicting the prognostic outcome of PCa patients; it is also a promising method for predicting the occurrence of LNI in PCa patients. Levels of gene expression[11–13] have been shown to be associated with LNI in PCa patients. Using gene expression data, we applied weighted gene coexpression network analysis (WGCNA) to identify hub genes and then searched among them for an optimal gene signature to predict the occurrence of LNI in PCa patients. Recently, some studies[14–17] have demonstrated that combinations of several genes can predict lymph node metastasis in malignant tumors. However, the predictive ability of gene signatures in PCa is still unknown.

The primary aim of this study was to evaluate the LNI-predictive value of the gene signature of PCa and to develop a gene-based risk score for predicting LNI. We examined the predictive ability of the base model combined with the risk score for predicting LNI in PCa patients. We also found that the risk score was significantly associated with poor prognosis in PCa patients.

2 Materials and methods

2.1 Patient population

The standardized level 3 RNA sequencing data of prostate adenocarcinoma (PRAD) patients and the corresponding clinical records in The Cancer Genome Atlas (TCGA) were obtained from FireBrowse (http://firebrowse.org). Clinical and pathologic data were obtained for a total of 320 PCa patients with no missing gene or clinical information. LNI was confirmed by pathology. Gene expression data were complete for 390 patients. RNA-Seq by Expectation-Maximization (RSEM) values were used to quantify mRNA expression levels.

2.2 Weighted gene coexpression network analysis and functional enrichment analysis

Gene expression data were complete for 390 patients, and we used WGCNA to identify genes significantly associated with LNI. Subsequently, KEGG signaling pathways were analyzed with R packages including ClusterProfiler, Pathview (http://www.bioconductor.org/), and Stringi (https://cran.r-project.org/). Cytoscape software (http://www.cytoscape.org/) was then used to visualize the enrichment results.
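As a rough illustration of the idea behind WGCNA, the sketch below builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^β, and a per-gene connectivity from toy data. It is not the authors' pipeline (the study used R): the gene names, expression values, and the power β = 6 are assumptions chosen for illustration, and module detection by clustering is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr maps gene name -> expression values across the same samples.
    beta=6 is an assumed soft-thresholding power, not a value from the article.
    """
    genes = list(expr)
    return {
        (gi, gj): 1.0 if gi == gj else abs(pearson(expr[gi], expr[gj])) ** beta
        for gi in genes
        for gj in genes
    }


def connectivity(adj, gene):
    """Sum of a gene's adjacencies to all other genes (its network degree)."""
    return sum(a for (gi, gj), a in adj.items() if gi == gene and gj != gene)


# Toy expression profiles (made-up numbers, five samples per gene).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],  # only weakly related to geneA
}
toy_adj = adjacency(toy_expr)
```

Raising the correlation to a power suppresses weak correlations much more than strong ones, which is why strongly co-expressed genes dominate the resulting modules.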

2.3 Statistical methods

Continuous variables were reported as the mean, median, and interquartile range (IQR) and compared using the t test. Categorical variables were reported as frequencies and proportions and compared using the chi-square test. We identified the gene signature using least absolute shrinkage and selection operator (LASSO) regression. Univariable and multivariable logistic regression models were used to predict the occurrence of LNI.

The discrimination accuracy of multivariable models based on these variables in our cohort was quantified by the value of the area under the curve (corrected-AUC was calculated using a 200-resample bootstrap). The extent of over- and underestimation of pathologically confirmed versus nomogram-predicted LNI was graphically explored using a calibration plot. To determine the clinical net benefit associated with the use of the nomogram, we conducted a decision-curve analysis (DCA). The calibration and DCA were corrected for overfit using 10-fold cross validation.
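To make the bootstrap-corrected AUC concrete, the following sketch computes the AUC as a rank statistic and applies an optimism correction over 200 resamples. It is a minimal sketch under assumptions, not the authors' R code: the optimism-correction scheme shown (refit on each resample, compare resample AUC with original-data AUC) is one common approach, and `fit_sign_model` and the toy data are hypothetical stand-ins for a real logistic regression fit.

```python
import random


def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def optimism_corrected_auc(rows, labels, fit, n_boot=200, seed=0):
    """Apparent AUC minus the mean bootstrap optimism.

    fit(rows, labels) must return a function that maps one row to a score.
    """
    rng = random.Random(seed)
    n = len(rows)
    model = fit(rows, labels)
    apparent = auc([model(r) for r in rows], labels)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # skip resamples that contain a single class
        b_model = fit(b_rows, b_labels)
        auc_boot = auc([b_model(r) for r in b_rows], b_labels)
        auc_orig = auc([b_model(r) for r in rows], labels)
        optimism.append(auc_boot - auc_orig)
    return apparent - sum(optimism) / n_boot


def fit_sign_model(rows, labels):
    """Hypothetical toy 'model': score by one feature, oriented toward the positives."""
    pos = [r[0] for r, y in zip(rows, labels) if y == 1]
    neg = [r[0] for r, y in zip(rows, labels) if y == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda row: sign * row[0]


# Made-up single-feature data: five patients without LNI (0), five with LNI (1).
toy_rows = [(0.2,), (0.5,), (0.9,), (1.1,), (1.4,),
            (0.7,), (1.8,), (2.0,), (1.2,), (2.4,)]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_corrected = optimism_corrected_auc(toy_rows, toy_labels, fit_sign_model)
```

In practice the `fit` argument would wrap whatever model is being validated, so that variable selection and coefficient estimation are repeated inside every resample.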

Statistical analyses in the study were performed using the R statistical package v.3.3.2 (R Project for Statistical Computing). All statistical tests were 2-sided, and P < .05 was considered to be statistically significant.

3 Results

3.1 Clinicopathological characteristics of PCa patients

First, we analyzed clinical data from 320 PCa patients obtained from the TCGA database. The clinicopathological characteristics of LNI patients and non-LNI patients are compared in Table 1. The serum level of prostate specific antigen (PSA) was significantly higher in LNI patients than in non-LNI patients (15.89 vs 10.05 ng/mL, P < .001). In addition, LNI patients had higher tumor grade (including clinical stage, primary and secondary biopsy Gleason grade, pathologic stage, and pathologic Gleason score) than non-LNI patients (all P < .001).

Table 1

3.2 Lymph node invasion is associated with the inferior outcome in PCa patients

Traditionally, LNI prediction models have been used to assess the presence of LNI. However, existing LNI prediction models could not completely distinguish LNI from non-LNI patients (Fig. 1A). Thus, a better model for predicting the presence of LNI was needed.

Figure 1

Patients with both actual and predicted LNI had the worst survival outcomes, and patients with both actual and predicted non-LNI had the best. Patients with discordant status (actual non-LNI with predicted LNI, or actual LNI with predicted non-LNI) had significantly lower 5-year survival rates than patients with actual non-LNI and predicted non-LNI (Fig. 1B).

3.3 Weighted gene coexpression network analysis

WGCNA was performed on 53,324 genes that were identified as being associated with LNI in 390 PCa patients, and 89 modules were identified. Figure 2 shows a visual representation of the WGCNA for 22 modules. From these 22 modules, we then selected the 5 most significant, based on their correlation with LNI and the corresponding P values, for further analysis (Fig. 3).

Figure 2

Figure 3

3.4 Gene ontology and pathway analysis

A total of 2294 genes and 22 modules associated with LNI were identified in PCa. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses are presented in Fig. 4. KEGG analysis revealed that pathways involved in the cell cycle, DNA replication, Fanconi anemia, oocyte meiosis, progesterone-mediated oocyte maturation, base excision repair, p53 signaling, cellular senescence, measles, alcoholism, homologous recombination, and mismatch repair were dramatically activated in PCa with LNI (Fig. 4D).

Figure 4

3.5 Identification of the 7-gene signature and its association with the survival of PCa patients

Using LASSO regression, we selected 7 genes (CTNNAL1, ENSA, MAP6D1, MBD4, PRCC, SF3B2, TREML1) from the 904 genes in the 5 most significant modules to generate a signature. The risk score was calculated for each of the 390 patients from TCGA, and patients in every grade were then divided into a high-gene expression group and a low-gene expression group using the median risk score as the cutoff value. PCa patients in the high-gene expression group had significantly shorter disease-free survival (DFS) and overall survival (OS) than those in the low-gene expression group (all P < .05, Fig. 5A–G).
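The median split described above can be sketched as follows. The article does not report the LASSO coefficients or the exact risk-score formula, so this example assumes the common construction of a weighted sum of expression values; the weights, patient identifiers, and expression values are hypothetical placeholders, and sending ties at the median to the low group is also an assumption.

```python
from statistics import median

# Hypothetical placeholder weights: the article does not report its LASSO coefficients.
HYPOTHETICAL_COEFS = {
    "CTNNAL1": 0.30,
    "ENSA": 0.25,
    "MAP6D1": 0.20,
    "MBD4": 0.15,
    "PRCC": 0.10,
    "SF3B2": 0.05,
    "TREML1": 0.35,
}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Weighted sum of a patient's expression values over the signature genes."""
    return sum(weight * expression[gene] for gene, weight in coefs.items())


def split_by_median(scores):
    """Return the median cutoff and a high/low label per patient.

    Scores strictly above the median are 'high'; ties go to 'low' (an assumption).
    """
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return cutoff, groups


# Four made-up patients whose genes all share one expression level (1.0 to 4.0).
toy_patients = {
    f"patient_{i}": {gene: float(i) for gene in HYPOTHETICAL_COEFS}
    for i in range(1, 5)
}
toy_scores = {pid: risk_score(expr) for pid, expr in toy_patients.items()}
toy_cutoff, toy_groups = split_by_median(toy_scores)
```

The two resulting groups are what a survival comparison such as the one in Fig. 5 would then be run on.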

Figure 5

Additionally, we found that the 7 genes were significantly overexpressed in LNI PCa compared with non-LNI PCa (all P < .001, Fig. 6).

Figure 6

3.6 Risk factors for lymph node invasion

Univariate and multivariate logistic regression analyses were performed to identify independent risk factors for the presence of LNI (Table 2). In the univariate analysis, the risk score was the most accurate predictor (corrected AUC = 86.7%), followed by clinical stage (74.3%), primary biopsy Gleason grade (73.2%), serum PSA level (68.1%), and secondary biopsy Gleason grade (64.1%). In the multivariate analysis, the risk score (odds ratio [OR] = 1.05 for 1% increase, 95% confidence interval [CI]: 1.04–1.07, P < .001), serum PSA level, clinical stage, primary biopsy Gleason grade (OR = 2.52 for a grade increase, 95% CI: 1.27–5.22, P = .096), and secondary biopsy Gleason grade were independent predictors of LNI. Based on these 5 predictors, we developed a full model for predicting LNI, with a corrected AUC value of 90.2%. Interestingly, when the risk score was removed from the full model, leaving a base model with the 4 remaining predictors (serum PSA level, clinical stage, primary Gleason grade, and secondary Gleason grade), the corrected AUC value dropped to 83.7%.
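A per-unit odds ratio can be hard to read at a glance, so the short sketch below rescales the reported OR of 1.05 per 1% increase in risk score to a 10-point increase and shows how an odds ratio shifts a probability. It assumes the usual logistic-regression property that log-odds are linear in the predictor; the 10-point step and the 20% baseline probability are illustrative assumptions, not values from the article.

```python
def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the odds ratio for one unit."""
    return or_per_unit ** units


def shift_probability(baseline_prob, odds_ratio):
    """Probability after multiplying the baseline odds by `odds_ratio`."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Reported: OR = 1.05 per 1% increase in risk score (95% CI 1.04 to 1.07).
or_10_points = scale_odds_ratio(1.05, 10)
ci_10_points = (scale_odds_ratio(1.04, 10), scale_odds_ratio(1.07, 10))

# Reported: OR = 2.52 per primary biopsy Gleason grade increase.
# The 20% baseline probability of LNI is a hypothetical value for illustration.
prob_after_grade_increase = shift_probability(0.20, 2.52)
```

Under these assumptions a 10-point higher risk score corresponds to roughly 1.6 times the odds of LNI, with all other predictors held fixed.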

Table 2

A nomogram was then developed (Fig. 7) to display the multivariable effect of each predictor on the risk of LNI. The calibration plot of the predicted probabilities against the observed LNI rates indicated good concordance (Fig. 8A). Additionally, the decision curve analysis demonstrated that the full model had the highest clinical net benefit across the entire range of threshold probabilities (Fig. 8B).
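The quantity plotted in a decision curve is the net benefit at each threshold probability, commonly defined as TP/n − (FP/n) × pt/(1 − pt). The sketch below computes it for a model and for the "treat all" reference strategy. It is a minimal illustration on made-up predictions, not a reproduction of Fig. 8B, and it omits the cross-validation correction mentioned in the methods.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def decision_curve(probs, labels, thresholds):
    """Net benefit of the model and of 'treat all' at each threshold."""
    return [
        (t, net_benefit(probs, labels, t), net_benefit_treat_all(labels, t))
        for t in thresholds
    ]


# Made-up predicted LNI probabilities and pathologic outcomes (1 = LNI).
toy_probs = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_curve = decision_curve(toy_probs, toy_labels, [0.05, 0.25, 0.5])
```

A model is clinically useful at a given threshold when its net benefit exceeds both the "treat all" value and zero, the net benefit of treating no one.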

Figure 7

Figure 8

4 Discussion

Although ePLND represents the gold standard of treatment in LNI PCa,[2] given the increased and serious complications related to this procedure,[18] an ePLND should be considered only in men with very high risk of LNI. Therefore, accurate identification of these high-risk patients could greatly help to avoid unnecessary ePLND treatment. In the present study, on the basis of traditional predictive variables, we incorporated the risk scores of the gene signature and developed a novel nomogram to predict LNI in PCa patients. We showed that the nomogram has high accuracy in detecting LNI (AUC value = 90.2%); in addition, its calibration also has good concordance between predicted and observed LNI probabilities. The decision curve analysis demonstrated that the full model had the highest clinical net benefit across the entire range of threshold probabilities.

Currently, there are several nomograms[19–21] that predict the occurrence of LNI in PCa patients. The accuracy of the first nomogram was not high (AUC = 76%), and the latest nomogram had higher accuracy, but it was based on biopsy data. The previous prediction nomogram was based on detailed biopsy data obtained at the central pathologic review.[19] This highlights the need for simple and efficient methods that can detect LNI in PCa patients with increased accuracy. In our study, the nomogram was built on predictive variables, including a 7 gene signature-based risk score, serum PSA level, clinical stage, primary biopsy Gleason grade, and secondary biopsy Gleason grade. We showed that combining existing clinical variables with our newly developed gene signature-based risk scores could enhance the detection accuracy of LNI in PCa patients. We note that these predictive variables were all convenient parameters; they do not depend on preoperative biopsies or image-based examination. Hence, our nomogram is a non-invasive and simple method for predicting LNI.

Prior studies have noted the importance of gene signatures in the prognosis of PCa. Jin et al[22] demonstrated that an NF-kB-activated recurrence predictor 21 gene signature could predict disease-specific survival and distant metastasis-free survival in patients with PCa, although the study used molecular biological methods. Recently, another study[23] identified a 24-gene signature that was significantly associated with the development of metastasis and prostate cancer-specific mortality after radical prostatectomy. However, the importance of gene signatures in predicting LNI in PCa is not fully understood. In the present study, the predictive accuracy of the 7 genes to distinguish tissues with LNI from those without LNI was measured by ROC curve analysis. In the logistic regression analysis, the gene signature-based risk score was also the single most powerful predictor of LNI. The 7-gene signature could be of particular use in situations when predictions of the occurrence of LNI are ambiguous or borderline. Additionally, the 7-gene signature in our analysis could promote the initiation of additional therapies to treat LNI and allow for personalized treatment for patients.

In our study, by using bioinformatic analyses, we found that the 7 gene signature-based risk score was an important independent predictor of LNI. Clearly, we need to further understand how the 7 key genes affect the development of LNI. A previous study reported that CTNNAL1 was associated with pelvic lymph node metastasis in early-stage cervical cancer;[24] in addition, CTNNAL1 can downregulate E-cadherin and promote melanoma progression and invasion.[25] It has been demonstrated that the MBD4 gene was associated with PCa progression, and MBD4 was upregulated in metastatic PCa samples compared with primary tumors.[26] SF3B2, as one of the genes of the spliceosome pathway, was overexpressed in hepatocellular carcinoma,[27] but its specific role in PCa remains unknown. It has been reported that MAP6D1 was overexpressed in late stage clear cell renal cell carcinoma.[28] Based on the evidence described above, these genes appear to play important roles in the progression or metastasis of malignant tumors.

By applying novel analysis methods, we developed a new nomogram for predicting the occurrence of LNI in PCa patients. However, this study has several limitations that should be noted. First, our study had a small sample size, which was obtained from TCGA. Our study included 320 PCa patients, while previous nomograms included >500 patients each.[3,4] Additionally, although our nomogram has good concordance between predicted and observed LNI probabilities, it was not validated by an external validation cohort from another hospital. Further research should be undertaken to verify the predictive accuracy of our nomogram. Moreover, we identified the top 7 hub genes (TREML1, CTNNAL1, ENSA, MAP6D1, MBD4, PRCC, and SF3B2), which were closely related to LNI in PCa. However, their expression was not validated in PCa tissue samples by quantitative real-time PCR (qRT-PCR).

In summary, we have established that the risk score of the 7-gene signature was associated with a high risk of LNI in PCa patients. The present model improves the ability to identify patients at a high risk of LNI, and it could provide a practical guide for clinicians to more accurately identify patients who require surgery with ePLND. In clinical practice, the application of our nomogram might contribute significantly to the selection of patient candidates for surgery with ePLND.

Acknowledgments

The authors would like to acknowledge the help provided by Zebin Zhu (The First Affiliated Hospital, University of Science and Technology of China), Jianming Zeng (CEO of biotrainee.com), and Qiannan Yang (wife of H-mC).

Author contributions

Conceptualization: Zi Wan, Chao Guan.

Data curation: Haiming Cao, Zi Wan, Hongyang Wang.

Formal analysis: Haiming Cao, Zi Wan.

Funding acquisition: Haiming Cao.

Investigation: Yu Wu.

Methodology: Haiming Cao, Yu Wu.

Project administration: Chao Guan.

Software: Haiming Cao.

Supervision: Hongyang Wang.

Validation: Haiming Cao.

Visualization: Haiming Cao.

Writing – original draft: Haiming Cao.

Writing – review & editing: Zi Wan, Chao Guan.

References

[1]. Wilczak W, Wittmer C, Clauditz T, et al. Marked prognostic impact of minimal lymphatic tumor spread in prostate cancer. Eur Urol 2018;74:376–86.
[2]. Mottet N, Bellmunt J, Bolla M, et al. EAU-ESTRO-SIOG guidelines on prostate cancer. Part 1: Screening, diagnosis, and local treatment with curative intent. Eur Urol 2017;71:618–29.
[3]. Briganti A, Larcher A, Abdollah F, et al. Updated nomogram predicting lymph node invasion in patients with prostate cancer undergoing extended pelvic lymph node dissection: the essential importance of percentage of positive cores. Eur Urol 2012;61:480–7.
[4]. Gandaglia G, Fossati N, Zaffuto E, et al. Development and internal validation of a novel model to identify the candidates for extended pelvic lymph node dissection in prostate cancer. Eur Urol 2017;72:632–40.
[5]. Gandaglia G, Ploussard G, Valerio M, et al. A novel nomogram to identify candidates for extended pelvic lymph node dissection among patients with clinically localized prostate cancer diagnosed with magnetic resonance imaging-targeted and systematic biopsies. Eur Urol 2018;75:506–14.
[6]. Hovels AM, Heesakkers RA, Adang EM, et al. The diagnostic accuracy of CT and MRI in the staging of pelvic lymph nodes in patients with prostate cancer: a meta-analysis. Clin Radiol 2008;63:387–95.
[7]. von Eyben FE, Kairemo K. Meta-analysis of (11)C-choline and (18)F-choline PET/CT for management of patients with prostate cancer. Nucl Med Commun 2014;35:221–30.
[8]. Benzon B, Zhao SG, Haffner MC, et al. Correlation of B7-H3 with androgen receptor, immune pathways and poor outcome in prostate cancer: an expression-based analysis. Prostate Cancer Prostatic Dis 2017;20:28–35.
[9]. Cooperberg MR, Erho N, Chan JM, et al. The diverse genomic landscape of clinically low-risk prostate cancer. Eur Urol 2018;74:444–52.
[10]. Walker SM, Knight LA, McCavigan AM, et al. Molecular subgroup of primary prostate cancer presenting with metastatic biology. Eur Urol 2017;72:509–18.
[11]. Chen C, Cai Q, He W, et al. AP4 modulated by the PI3K/AKT pathway promotes prostate cancer proliferation and metastasis of prostate cancer via upregulating L-plastin. Cell Death Dis 2017;8:e3060.
[12]. Lu X, Pan X, Wu CJ, et al. An in vivo screen identifies PYGO2 as a driver for metastatic prostate cancer. Cancer Res 2018;78:3823–33.
[13]. Oh JJ, Park S, Lee SE, et al. A clinicogenetic model to predict lymph node invasion by use of genome-based biomarkers from exome arrays in prostate cancer patients. Korean J Urol 2015;56:109–16.
[14]. Cai W, Li Y, Huang B, et al. Esophageal cancer lymph node metastasis-associated gene signature optimizes overall survival prediction of esophageal cancer. J Cell Biochem 2019;120:592–600.
[15]. Chen X, Wang YW, Zhu WJ, et al. A 4-microRNA signature predicts lymph node metastasis and prognosis in breast cancer. Hum Pathol 2018;76:122–32.
[16]. Kang S, Thompson Z, McClung EC, et al. Gene expression signature-based prediction of lymph node metastasis in patients with endometrioid endometrial cancer. Int J Gynecol Cancer 2018;28:260–6.
[17]. Sonohara F, Gao F, Iwata N, et al. Genome-wide discovery of a novel gene-expression signature for the identification of lymph node metastasis in esophageal squamous cell carcinoma. Ann Surg 2017;269:879–86.
[18]. Ploussard G, Briganti A, de la Taille A, et al. Pelvic lymph node dissection during robot-assisted radical prostatectomy: efficacy, limitations, and complications-a systematic review of the literature. Eur Urol 2014;65:7–16.
[19]. Briganti A, Chun FK, Salonia A, et al. Validation of a nomogram predicting the probability of lymph node invasion among patients undergoing radical prostatectomy and an extended pelvic lymphadenectomy. Eur Urol 2006;49:1019–26. discussion 1026-1027.
[20]. Briganti A, Karakiewicz PI, Chun FK, et al. Percentage of positive biopsy cores can improve the ability to predict lymph node invasion in patients undergoing radical prostatectomy and extended pelvic lymph node dissection. Eur Urol 2007;51:1573–81.
[21]. Godoy G, Chong KT, Cronin A, et al. Extent of pelvic lymph node dissection and the impact of standard template dissection on nomogram prediction of lymph node involvement. Eur Urol 2011;60:195–201.
[22]. Jin R, Yi Y, Yull FE, et al. NF-kappaB gene signature predicts prostate cancer progression. Cancer Res 2014;74:2763–72.
[23]. Pellegrini KL, Sanda MG, Patil D, et al. Evaluation of a 24-gene signature for prognosis of metastatic events and prostate cancer-specific mortality. BJU Int 2017;119:961–7.
[24]. Noordhuis MG, Fehrmann RS, Wisman GB, et al. Involvement of the TGF-beta and beta-catenin pathways in pelvic lymph node metastasis in early-stage cervical cancer. Clin Cancer Res 2011;17:1317–30.
[25]. Kreiseder B, Orel L, Bujnow C, et al. alpha-Catulin downregulates E-cadherin and promotes melanoma progression and invasion. Int J Cancer 2013;132:521–30.
[26]. Bianco-Miotto T, Chiam K, Buchanan G, et al. Global levels of specific histone modifications and an epigenetic gene signature predict prostate cancer progression and development. Cancer Epidemiol Biomarkers Prev 2010;19:2611–22.
[27]. Xu W, Huang H, Yu L, et al. Meta-analysis of gene expression profiles indicates genes in spliceosome pathway are up-regulated in hepatocellular carcinoma (HCC). Med Oncol 2015;32:96.
[28]. Bhalla S, Chaudhary K, Kumar R, et al. Gene expression-based biomarkers for discriminating early and late stage of clear cell renal cancer. Sci Rep 2017;7:44997.
Keywords:

gene signature; lymph node invasion; nomogram; prostate cancer

Copyright © 2019 The Authors. Published by Wolters Kluwer Health, Inc. All rights reserved.