3.2 Integrative miRNA and mRNA model in TCGA ccRCC
Combining LASSO with multivariate Cox regression yielded 6 RNAs, which are listed in Table 3. Of these 6 RNAs, 3 mRNAs and 1 miRNA were protective (HR < 1) and the other 2 mRNAs were risk-associated (HR > 1). The coefficients from the multivariate Cox regression were used to calculate the PI for ccRCC.
As a linear combination of the expression values of the 6 RNAs, the PI was significantly associated with OS in ccRCC [HR = 7.13, 95% confidence interval (CI) = 3.71–13.70, P < .001]. The HR of the PI was greater than the HR of grade (HR = 2.20, 95% CI = 1.18–4.08, P = .012) or T stage (HR = 2.68, 95% CI = 1.62–4.41, P < .001). Patients with ccRCC were ranked by PI value (Fig. 1A), and the median PI was used as the threshold to classify patients into a high-risk group and a low-risk group. The gene signature significantly separated ccRCC patients by survival time (Fig. 1B): survival in the high-risk group was significantly shorter than in the low-risk group by log-rank test (P < .001). Expression of the identified RNAs in the high-risk and low-risk groups is shown in Figure 1C, and an AUC of 0.748 at 3 years indicated that the model performed well in predicting the prognosis of ccRCC (Fig. 1D).
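The PI calculation and the median split described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the RNA names, coefficients, and expression values are hypothetical placeholders (the fitted coefficients are those in Table 3), the function names are introduced here, and assigning patients with a PI strictly above the median to the high-risk group is an assumption about how ties are handled.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the fitted values are in Table 3.
# Negative coefficients correspond to protective RNAs (HR < 1),
# positive coefficients to risk-associated RNAs (HR > 1).
COEFFICIENTS = {
    "rna_a": -0.40,
    "rna_b": 0.30,
    "rna_c": -0.20,
}


def prognostic_index(expression, coefficients=COEFFICIENTS):
    """Return the PI: the sum of coefficient * expression over the signature RNAs."""
    return sum(beta * expression[name] for name, beta in coefficients.items())


def split_by_median(pi_by_patient):
    """Label a patient 'high' if the PI exceeds the cohort median, else 'low'."""
    cutoff = median(pi_by_patient.values())
    return {
        pid: ("high" if pi > cutoff else "low")
        for pid, pi in pi_by_patient.items()
    }


# Toy cohort with made-up expression values.
cohort = {
    "p1": {"rna_a": 2.0, "rna_b": 1.0, "rna_c": 1.0},
    "p2": {"rna_a": 0.5, "rna_b": 3.0, "rna_c": 0.5},
    "p3": {"rna_a": 1.0, "rna_b": 2.0, "rna_c": 2.0},
    "p4": {"rna_a": 3.0, "rna_b": 0.5, "rna_c": 1.5},
}
pi_values = {pid: prognostic_index(expr) for pid, expr in cohort.items()}
risk_groups = split_by_median(pi_values)
```

The two resulting groups are what the log-rank test and the survival curves then compare.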
3.3 Validating the result in an independent dataset
The GSE22541 dataset was used as an independent dataset to validate the above result. It contains 24 primary ccRCC tumors together with the disease-free survival (DFS) time of the patients. Although this dataset does not include OS time, DFS also reflects patient outcome. Because microRNA data were not available, only the mRNA data were used for validation. The validation result is shown in Figure 2.
Figure 2 shows that the 5 mRNAs significantly separated patients into high-risk and low-risk groups (P = .03). The PI was significantly associated with DFS in the independent ccRCC dataset (HR = 2.77, 95% CI = 1.07–7.71). The AUC of 0.762 at 3 years also indicated that the model performed well. These results demonstrate that the integrative model could effectively classify patients.
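The group comparison reported above relies on the log-rank test. The sketch below shows the standard two-group form of that test in plain Python, assuming right-censored data given as (time, event) pairs. It is illustrative only: the function name and the toy data are introduced here, and the study itself presumably used an established statistics package.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = 0.0  # events observed in group A
    expected_a = 0.0  # events expected in group A under the null hypothesis
    variance = 0.0    # hypergeometric variance summed over event times
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)               # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)         # events at time t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy example: every patient in group A has the event before anyone in group B.
chi2, p = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

With only a handful of patients per group the chi-square approximation is rough, which is one reason small validation cohorts give wide confidence intervals.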
3.4 Performance of other ccRCC gene signatures in 4 datasets
To further validate the result, we tested the model in other ccRCC datasets from TCGA. We also evaluated 3 previously published gene signature models in the 4 datasets; these models are listed in Table 4.
To estimate the performance of the gene signature models, 3 indicators (HR, C-index, and AUC) were calculated for each prognostic model. We then analyzed the correlation among these 3 indicators for each of the 4 models (Fig. 3).
The 3 indicators of the Boguslawska model showed the strongest collinearity (Fig. 3D). Such collinearity suggests that the model has good generalization ability. One HR value for our model is missing: the integrative model yielded a null HR in the TCGA_GA450 dataset because no end event occurred in the low-risk group classified by the model.
To compare the performance of the gene signature models, box plots were used to examine the variation of the 3 indicators (HR, C-index, and AUC) for all models across the 4 datasets (Fig. 4). These indicators reflect the predictive capacity of a model, and higher values represent better performance. The box plots also show the dispersion of each gene signature model across datasets. The integrative model from our work had higher HR, C-index, and AUC than the other models across all datasets.
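Of the 3 indicators, the C-index is the easiest to state directly in code. The sketch below implements Harrell's concordance index in plain Python as an illustration. The function name and toy data are introduced here. Counting tied risk scores as half-concordant and skipping pairs with tied times are common conventions assumed for this sketch, not details taken from the study.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i had the event strictly before
    time j. It is concordant when patient i also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the third patient is censored (event indicator 0).
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
c_perfect = concordance_index(times, events, [4, 3, 2, 1])
c_reversed = concordance_index(times, events, [1, 2, 3, 4])
c_constant = concordance_index(times, events, [1, 1, 1, 1])
```

A value of 0.5 corresponds to a score that carries no ranking information, which is the baseline against which the models are judged.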
3.5 Gene Ontology enrichment of 4 gene signature models
The results in Figure 4 show that the 3 indicators of the integrative model were higher than those of the other models. We therefore analyzed the GO enrichment and the pathways in which these models are involved. Because each gene signature model contains too few genes for GO enrichment analysis, the transcription factors (TFs) of the signature genes were included in the pathway analysis. The regulatory network of TFs and genes was constructed as described in the Methods section (Fig. 5). In the integrative gene signature model from our work, 4 genes were regulated by 13 TFs (Fig. 5A). The width of each line represents the weight of the regulation, given by the correlation coefficient of the expression levels. The regulatory networks of the other gene signature models are shown in Figure 5B, C, and D, respectively. These genes shared some common TFs, such as STAT4, ETS1, and FOXP3.
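The edge weighting described above can be sketched as follows. This is a minimal illustration, assuming the weight is the Pearson correlation between the expression profiles of a TF and its target gene. The TF and gene identifiers, the expression vectors, and the function names are hypothetical placeholders and do not reproduce the networks in Figure 5.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def weighted_edges(tf_targets, expression):
    """Return (TF, gene, correlation) triples for every TF-target pair."""
    edges = []
    for tf, targets in tf_targets.items():
        for gene in targets:
            r = pearson(expression[tf], expression[gene])
            edges.append((tf, gene, r))
    return edges


# Hypothetical TF-target pairs and expression profiles across 4 samples.
tf_targets = {"TF_1": ["gene_1", "gene_2"]}
expression = {
    "TF_1": [1.0, 2.0, 3.0, 4.0],
    "gene_1": [2.0, 4.0, 6.0, 8.0],
    "gene_2": [4.0, 3.0, 2.0, 1.0],
}
edges = weighted_edges(tf_targets, expression)
```

When drawing the network, the absolute value of each correlation could serve as the line width, with the sign kept separately to distinguish positive from negative co-expression.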
To further investigate the GO enrichment and pathways of these genes and TFs, the clusterProfiler package was used to analyze the 4 models. This package can compare the biological process, cellular component, molecular function, and KEGG pathway results across the 4 models (Fig. 6). The biological process results suggested that the 4 gene signature models share many similar processes (Fig. 6A). The cellular component results showed that the integrative model was similar to the model of Boguslawska, and that the model of Zhan was similar to the model of Yao (Fig. 6B). The molecular function enrichment showed that the 4 models were very similar (Fig. 6C). In the KEGG pathway comparison, the integrative model was involved in more cancer-associated pathways (Fig. 6D). The model of Boguslawska et al was very complex and was mainly involved in many signaling pathways associated with cancer. Although the gene signature models and their TFs are very different, their biological processes and pathways were very similar.
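Over-representation analysis of the kind clusterProfiler performs rests on a hypergeometric test. The sketch below computes that upper-tail probability in plain Python. It is an illustration only: the function and argument names are introduced here, the numbers are made up, and the multiple-testing correction and background gene set definitions used by the package are not reproduced.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of background genes
    annotated:  background genes annotated to the term
    sample:     number of query genes (signature genes plus TFs)
    overlap:    query genes annotated to the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up numbers: 10 background genes, 4 annotated, 3 query genes, 2 in the term.
p_term = enrichment_p_value(population=10, annotated=4, sample=3, overlap=2)
```

With only a few signature genes, `sample` is tiny and almost no term can reach significance, which is consistent with the decision to add the predicted TFs before enrichment.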
The present study combined LASSO and multivariate Cox regression to build a prognostic gene signature model from integrated microRNA and mRNA expression data in the TCGA dataset. Another TCGA platform and a GEO dataset were used as validation datasets. Previous studies have provided many biomarkers for predicting the prognosis of ccRCC. In this study, we propose 5 mRNAs and 1 microRNA (INTS8, GTPBP2, ANK3, SLC16A12, LIMCH1, and hsa-mir-374a) as a robust gene signature model that can effectively predict the prognosis of ccRCC. In addition, we found a regulatory pair, hsa-mir-374a and ANK3, in TargetScan.
Of these genes, INTS8, ANK3, and LIMCH1 have been associated with renal cancer in previous publications.[36–38] To the best of our knowledge, GTPBP2 and SLC16A12 have not been reported to be associated with kidney cancer. Although hsa-mir-374a has been associated with cancer in many reports, it has not been studied in ccRCC. Previous studies have shown that hsa-mir-374a (HR = 0.64, 95% CI: 0.48–0.86) can reduce the risk of colorectal cancer. Our findings in kidney cancer were similar (HR = 0.51, 95% CI: 0.29–0.89), so we hypothesize that hsa-mir-374a could reduce the risk of death. This 6-gene signature showed a robust ability to predict the prognosis of ccRCC.
Gene signatures for prognosis are generally derived from Cox regression. However, different data preprocessing and different Cox regression steps may lead to different results. This study combined differential expression analysis, univariate Cox regression, LASSO, and multivariate Cox regression to obtain a gene signature for the prognosis of ccRCC. In addition, 3 indicators (HR, C-index, and AUC) were used to evaluate all models at the systems level (Fig. 5). The results showed that the integrative model had advantages over the others.
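The LASSO step in this pipeline can be written, in one common formulation, as an L1-penalized Cox partial likelihood. The notation below is introduced here for illustration and is not taken from the study: x_i is the expression vector of patient i over the p candidate RNAs, δ_i is the event indicator, R(t_i) is the set of patients still at risk at time t_i, and λ is the penalty weight. Tied event times are ignored for simplicity.

```latex
\hat{\beta}^{\mathrm{LASSO}}
  = \arg\max_{\beta}
    \left[
      \sum_{i:\,\delta_i = 1}
        \left(
          x_i^{\top}\beta
          - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)
        \right)
      - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert
    \right],
\qquad
S = \left\{ k : \hat{\beta}^{\mathrm{LASSO}}_k \neq 0 \right\},
\qquad
\mathrm{PI}_i = \sum_{k \in S} \hat{\beta}^{\mathrm{Cox}}_k \, x_{ik}
```

Here S is the set of RNAs retained by LASSO, and the coefficients in the PI come from the multivariate Cox regression refit on S, matching the order of steps described in the text.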
Although our results show more advantages, this does not mean that the other models are poor. The various gene signature models previously proposed are all considered predictive of prognosis. In fact, different gene signatures share similar pathways while each has its own specific functions. The shared pathways may perform similar functions that affect prognosis, and the distinct pathways may represent the heterogeneity of ccRCC.
In this work, the integrative model was mainly involved in viral infection and inflammatory bowel disease (IBD)-related pathways. In the literature, there are few reports of viral infection associated with ccRCC. However, there are many reports on the relationship between IBD and renal cancer.[40,41] Although this work could not reveal the relationship between IBD and the prognosis of ccRCC, the result may provide new insight for further study of ccRCC.
In addition, the available gene expression and clinical data for ccRCC are very limited, which makes further verification difficult. We used different platforms of the TCGA dataset and a GEO dataset as independent datasets for training and validation. For further validation, we tested 3 other gene signature models (derived by Cox regression) in the different datasets. The integrative model showed greater stability and versatility in the TCGA and GSE22541 datasets.
Because of the limited data available, the data we obtained may be biased. However, the gene markers obtained by LASSO coupled with multivariate Cox regression were indeed more stable across public databases. In this study, we propose optimized steps for analyzing prognostic gene markers by Cox regression. In addition, when there are too few gene markers for functional enrichment by GO analysis, the enrichment can be analyzed further by predicting their TFs. We expect this approach to identify more, and more stable, genetic markers and to provide a more scientific reference for drug development and clinical decision-making.
The authors thank the staff of the Evidence Based Medicine Center for their help with literature retrieval.
Conceptualization: Peng Chang, Kehu Yang.
Data curation: Jingyun Zhang.
Formal analysis: Peng Chang, Juan Ling.
Funding acquisition: Zhitong Bing.
Investigation: Jinhui Tian, Xiuxia Li, Yumin Li.
Methodology: Peng Chang, Xiuxia Li.
Project administration: Juan Ling.
Resources: Jingyun Zhang, Long Ge.
Software: Zhitong Bing, Jinhui Tian, Jingyun Zhang, Long Ge.
Supervision: Kehu Yang.
Validation: Zhitong Bing.
Visualization: Jinhui Tian, Yumin Li.
Writing – original draft: Peng Chang, Zhitong Bing, Yumin Li.
Writing – review and editing: Peng Chang, Kehu Yang.
1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA Cancer J Clin 2018;68:7–30.
2. Jemal A, Siegel R, Ward E, et al. Cancer statistics, 2007. CA Cancer J Clin 2007;57:43–66.
3. Christinat Y, Krek W. Integrated genomic analysis identifies subclasses and prognosis signatures of kidney cancer. Oncotarget 2015;6:10521–31.
4. Takahashi M, Rhodes DR, Furge KA, et al. Gene expression profiling of clear cell renal cell carcinoma: gene identification and prognostic classification. Proc Natl Acad Sci U S A 2001;98:9754–9.
5. Kosari F, Parker AS, Kube DM, et al. Clear cell renal cell carcinoma: gene expression analyses identify a potential signature for tumor aggressiveness. Clin Cancer Res 2005;11:5128.
6. Sültmann H, Heydebreck AV, Huber W, et al. Gene expression in kidney cancer is associated with cytogenetic abnormalities, metastasis formation, and patient survival. Clin Cancer Res 2005;11(2 pt 1):646.
7. Yao M, Tabuchi H, Nagashima Y, et al. Gene expression analysis of renal carcinoma: adipose differentiation-related protein as a potential diagnostic and prognostic biomarker for clear-cell renal carcinoma. J Pathol 2005;205:377.
8. Zhao H, Ljungberg B, Grankvist K, et al. Gene expression profiling predicts survival in conventional renal cell carcinoma. PLoS Med 2006;3:e13.
9. Yao M, Huang Y, Shioi K, et al. A three-gene expression signature model to predict clinical outcome of clear cell renal carcinoma. Int J Cancer 2008;123:1126–32.
10. Mertz K, Demichelis FA, Hirsch M, et al. Association of cytokeratin 7 and 19 expression with genomic stability and favorable prognosis in clear cell renal cell cancer. Int J Cancer 2008;123:569.
11. Heinzelmann J, Henning B, Sanjmyatav J, et al. Specific miRNA signatures are associated with metastasis and poor prognosis in clear cell renal cell carcinoma. World J Urol 2011;29:367–73.
12. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 2013;499:43–9.
13. Brooks SA, Brannon AR, Parker JS, et al. ClearCode34: a prognostic risk predictor for localized clear cell renal cell carcinoma. Eur Urol 2014;66:77.
14. Gulati S, Martinez P, Joshi T, et al. Systematic evaluation of the prognostic impact and intratumour heterogeneity of clear cell renal cell carcinoma biomarkers. Eur Urol 2014;66:936–48.
15. Heinzelmann J, Unrein A, Wickmann U, et al. MicroRNAs with prognostic potential for metastasis in clear cell renal cell carcinoma: a comparison of primary tumors and distant metastases. Ann Surg Oncol 2014;21:1046–54.
16. Fu H, Liu Y, Xu L, et al. Galectin-9 predicts postoperative recurrence and survival of patients with clear-cell renal cell carcinoma. Tumour Biol 2015;36:5791–9.
17. Ge YZ, Wu R, Xin H, et al. A tumor-specific microRNA signature predicts survival in clear cell renal cell carcinoma. J Cancer Res Clin Oncol 2015;141:1291–9.
18. Kim HL, Halabi S, Li P, et al. A molecular model for predicting overall survival in patients with metastatic clear cell renal carcinoma: results from CALGB 90206 (Alliance). EBioMedicine 2015;2:1814–20.
19. Rini B, Goddard A, Knezevic D, et al. A 16-gene assay to predict recurrence after surgery in localised renal cell carcinoma: development and validation studies. Lancet Oncol 2015;16:676–85.
20. Tang K, Xu H. Prognostic value of meta-signature miRNAs in renal cell carcinoma: an integrated miRNA expression profiling analysis. Sci Rep 2015;5:10272.
21. Zhan Y, Guo W, Zhang Y, et al. A five-gene signature predicts prognosis in patients with kidney renal clear cell carcinoma. Comput Math Methods Med 2015;2015:1–7.
22. Boguslawska J, Kedzierska H, Poplawski P, et al. Expression of genes involved in cellular adhesion and ECM-remodelling correlates with poor survival of renal cancer patients. J Urol 2016;195:1892–902.
23. Dai J, Lu Y, Wang J, et al. A four-gene signature predicts survival in clear-cell renal-cell carcinoma. Oncotarget 2016;7:82712.
24. de Velasco G, Culhane AC, Fay AP, et al. Molecular subtypes improve prognostic value of international metastatic renal cell carcinoma database consortium prognostic model. Oncologist 2017;22:286–92.
25. Ge YZ, Wu R, Xin H, et al. A tumor-specific microRNA signature predicts survival in clear cell renal cell carcinoma. J Cancer Res Clin Oncol 2015;141:1291.
26. Wu X, Weng L, Li X, et al. Identification of a 4-microRNA signature for clear cell renal cell carcinoma metastasis and prognosis. PLoS One 2012;7:e35661.
27. Liang B, Zhao J, Wang X. A three-microRNA signature as a diagnostic and prognostic marker in clear cell renal cancer: an in silico analysis. PLoS One 2017;12:e0180660.
28. Ran L, Liang J, Deng X, et al. miRNAs in prediction of prognosis in clear cell renal cell carcinoma. BioMed Res Int 2017;2017:1–6.
29. Wuttig D, Zastrow S, Füssel S, et al. CD31, EDNRB and TSPAN7 are promising prognostic markers in clear-cell renal cell carcinoma revealed by genome-wide expression analyses of primary tumors and metastases. Int J Cancer 2012;131:E693–704.
30. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47.
31. Agarwal V, Bell GW, Nam J-W, et al. Predicting effective microRNA target sites in mammalian mRNAs. eLife 2015;4:e05005.
32. Enright AJ, John B, Gaul U, et al. MicroRNA targets in Drosophila. Genome Biol 2003;5:R1.
33. Yu G, Wang LG, Han Y, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 2012;16:284–7.
34. Kathrin P, Jan S, Michaela N, et al. How microRNA and transcription factor co-regulatory networks affect osteosarcoma cell proliferation. PLoS Comput Biol 2013;9:e1003210.
35. Gentleman RC, Carey VJ, Bates DM, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004;5:1–6.
36. Federico A, Rienzo M, Abbondanza C, et al. Pan-cancer mutational and transcriptional analysis of the integrator complex. Int J Mol Sci 2017;18:936.
37. Morris MR, Ricketts CJ, Gentle D, et al. Genome-wide methylation analysis identifies epigenetically inactivated candidate tumour suppressor genes in renal cell carcinoma. Oncogene 2011;30:1390–401.
38. Eckel-Passow JE, Serie DJ, Bot BM, et al. ANKS1B is a smoking-related molecular alteration in clear cell renal cell carcinoma. BMC Urol 2014;14:14.
39. Slattery ML, Herrick JS, Mullany LE, et al. An evaluation and replication of miRNAs with disease stage and colorectal cancer-specific mortality. Int J Cancer 2015;137:428–38.
40. Tsianos EV, Katsanos KH, Christodoulou D, et al. The epidemiological profile of inflammatory bowel disease in different parts of North-West Greece. Ann Gastroenterol 2005;18:434–40.
41. Fialho A, Fialho A, Shabbir A, et al. Su1812 renal cancer is associated with the use of immunomodulators in patients with inflammatory bowel disease. Gastroenterology 2016;150:S559–60.
Keywords: clear cell renal cell carcinoma; Cox regression; gene regulatory network; least absolute shrinkage and selection operator; prognosis
Copyright © 2018 The Authors. Published by Wolters Kluwer Health, Inc. All rights reserved.