Gene Expression Profiles in Cancers and Their Therapeutic Implications : The Cancer Journal


Review Articles

Gene Expression Profiles in Cancers and Their Therapeutic Implications

Creighton, Chad J. PhD

The Cancer Journal 29(1):p 9-14, 1/2 2023. | DOI: 10.1097/PPO.0000000000000638


For more than 20 years, the research community has extensively profiled human cancers for gene expression, with the associated data, representing thousands of studies, being made available in the public domain. Of the various “-omics” levels in cancer that can be profiled, transcriptomics has the most data generated to date, given the early adoption by academic laboratories of DNA microarrays, starting in the late 1990s.1,2 With the advent of next-generation sequencing,3 RNA sequencing (RNA-seq) as a transcriptomics platform has become increasingly common. Gene expression includes protein as well as mRNA, and the two are not always strongly correlated.4,5 Historically, proteomic profiling has presented additional challenges over transcriptomics, given the diverse chemistries that proteins represent, requiring experienced laboratories. Reverse-phase protein arrays (RPPAs), typically representing 150 to 300 targeted proteins, have been more widely adopted as a proteomics profiling platform in recent years.6 Also, recent technological advancements in mass spectrometry–based proteomics, profiling thousands of proteins, have accelerated its application to the study of ever greater numbers of cancer specimens.7,8

In addition to gene expression profiling data generated by individual laboratories for smaller and more independent studies, major team science efforts have generated multi-omics data on thousands of human tumors of various cancer types defined by tumor lineage or histology. The Cancer Genome Atlas (TCGA) consortium, which ran from 2006 to 2018, generated multi-omics data, including RNA-seq and reverse-phase protein array (RPPA) proteomic data, on more than 10,000 human tumors.9,10 Parallel to TCGA efforts focused mainly within the United States, the International Cancer Genome Consortium carried out multi-omics profiling of thousands of cancers on a similar scale, with the cooperation of multiple countries.11 In recent years, the Clinical Proteomic Tumor Analysis Consortium and the International Cancer Proteogenome Consortium have generated multi-omics data on more than 2000 human cancers,5 including mass spectrometry–based proteomics.

The vast amount of gene expression profiling data made available by published studies and consortiums represents a valuable resource for ongoing studies. As no original study can comprehensively mine an expression profile data set for all genes of potential relevance, future studies may analyze previously published data with different questions in mind from those of the original authors. This review provides a broad overview of gene expression profiling in cancer and the types of findings made using these data. The figures of this review showcase specific examples of accessing public cancer gene expression data sets and generating unique views of the data and the resulting genes of interest. Due partly to space constraints, this review focuses on expression profiling of bulk tumors and cell lines; single-cell RNA sequencing (scRNA-seq), which profiles individual cells within a tumor, represents another expression platform.12


Due in part to the advent of gene expression profiling technologies, it is now widely understood that multiple and distinct molecular subtypes exist within any given cancer type as defined by tissue of origin. Early studies of breast cancer using DNA microarrays13,14 revealed 5 major gene expression–based subtypes: luminal A, luminal B, ERBB2+, basal-like, and normal-like. These subtypes reflected previous observations of breast cancer subtypes based on histology,13 with the luminal subtypes expressing the estrogen receptor, denoting sensitivity to endocrine (antiestrogen) therapy, and the ERBB2+ subtype expressing the HER2 receptor, denoting sensitivity to therapies blocking HER2. Breast cancer might represent the most well-known example of molecular subtypes having therapeutic implications. Gene expression profiling of other tissue-based cancer types has also defined molecular subtypes existing within these diseases. For example, for most cancer types studied by TCGA consortium, expression-based subtypes could be defined.15,16 These subtypes may involve histologic features of the cancer cells (e.g., basal, luminal, or squamous characteristics), cancer cell differentiation level, associated DNA-level mutations, or infiltration of noncancer cells (including immune cells or fibroblasts).

Beyond identifying molecular subtypes within tissue-based cancer types, pan-cancer analyses can define subtypes that may either align closely with cell or tissue of origin9,17 or would transcend tumor lineage.5,15,18,19 One of the advantages of team science efforts such as TCGA is that tumors from different cancer types are often profiled by the same laboratory using the same analytical platform. This aspect should allow cross-cancer–type analyses defining molecular subtypes and associated pathways relevant to multiple cancer types. Figure 1 provides an example of using TCGA data to define pan-cancer molecular subtypes, reflecting the tissue of origin (Fig. 1A) or transcending tissue of origin (Fig. 1B), depending on the analytical approach used. In our pan-cancer study of TCGA RNA-seq data,15 we classified 10,224 cancers, representing 32 major types, into 10 molecular-based subtypes or “classes,” whereby we first computationally removed expression patterns representing dominant tissue or histologic effects. For example, one of our pan-cancer subtypes expressed neuroendocrine markers such as CHGA. Another subtype represented basal-like breast cancer and MYC expression. Two of our subtypes expressed mesenchymal markers (e.g., VIM). Another subtype expressed immune checkpoint pathway markers (e.g., CD274) and molecular signatures of immune infiltrates. Using mass spectrometry–based proteomics data from Clinical Proteomic Tumor Analysis Consortium and International Cancer Proteogenome Consortium, we could similarly identify pan-cancer subtypes reflected in the mRNA data, but with notable exceptions.5,19 For example, a proteomic-based subtype expressed proteins in the complement pathway, distinct from the subtype expressing lymphocytic markers.

Figure 1. Pan-cancer molecular subtypes as identified using different analytical approaches. A, Across 9716 tumors represented in TCGA data sets, TCGA Network previously defined 28 pan-cancer subtypes closely following the cancer tissue of origin.9 With the tumors ordered by molecular subtype, the heat map shows differential mRNA expression patterns (values normalized across all cancers to SDs from the median) for a select set of genes representing pathways of particular interest. CD274 indicates PD-L1 gene and immunotherapy target; CHGA, marker of neuroendocrine tumors; HIF1A, transcription factor inducing hypoxia; MKI67, proliferation marker; MYC, oncogene; VIM, vimentin gene and marker of mesenchymal cells; ZEB1, transcription factor activating epithelial-mesenchymal transition. B, Using an alternate analytical approach to define molecular subtypes that would transcend tumor lineage and tissue of origin, we could classify TCGA tumors into 10 major subtypes.15 The heat map shows differential mRNA expression patterns (values normalized within each cancer type to SDs from the median) for the same set of genes from part A. Whereas TCGA RNA-seq data sets allow for cross-cancer type comparisons, as carried out in defining the subtypes in part A,9 an alternative approach to molecular classification, represented in part B, involves computationally subtracting the gene expression differences between cancer types.18 As applied to TCGA RNA-seq data, this alternative approach had the effect of consolidating the individual subtypes that might be discoverable in individual cancer types into super-types or pan-cancer “classes” that transcend tissue or histology distinctions.
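The two normalization schemes named in the Figure 1 legend (across all cancers versus within each cancer type) can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the pipeline used in the cited studies: the function names, the choice of the sample SD, and the toy expression values are all hypothetical.

```python
from collections import defaultdict
from statistics import median, stdev


def sds_from_median(values):
    """Express each value as the number of SDs it lies from the median.

    Assumption: the sample SD is used and at least 2 values are given.
    """
    center = median(values)
    spread = stdev(values)
    if spread == 0:
        return [0.0 for _ in values]
    return [(v - center) / spread for v in values]


def normalize_within_groups(values, groups):
    """Normalize values separately within each group (e.g., cancer type)."""
    index_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        index_by_group[g].append(i)
    out = [0.0] * len(values)
    for idx in index_by_group.values():
        scaled = sds_from_median([values[i] for i in idx])
        for i, s in zip(idx, scaled):
            out[i] = s
    return out


# Hypothetical log2 expression of one gene in six tumors of two cancer types.
expr = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
cancer_type = ["A", "A", "A", "B", "B", "B"]

across = sds_from_median(expr)                        # part A style
within = normalize_within_groups(expr, cancer_type)   # part B style
```

In this toy case, the across-cancer values are dominated by the difference between the two cancer types, whereas the within-type values give the same pattern for both types, leaving only the variation among tumors of the same type.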


Gene expression profiles of tumor samples taken from the initial surgery can predict the patient's eventual outcome. Early studies first demonstrated this means of prognostication in breast cancer, establishing a 70-gene prognosis profile that could segregate patients into good versus poor prognosis,20,21 consistent with patient follow-up data. Studies from other groups could establish prognostic gene signatures in most other cancer types, including lung,22,23 prostate,24,25 colon,26 medulloblastoma,27 leukemia,28 lymphoma,29 and so on. Gene signature information has generally represented an independent factor in predicting disease outcome, along with relevant clinical variables such as age, tumor size, histology, pathological grade, and so on.20 Given the clinical application of cancer patient prognosis, commercial gene panel assays with genes selected based on gene expression profiling data have been developed and approved for clinical use, such as the Oncotype DX assays for breast,30 colon,31 and prostate32 cancers. A prognostic gene signature may consist of a discrete number of genes, often a function of statistical methods and cutoffs. At the same time, many more genes not included in a given signature may also have prognostic information.

In addition to their potential for clinical application, prognostic gene signatures can provide molecular clues regarding the biological drivers and pathways underlying aggressive cancers. Genes that may inform tumor biology would not be limited to the top ~100 most significant genes but could additionally involve hundreds of genes that meet statistical significance for survival association. An example of gaining insight from gene survival correlates involves the author's work with TCGA consortium in clear cell renal cell carcinoma,33 where we defined molecular correlates of patient survival at mRNA, microRNA, protein, and DNA methylation levels. When viewed in the context of metabolism, aggressive renal cancers demonstrated evidence of a metabolic shift, involving downregulation of TCA cycle genes, decreased AMPK and PTEN, upregulation of the pentose phosphate pathway and glutamine transporter genes, and increased acetyl-CoA carboxylase.33 Along these lines, Figure 2 of this review shows a pathway diagram representing core metabolic pathways, with the genes denoting any survival associations at the mRNA level as observed in breast cancer,35 clear cell renal cell carcinoma,33 or across the entire TCGA pan-cancer data set.36 Other pathways would underlie prognostic gene signatures, which might be uncovered, for example, by domain knowledge or by using methods and software such as gene set enrichment analysis.37

Figure 2. Gene expression correlates of cancer patient survival involving metabolic pathways. Gene expression correlates of patient survival can be examined for clues as to the molecular biology underlying the more aggressive cancers. Pathway diagram representing core metabolic pathways,33,34 with corresponding mRNA correlations with patient survival. Red and blue shading represent the association of increased mRNA expression with worse or better survival, respectively, by univariate Cox regression. For each gene, survival correlations across 3 cancer expression profiling data sets are represented: breast cancer data set from Pereira et al.35 (left, n = 1904 patients, overall survival endpoint), renal cell carcinoma data set from TCGA (middle, n = 417 patients, overall survival endpoint), pan-cancer data set from TCGA (right, n = 10,152 patients, overall survival endpoint, P values correcting for cancer type).


Cancer cell lines have historically been the most commonly used models for studying cancer biology. Using in vitro cell line models would be a typical first step in validating functional gene targets or drug responses in the laboratory, where results may be further investigated using more complicated in vivo models. Extensive molecular data (including mRNA, protein, copy number alteration, and somatic mutation), gene knockout data, and drug response data have been generated across more than 1000 human cancer cell lines. These data are available via team science efforts, including the Cancer Cell Line Encyclopedia38,39 and the Genomics of Drug Sensitivity in Cancer40 projects. The Genomics of Drug Sensitivity in Cancer data sets include half maximal inhibitory concentration (IC50) data on more than 400 drugs across cell lines, denoting which cell lines are most or least sensitive to a given drug in vitro. Gene expression data may be integrated with drug IC50 data to define gene correlates of drug response. The Cancer Cell Line Encyclopedia data include corresponding CRISPR and RNAi data,41,42 denoting which cell lines depend on a specific gene for proliferation. These resources may be combined to identify new gene targets with functional roles in a subset of cell lines for follow-up functional studies. For example, the ERBB2 gene has high expression in cell lines most sensitive to either HER2 inhibitors40 or loss of HER2 function. Candidate gene targets involving other drugs and other cell lines may be similarly identified.
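As a sketch of how expression data might be integrated with drug IC50 data, the following Python example computes a rank (Spearman) correlation between one gene's expression and a drug's IC50 values across cell lines. The choice of Spearman correlation is an assumption made here for illustration, as the text does not specify a statistic, and the cell line values are invented.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for five cell lines (not taken from any data set).
erbb2_expr = [9.1, 3.2, 8.4, 2.9, 4.0]   # log2 expression
log_ic50 = [-2.0, 1.5, -1.2, 2.1, 0.9]   # lower IC50 = more sensitive

rho = spearman(erbb2_expr, log_ic50)
```

A strongly negative coefficient, as in this toy example, would indicate that higher expression goes with lower IC50, that is, greater sensitivity to the drug in vitro. Repeating the calculation for every gene would give a ranked list of candidate gene correlates of drug response.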


Cancer cell lines represent models that would capture some but not all aspects of cancer cells within patient tumors. Breast cancer perhaps provides the best-known examples of therapeutically predictive markers, namely, estrogen receptor and HER2 (ERBB2), with high expression predicting patient response to therapies targeting these receptor pathways. Gene expression profiling data sets of human tumors, combined with treatment data, including patient response, could yield signatures of therapeutic response involving up to hundreds of genes. Patient treatment response data may include short-term as well as long-term responses. With long-term response data, there is a need to distinguish gene markers that would be therapeutically predictive versus those that are merely prognostic. In identifying markers of treatment response, numerous studies have carried out gene expression profiling of pretreatment breast tumor biopsies from patients treated with neoadjuvant chemotherapy, with patient response recorded at the end of treatment.43–49 Many of the gene expression markers from these studies are associated with basal-like breast cancer, as this subtype tends to be more responsive to chemotherapy.50 For Figure 3 of this review, we assembled a compendium of 8 different public breast cancer expression data sets. We used this to define a top set of genes correlated with pathologic chemotherapy response, independent of molecular subtype (Fig. 3A). By enrichment analysis,53 these genes represent functional gene categories of interest to cancer biology (Fig. 3B). In addition, one can combine expression data from human tumors with expression data from cell lines having drug response data to identify treatment response markers that arise in both settings.46

Figure 3. Gene expression correlates of therapeutic response to chemotherapy in breast cancer patients. A, Numerous studies have carried out gene expression profiling of pretreatment breast tumor biopsies from patients treated with neoadjuvant chemotherapy, with patient response recorded at the end of treatment.43–49 As part of this review, we assembled a compendium of 8 separate data sets from the above studies, representing 1240 tumor expression profiles (Gene Expression Omnibus [GEO] accession numbers provided in Data File S1). All data sets were generated using the same Affymetrix gene array platform. In the same manner as carried out in our previous studies,5,15,51 we transformed log2 gene expression values to SDs from the median within each data set, removing batch effect differences among data sets. We assessed the correlation of expression with pathologic chemotherapy response (path CR) for each gene feature after correcting for PAM50 subtype51 by linear modeling. The heat map shows expression patterns for a top set of 295 gene features (P < 0.001, out of 22,269 total). B, Selected significantly enriched gene ontology terms52 within the genes higher in breast tumors from patients with path CR (from part A). Enrichment P values and numbers of genes in the path CR-associated gene set are indicated for each gene ontology term. Enrichment P values by one-sided Fisher exact test.
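The one-sided Fisher exact test named in the Figure 3 legend is equivalent to the upper tail of a hypergeometric distribution. The Python sketch below shows that calculation under stated assumptions; the function name, argument names, and the example counts are hypothetical and are not taken from the analysis behind the figure.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided (over-representation) Fisher exact P value.

    N: number of genes in the background
    K: number of background genes annotated with the term
    n: number of genes in the gene set of interest
    k: number of genes in the set that are annotated with the term

    Returns the probability of observing k or more annotated genes
    when n genes are drawn at random from the background.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 20 background genes, 5 carry the annotation,
# and 3 of the 5 genes in the set of interest carry it.
p = fisher_enrichment_p(k=3, n=5, K=5, N=20)
```

Exact integer arithmetic via `math.comb` avoids the rounding problems that factorial-based floating point formulas can run into with backgrounds of tens of thousands of genes.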


Expression profiling data can be integrated with DNA-level somatic mutation data to examine the functional consequences of specific mutations. For example, gene copy alterations in cancer directly and widely impact gene expression, as these alterations represent a dosage effect in how much a gene can be transcribed.54 Molecular pathways in cancer involve multiple genes and pathway intermediates. For a given pathway, somatic mutation—including point mutations, insertions-deletions, and copy number alterations—may impact different genes in different tumors.55 The gene expression level often reflects the downstream consequences of mutation, where the diverse set of alterations at the pathway signaling level would converge upon the same set of transcriptionally regulated genes.56–59 Cell line models can identify the top set of genes altered in expression when a specific pathway is experimentally perturbed. These genes can then define pathway signatures by which tumors or cell lines with expression data may be scored, with higher signature scoring indicative of higher pathway activity.60 Gene signatures of pathways can also help discover unexpected connections involving genes previously unrealized or underappreciated as members of the given pathway. We demonstrated this approach in our multi-omics survey of the PI3K/AKT/mTOR pathway across TCGA cancers, whereby IDH1 and VHL mutations, previously underappreciated as impacting the pathway, were strongly associated with increased pathway activation.58
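Scoring a sample for a pathway signature can be sketched as follows. This is one simple scheme among several in use, assumed here for illustration: the score is the mean normalized expression of the genes induced by the pathway minus the mean of the genes repressed by it. The gene names and values are placeholders, not a published signature.

```python
def signature_score(profile, up_genes, down_genes=()):
    """Score one sample for a pathway signature.

    profile: dict mapping gene name to normalized expression
             (e.g., SDs from the median across samples)
    up_genes: genes higher when the pathway is active
    down_genes: genes lower when the pathway is active

    Genes missing from the profile are skipped.
    """
    up = [profile[g] for g in up_genes if g in profile]
    down = [profile[g] for g in down_genes if g in profile]
    if not up and not down:
        raise ValueError("no signature genes found in profile")
    score = 0.0
    if up:
        score += sum(up) / len(up)
    if down:
        score -= sum(down) / len(down)
    return score


# Placeholder signature and two hypothetical tumor profiles.
up_genes = ["UP1", "UP2"]
down_genes = ["DOWN1"]
tumor_high = {"UP1": 2.0, "UP2": 1.0, "DOWN1": -1.0}
tumor_low = {"UP1": -1.0, "UP2": 0.0, "DOWN1": 1.0}

score_high = signature_score(tumor_high, up_genes, down_genes)
score_low = signature_score(tumor_low, up_genes, down_genes)
```

Under this scheme, the first tumor scores higher than the second, which would be read as higher inferred pathway activity. Scores of this kind can then be compared between tumors with and without a given somatic alteration.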

The impact of somatic alterations on gene expression is not limited to the gene coding regions. The noncoding genome provides the regulatory framework of the coding genome, and noncoding somatic alterations often impact the expression of nearby genes. One well-known example of this involves TERT, where specific point mutations or structural rearrangement breakpoints that occur directly upstream of TERT can result in upregulation of the gene.61–63 Recently, the Pan-Cancer Analysis of Whole Genomes consortium comprehensively surveyed the noncoding somatic landscape of 2658 tumors from TCGA and the International Cancer Genome Consortium, with 1220 of these tumors having RNA-seq data.64–66 Few genes with “hotspot” noncoding mutations (i.e., noncoding mutations at a specific coordinate that recurrently occur across many tumors) were found, which included TERT.66 On the other hand, somatic structural variation showed a widespread impact on the transcription of hundreds of genes, where structural variant breakpoints may fall at different coordinates in relation to the gene but can alter regulation by various mechanisms, including enhancer hijacking and disruption of topologically associating domains (TADs).65 In addition, noncoding point mutations that fall within a wider genomic region, as opposed to recurrent hotspot mutations targeting a specific nucleotide, can similarly impact the expression of certain genes.67


To date, the vast majority of tumors with expression data in the public domain or available through large-scale efforts such as TCGA are primary tumors. Metastatic tumors, on the other hand, represent a more advanced cancer that has left its primary site to grow elsewhere in the body. By some estimates, as many as 90% of cancer deaths result from metastasis.68 There is a need to understand better the genes and processes involved in metastasis. Public repositories such as the Gene Expression Omnibus69 provide expression profiling data on tumor metastases from individual published studies. These include data allowing for paired metastasis versus primary comparisons within the same patient,70,71 to help assess the changes associated with metastatic cancer cells. Pan-cancer multi-omics initiatives to profile tumor metastases from multiple cancer types include the recent MET500 (ref. 72) and POG570 (ref. 73) studies of 500 and 570 patients, respectively. The POG570 data sets include patient treatment information. As patients with advanced and metastatic tumors have typically been heavily treated by this stage, these data offer the opportunity to assess gene expression features associated with specific therapies.73,74


More and more gene expression profiling data on cancers will continue to go into the public domain. Expression profiling data from different studies representing different cellular contexts may be reanalyzed, with the individual result sets brought together in interesting ways to gain insights into cancer biology and therapeutic approaches. Data from cancer cell lines or from patient-derived xenograft (PDX) models75 could be integrated with data from human tumors, for example, to identify gene targets for follow-up bench experiments. Bulk tumor expression profiles represent a mixture of cancer and noncancer cells. By profiling individual cells within the tumor, the scRNA-seq platform provides insights into the tumor cell populations and how these may change over time or with treatment. At the same time, scRNA-seq studies often do not involve many samples or patients, whereas large numbers may be needed to establish robust associations. With all the available expression data, more sophisticated data portals could make the results available and accessible to noncomputational researchers, for example, making data for gene-level results available by a point-and-click user interface.76,77

Data File S1. Results represented in the figures. Results include pan-cancer molecular subtypes and associated genes (Fig. 1), mRNA-level correlates of patient survival involving metabolism-related genes in breast cancer, clear cell renal cell carcinoma, and pan-cancer data sets (Fig. 2), and gene expression correlates of pathologic chemotherapy response (pCR) in breast tumors (correcting for PAM50 subtype; Fig. 3), along with Gene Expression Omnibus data sets and sample profiles analyzed.


The author thanks Yiqun Zhang for technical assistance regarding the gene expression correlation with therapeutic response to chemotherapy in breast cancer patients.


1. Lockhart DJ, Dong H, Byrne MC, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol. 1996;14:1675–1680.
2. Schena M, Shalon D, Davis RW, et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270:467–470.
3. Fullwood MJ, Wei CL, Liu ET, et al. Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Res. 2009;19:521–532.
4. Zhang B, Wang J, Wang X, et al. Proteogenomic characterization of human colon and rectal cancer. Nature. 2014;513:382–387.
5. Zhang Y, Chen F, Chandrashekar DS, et al. Proteogenomic characterization of 2002 human cancers reveals pan-cancer molecular subtypes and associated pathways. Nat Commun. 2022;13:2669.
6. Creighton CJ, Huang S. Reverse phase protein arrays in signaling pathways: a data integration perspective. Drug Des Devel Ther. 2015;9:3519–3527.
7. Macklin A, Khan S, Kislinger T. Recent advances in mass spectrometry based clinical proteomics: applications to cancer research. Clin Proteomics. 2020;17:17.
8. Mani DR, Krug K, Zhang B, et al. Cancer proteogenomics: current impact and future prospects. Nat Rev Cancer. 2022;22:298–313.
9. Hoadley KA, Yau C, Hinoue T, et al. Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell. 2018;173:291–304.e6.
10. Ding L, Bailey MH, Porta-Pardo E, et al. Perspective on oncogenic processes at the end of the beginning of cancer genomics. Cell. 2018;173:305–320.e10.
11. Hudson TJ, Anderson W, Artez A, et al; International Cancer Genome Consortium. International network of cancer genome projects. Nature. 2010;464:993–998.
12. Bykov Y, Kim SH, Zamarin D. Preparation of single cells from tumors for single-cell RNA sequencing. Methods Enzymol. 2020;632:295–308.
13. Perou CM, Sørlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747–752.
14. Sørlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98:10869–10874.
15. Chen F, Zhang Y, Gibbons DL, et al. Pan-cancer molecular classes transcending tumor lineage across 32 cancer types, multiple data platforms, and over 10,000 cases. Clin Cancer Res. 2018;24:2182–2193.
16. Martínez E, Yoshihara K, Kim H, et al. Comparison of gene expression patterns across 12 tumor types identifies a cancer supercluster characterized by TP53 mutations and cell cycle defects. Oncogene. 2015;34:2732–2740.
17. Hoadley KA, Yau C, Wolf DM, et al. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell. 2014;158:929–944.
18. Akbani R, Ng PK, Werner HM, et al. A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nat Commun. 2014;5:3887.
19. Chen F, Chandrashekar DS, Varambally S, et al. Pan-cancer molecular subtypes revealed by mass-spectrometry-based proteomic characterization of more than 500 human cancers. Nat Commun. 2019;10:5679.
20. van de Vijver MJ, He YD, van't Veer LJ, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002;347:1999–2009.
21. van 't Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415:530–536.
22. Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A. 2001;98:13790–13795.
23. Beer DG, Kardia SLR, Huang CC, et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med. 2002;8:816–824.
24. Glinsky GV, Glinskii AB, Stephenson AJ, et al. Gene expression profiling predicts clinical outcome of prostate cancer. J Clin Invest. 2004;113:913–923.
25. Yu YP, Landsittel D, Jing L, et al. Gene expression alterations in prostate cancer predicting tumor aggression and preceding development of malignancy. J Clin Oncol. 2004;22:2790–2799.
26. Wang Y, Jatkoe T, Zhang Y, et al. Gene expression profiles and molecular markers to predict recurrence of Dukes' B colon cancer. J Clin Oncol. 2004;22:1564–1571.
27. Pomeroy SL, Tamayo P, Gaasenbeek M, et al. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature. 2002;415:436–442.
28. Chiaretti S, Li X, Gentleman R, et al. Gene expression profile of adult T-cell acute lymphocytic leukemia identifies distinct subsets of patients with different response to therapy and survival. Blood. 2004;103:2771–2778.
29. Rosenwald A, Wright G, Chan W, et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med. 2002;346:1937–1947.
30. Paik S, Shak S, Tang G, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004;351:2817–2826.
31. Gray RG, Quirke P, Handley K, et al. Validation study of a quantitative multigene reverse transcriptase–polymerase chain reaction assay for assessment of recurrence risk in patients with stage II colon cancer. J Clin Oncol. 2011;29:4611–4619.
32. Knezevic D, Goddard AD, Natraj N, et al. Analytical validation of the Oncotype DX prostate cancer assay—a clinical RT-PCR assay optimized for prostate needle biopsies. BMC Genomics. 2013;14:690.
33. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature. 2013;499:43–49.
34. Monsivais D, Vasquez YM, Chen F, et al. Mass-spectrometry–based proteomic correlates of grade and stage reveal pathways and kinases associated with aggressive human cancers. Oncogene. 2021;40:2081–2095.
35. Pereira B, Chin SF, Rueda OM, et al. The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes. Nat Commun. 2016;7:11479.
36. Liu J, Lichtenberg T, Hoadley KA, et al. An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics. Cell. 2018;173:400–416.e11.
37. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102:15545–15550.
38. Ghandi M, Huang FW, Jané-Valbuena J, et al. Next-generation characterization of the cancer Cell Line Encyclopedia. Nature. 2019;569:503–508.
39. Barretina J, Caponigro G, Stransky N, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607.
40. Garnett MJ, Edelman EJ, Heidorn SJ, et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature. 2012;483:570–575.
41. Dempster JM, Boyle I, Vazquez F, et al. Chronos: a cell population dynamics model of CRISPR experiments that improves inference of gene fitness effects. Genome Biol. 2021;22:343.
42. Tsherniak A, Vazquez F, Montgomery PG, et al. Defining a cancer dependency map. Cell. 2017;170:564–576.e16.
43. Horak CE, Pusztai L, Xing G, et al. Biomarker analysis of neoadjuvant doxorubicin/cyclophosphamide followed by ixabepilone or paclitaxel in early-stage breast cancer. Clin Cancer Res. 2013;19:1587–1595.
44. Iwamoto T, Bianchini G, Booser D, et al. Gene pathways associated with prognosis and chemotherapy sensitivity in molecular subtypes of breast cancer. J Natl Cancer Inst. 2011;103:264–272.
45. Hatzis C, Pusztai L, Valero V, et al. A genomic predictor of response and survival following taxane-anthracycline chemotherapy for invasive breast cancer. JAMA. 2011;305:1873–1881.
46. Shen K, Qi Y, Song N, et al. Cell line derived multi-gene predictor of pathologic response to neoadjuvant chemotherapy in breast cancer: a validation study on US Oncology 02-103 clinical trial. BMC Med Genomics. 2012;5:51.
47. Korde LA, Lusa L, McShane L, et al. Gene expression pathway analysis to predict response to neoadjuvant docetaxel and capecitabine for breast cancer. Breast Cancer Res Treat. 2010;119:685–699.
48. Prat A, Bianchini G, Thomas M, et al. Research-based PAM50 subtype predictor identifies higher responses and improved survival outcomes in HER2-positive breast cancer in the NOAH study. Clin Cancer Res. 2014;20:511–521.
49. Miyake T, Nakayama T, Naoi Y, et al. GSTP1 expression predicts poor pathological complete response to neoadjuvant chemotherapy in ER-negative breast cancer. Cancer Sci. 2012;103:913–920.
50. Nunnery SE, Mayer IA, Balko JM. Triple-negative breast cancer: breast tumors with an identity crisis. Cancer J. 2021;27:2–7.
51. Creighton CJ. The molecular profile of luminal B breast cancer. Biologics. 2012;6:289–297.
52. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–29.
53. Creighton CJ, Nagaraja AK, Hanash SM, et al. A bioinformatics tool for linking gene expression profiling results with public databases of microRNA target predictions. RNA. 2008;14:2290–2296.
54. Pollack JR, Sørlie T, Perou CM, et al. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci U S A. 2002;99:12963–12968.
55. Sanchez-Vega F, Mina M, Armenia J, et al. Oncogenic signaling pathways in The Cancer Genome Atlas. Cell. 2018;173:321–337.
56. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511:543–550.
57. Cancer Genome Atlas Research Network. Comprehensive molecular portraits of human breast tumours. Nature. 2012;490:61–70.
58. Zhang Y, Kwok-Shing Ng P, Kucherlapati M, et al. A pan-cancer proteogenomic atlas of PI3K/AKT/mTOR pathway alterations. Cancer Cell. 2017;31:820–832.e3.
59. Hanahan D, Weinberg RA. The hallmarks of cancer. Cell. 2000;100:57–70.
60. Bild AH, Yao G, Chang JT, et al. Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature. 2006;439:353–357.
61. Huang FW, Hodis E, Xu MJ, et al. Highly recurrent TERT promoter mutations in human melanoma. Science. 2013;339:957–959.
62. Horn S, Figl A, Rachakonda P, et al. TERT promoter mutations in familial and sporadic melanoma. Science. 2013;339:959–961.
63. Davis CF, Ricketts CJ, Wang M, et al. The somatic genomic landscape of chromophobe renal cell carcinoma. Cancer Cell. 2014;26:319–330.
64. ICGC-TCGA Pan-Cancer Analysis of Whole Genomes Network. Pan-cancer analysis of whole genomes. Nature. 2020;578:82–93.
65. Zhang Y, Chen F, Fonseca N, et al. High-coverage whole-genome analysis of 1220 cancers reveals hundreds of genes deregulated by rearrangement-mediated cis-regulatory alterations. Nat Commun. 2020;11:736.
66. Rheinbay E, Nielsen MM, Abascal F, et al. Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Nature. 2020;578:102–111.
67. Chen F, Zhang Y, Creighton C. Systematic identification of non-coding somatic single nucleotide variants associated with altered transcription and DNA methylation in adult and pediatric cancers. NAR Cancer. 2021;3:zcab001.
68. Chaffer CL, Weinberg RA. A perspective on cancer cell metastasis. Science. 2011;331:1559–1564.
69. Barrett T, Wilhite S, Ledoux P, et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 2013;41:D991–D995.
70. Cosgrove N, Varešlija D, Keelan S, et al. Mapping molecular subtype specific alterations in breast cancer brain metastases identifies clinically relevant vulnerabilities. Nat Commun. 2022;13:514.
71. Siegel M, He X, Hoadley K, et al. Integrated RNA and DNA sequencing reveals early drivers of metastatic breast cancer. J Clin Invest. 2018;128:1371–1383.
72. Robinson DR, Wu YM, Lonigro RJ, et al. Integrative clinical genomics of metastatic cancer. Nature. 2017;548:297–303.
73. Pleasance E, Titmuss E, Williamson L, et al. Pan-cancer analysis of advanced patient tumors reveals interactions between therapy and genomic landscapes. Nat Cancer. 2020;1:452–468.
74. Zhang Y, Chen F, Pleasance E, et al. Rearrangement-mediated cis-regulatory alterations in advanced patient tumors reveal interactions with therapy. Cell Rep. 2021;37:110023.
75. Sun H, Cao S, Mashl RJ, et al. Comprehensive characterization of 536 patient-derived xenograft models prioritizes candidates for targeted treatment. Nat Commun. 2021;12:5086.
76. Chandrashekar DS, Bashel B, Balasubramanya SAH, et al. UALCAN: a portal for facilitating tumor subgroup gene expression and survival analyses. Neoplasia. 2017;19:649–658.
77. Cerami E, Gao J, Dogrusoz U, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2:401–404.

cancer; gene expression profiling; proteomics; RNA-seq; transcriptomics


Copyright © 2023 Wolters Kluwer Health, Inc. All rights reserved.