Cancers constitute a complex disease entity characterized by various genetic aberrations that comprise both somatic (acquired) and germline (inherited) mutations [1,2]. Whereas sporadic cancers have long been recognized to be caused by somatic mutations occurring in the genomes of the cells in which the cancer originated, familial or hereditary cancer syndromes are ultimately caused by inherited mutations. For example, approximately 13% of high-grade serous ovarian cancer is caused by germline mutations in the BRCA1 and BRCA2 genes. However, most sporadic ovarian cancer cases can be attributed to somatic aberrations. The advent of next-generation sequencing (NGS) technologies has greatly facilitated research in cancer genetics and cancer genomics [4,5,6▪]. For example, studies applying whole-exome sequencing (WES) and whole-genome sequencing (WGS) have successfully identified inherited causal mutations for familial cancers such as familial pancreatic cancer, hereditary pheochromocytoma and familial melanoma [7,8▪▪,9]. Similarly, various somatic genetic aberrations, highly mutated genes (genes frequently harboring somatic mutations) and recurrent mutations have also been identified for various sporadic cancers such as uterine serous carcinoma, ovarian carcinoma and breast cancer [10▪▪,11▪▪,12]. Dissecting the somatic mutational profiles of the genomes of different cancers has been a major focus for cancer genetics research over the past decades because identifying the causal mutations is vitally important for the elucidation of the molecular and biological mechanisms underlying the development of cancer and because the genes harboring these lesions are also potential targets for drug development. Cancer-causing somatic mutations are also known as ‘driver mutations’ because they confer a clonal selective advantage upon cancer cells by virtue of their being causally implicated in carcinogenesis, for example, in cell proliferation/growth and apoptosis. 
This category of somatic lesion is to be distinguished from the remaining mutations which have no obvious functional role in cancer development (‘passenger mutations’) and which have arisen simply as a consequence of the greatly increased mutation rate characteristic of many tumors [1,2].
In addition to the research discoveries, applications of NGS in a diagnostic context have also become increasingly evident in cancer. NGS-based methods have been shown to be promising diagnostic tools by identifying the germline mutations underlying familial cancer syndromes such as Lynch syndrome and familial breast and ovarian cancer [13▪▪,14,15]. This was achieved by targeted sequencing of panels of genes implicated in the respective cancers. In addition, WGS has also been employed to identify complex chromosomal rearrangements of use in cancer diagnosis [16▪▪]. The rapid advances in NGS technology in terms of higher throughput and lower cost, together with the development of multiple genomic sequence enrichment methods, have contributed significantly to both the research and clinical applications of cancer genome sequencing. In addition to the high-throughput NGS technologies, the development of bench-top NGS instruments (with lower throughputs) has further enhanced the accessibility of NGS in a clinical setting. These bench-top NGS instruments are particularly appropriate when only the sequencing of a panel of genes is required (rather than a whole exome or genome) for a small number of patients, a commonplace scenario in the context of clinical diagnostics. The targeted sequencing approach is also less complicated in terms of data analysis and interpretation, and raises fewer ethical concerns in the clinical setting. As such, these technological advances have made different NGS-based approaches feasible, namely targeted sequencing, WES and WGS. In this review, we highlight and discuss recent developments in the research and clinical applications of cancer genome sequencing.
WGS and WES have been commonly applied to the study of the patterns of somatic mutation in a range of different cancers. Collectively, these studies have generated new insights into the mutational landscapes of various cancers and have resulted in the identification of a large number of recurring mutations (identical mutations detected in multiple samples) as well as many highly mutated genes (genes harboring mutations in a considerable proportion of the samples). In contrast to WGS, only 1–2% of the entire genome is sequenced in WES; thus, the latter approach is less analytically challenging and cheaper per sample. As a result, WES has been applied to larger sample sizes of various cancers in an attempt to identify recurrent mutations and highly mutated genes. For example, in the context of cancers in gynecology and obstetrics, WES performed on 10 uterine serous carcinomas and their matched normal blood or tissue samples succeeded in identifying frequent somatic mutations in TP53, PIK3CA, FBXW7 and PPP2R1A [10▪▪]. More specifically, somatic mutations detected by WES were further validated by Sanger sequencing, and the most frequent mutations were tested in 66 additional uterine serous carcinomas. In addition to the WES data, 23 uterine serous carcinomas (including the 10 samples that were sequenced by WES) were also subjected to copy number analysis using single-nucleotide polymorphism arrays. Frequent amplification of the CCNE1 locus (which encodes cyclin E, a known substrate of FBXW7) and deletion of the FBXW7 locus were observed. Among the 23 uterine serous carcinomas that were subjected to copy number analysis, seven tumors with FBXW7 mutations (four tumors with point mutations, three tumors with hemizygous deletions) did not exhibit CCNE1 amplification, whereas 13 tumors displayed either a molecular genetic alteration in FBXW7 or CCNE1 amplification. 
Similarly, approximately half of the uterine serous carcinomas were found to have genetic alterations in the PIK3CA locus, that is, a PIK3CA mutation and/or PIK3CA amplification. The WES and copy number analyses have demonstrated the importance of the cyclin E-FBXW7 and phosphoinositide-3-kinase (PI3K) pathways (in addition to TP53) in the etiology of uterine serous carcinoma [10▪▪].
A high frequency of TP53 mutations was also observed in high-grade serous ovarian adenocarcinomas through WES of 316 ovarian cancers and matched normal samples; TP53 mutations were detected in almost all the tumors (96%) [11▪▪]. The mutations detected by WES that might be important to ovarian adenocarcinoma development were prioritized by searching for nonsynonymous or splice site mutations present at significantly increased frequencies relative to the background; comparing mutations identified in the WES with those in public databases, that is, the Catalogue of Somatic Mutations in Cancer and Online Mendelian Inheritance in Man; and predicting the likely impact of the mutations on protein function. Through this prioritization scheme, nine genes were identified for which the number of nonsynonymous or splice site mutations was significantly greater than that expected on the basis of mutation distribution models. In addition to the well-established genes, that is, TP53, BRCA1 and BRCA2, the study also identified six other recurrently mutated genes, namely CSMD3, NF1, CDK12, FAT3, GABRA6 and RB1, which appear likely to have a functional role.
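The first prioritization step, asking whether a gene carries more mutations than the background would predict, can be illustrated with a simple one-sided binomial test. This is only a minimal sketch under strong assumptions (a single uniform background rate, a Bonferroni correction and made-up gene names and counts); it is not the mutation distribution model used in the study cited above.

```python
import math

def log_binom_pmf(k, n, p):
    """Natural log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        term = math.exp(log_binom_pmf(i, n, p))
        total += term
        # Past the mean the terms only shrink; stop once they no longer matter.
        if i > n * p and term <= total * 1e-17:
            break
    return min(total, 1.0)

def significantly_mutated(genes, n_samples, background_rate, alpha=0.05):
    """Return {gene: p_value} for genes passing a Bonferroni-corrected threshold.

    genes maps a gene name to (coding_length_bp, observed_mutation_count).
    background_rate is an assumed per-base, per-sample mutation probability.
    """
    threshold = alpha / len(genes)
    hits = {}
    for name, (length_bp, observed) in genes.items():
        trials = length_bp * n_samples  # bases at risk across all tumors
        p_value = binom_upper_tail(observed, trials, background_rate)
        if p_value < threshold:
            hits[name] = p_value
    return hits

# Hypothetical toy input, not data from the study.
toy_genes = {"GENE_A": (1200, 30), "GENE_B": (5000, 3)}
hits = significantly_mutated(toy_genes, n_samples=316, background_rate=2e-6)
print(sorted(hits))
```

In practice, background rates vary by gene, sequence context and tumor, which is why published analyses rely on more elaborate models than a single rate.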
The concept of an integrative approach for a range of different omics data is not new, but in recent years it has resurfaced and become reinvigorated by technological advances that have made possible the ‘profiling of everything’ (i.e. the characterization of informational macromolecules at the genomic, transcriptomic and epigenomic levels) using microarray and/or NGS technologies. This has been exemplified in the ovarian cancer genome sequencing studies. In addition to the somatic mutational data generated by WES already discussed, the ovarian cancer study also performed microarray analyses to characterize mRNA expression, miRNA expression, DNA copy number and DNA promoter methylation for 489 tumors (of which 316 samples were subjected to WES) [11▪▪]. The availability of various omics datasets is required for integrative analysis, and the advantages of integrating the mutation and gene expression data have also become evident. Although GABRA6 and FAT3 were identified as significantly mutated genes through WES, these genes were not expressed in ovarian adenocarcinomas or fallopian tube tissue, and hence we may infer that they are less likely to be functionally important in the context of tumorigenesis. In addition to point mutations, the copy number analysis identified recurrent focal somatic copy number alterations, that is 63 regions of focal amplification with the most common focal amplifications encompassing CCNE1, MYC and MECOM, each of which was highly amplified in more than 20% of tumors. Fifty focal deletions were also identified and it was intriguing that known tumor suppressor genes such as PTEN, RB1 and NF1 were found within regions of homozygous deletions in at least 2% of the tumors. In addition to identifying various DNA sequence alterations, microarray gene expression analysis revealed four robust expression subtypes for high-grade serous ovarian cancer. 
Further, consideration of the DNA methylation data, specifically the analysis of ‘increased DNA methylation and reduced tumor expression’, identified 168 genes as being epigenetically silenced in high-grade serous ovarian cancer samples as compared with the fallopian tube control. DNA methylation was correlated with reduced gene expression across all samples; for example, the AMT, CCL21 and SPARCL1 genes exhibited promoter hypermethylation in the vast majority of the tumors [11▪▪].
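The ‘increased DNA methylation and reduced tumor expression’ criterion can be sketched as a simple filter that combines the two data types. The thresholds, the input layout and the gene names below are illustrative assumptions only and do not reproduce the criteria used in the cited study.

```python
from statistics import mean

def epigenetically_silenced(tumor, control, min_meth_gain=0.3, min_fold_drop=2.0):
    """List genes with higher methylation and lower expression in tumors.

    tumor maps gene -> list of (methylation_beta, expression), one pair per tumor.
    control maps gene -> (methylation_beta, expression) in the control tissue.
    Both thresholds are hypothetical values chosen for illustration.
    """
    silenced = []
    for gene, samples in tumor.items():
        ctrl_meth, ctrl_expr = control[gene]
        tumor_meth = mean(m for m, _ in samples)
        tumor_expr = mean(e for _, e in samples)
        gained_methylation = tumor_meth - ctrl_meth >= min_meth_gain
        lost_expression = tumor_expr * min_fold_drop <= ctrl_expr
        if gained_methylation and lost_expression:
            silenced.append(gene)
    return sorted(silenced)

# Hypothetical toy input: three genes measured in two tumors and one control.
toy_tumor = {
    "GENE_X": [(0.80, 10.0), (0.90, 12.0)],    # hypermethylated and repressed
    "GENE_Y": [(0.20, 95.0), (0.30, 105.0)],   # essentially unchanged
    "GENE_Z": [(0.85, 90.0), (0.90, 110.0)],   # hypermethylated, still expressed
}
toy_control = {
    "GENE_X": (0.10, 100.0),
    "GENE_Y": (0.20, 100.0),
    "GENE_Z": (0.10, 100.0),
}
print(epigenetically_silenced(toy_tumor, toy_control))
```

Requiring both conditions is what distinguishes candidate silencing events from methylation changes that have no apparent effect on expression.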
In addition to the gynecological cancers, breast cancer has also received significant attention from cancer genome sequencing studies with a view to characterizing its mutational profile. For example, WES analysis of 100 primary breast cancers (79 estrogen receptor (ER) positive and 21 ER negative) identified numerous driver mutations. Somatic driver mutations (single nucleotide substitutions and small indels) were identified in known breast cancer genes as well as in several new cancer genes. Of these latter genes, ARID1B, CASP8, MAP3K1, MAP3K13, NCOR1, SMARCD1 and CDKN1B are potentially recessive cancer susceptibility genes as they harbored truncating mutations and were characterized by biallelic gene inactivation. More interesting is the finding of substantial variation in the total numbers of base substitutions and indels as well as considerable diversity in the mutational patterns between individual cases. This implies that multiple distinct mutational processes underlie the molecular development of breast cancer. In addition, it was also found that most somatic mutations in breast cancer genomes occur after the initiating driver event of neoplastic transformation, a deduction based on the absence of a relationship between the total number of somatic base substitutions and the age at diagnosis. In a similar vein, WES was applied to 103 human breast cancers to identify mutations and translocations across breast cancer subtypes. In addition to confirming known recurrent somatic mutations in PIK3CA, TP53, AKT1, GATA3 and MAP3K1, the study also identified recurrent mutations in the CBFB transcription factor gene and deletions of its partner, RUNX1. 
Furthermore, deep sequencing analysis applying WGS to 22 paired tumor/normal tissues revealed a recurrent MAGI3–AKT3 fusion which was found disproportionately in triple-negative breast cancer lacking estrogen and progesterone receptors and ERBB2 expression, thereby providing new potential therapeutic options. The MAGI3–AKT3 fusion leads to the constitutive activation of AKT kinase, which is abolished by treatment with an ATP-competitive AKT small-molecule inhibitor.
In addition to deciphering the mutational patterns of tumorigenesis, identifying biomarkers for predicting endocrine therapy response represents another important priority for both the research and clinical applications of cancer genome sequencing. The accurate prediction of nonresponders to tamoxifen or aromatase inhibitors among ER-positive patients remains a challenge. Ellis et al. reported the results of a WGS study performed on 46 patients in an attempt to correlate mutational profiles with phenotypes, clinical data and therapeutic responsiveness in breast cancer. Of 77 pretreatment tumor biopsies collected from ER-positive breast cancer patients in two studies of neoadjuvant aromatase inhibitor therapy, 46 and 31 cases were subjected to WGS and WES analysis, respectively. This led to the identification of 18 significantly mutated genes. Mutant MAP3K1 was associated with luminal A status, low-grade histology and low proliferation rates, whereas mutant TP53 was associated with the opposite pattern. Moreover, mutant GATA3 correlated with suppression of proliferation upon aromatase inhibitor treatment. However, larger genome sequencing studies with better statistical power are needed to identify predictors of endocrine therapy response because of the extreme heterogeneity of the ER-positive subtype.
In contrast to the sequencing of cancer genomes to identify somatic mutations, studies describing the sequencing of constitutional DNA to identify germline causal mutations for familial cancers are few in number [6▪]. Although familial cancer susceptibility genes such as BRCA1 and BRCA2 (breast and ovarian cancers), APC (familial adenomatous polyposis), DNA mismatch repair genes (Lynch syndrome) and CDH1 (hereditary diffuse gastric cancer) have been identified through traditional family linkage analysis and positional cloning approaches, these genes do not account for all the familial cancer cases. This suggests that as yet unidentified genes are likely to be responsible for the cases that remain unexplained by the known cancer genes [6▪]. Cancer genome sequencing provides new approaches and opportunities to identify genes underlying familial cancer syndromes. This is well exemplified by familial melanoma, for which only two responsible genes had previously been found: mutations in CDKN2A account for approximately 40% of familial cases, whereas mutations in CDK4 have been reported in a very small number of melanoma kindreds. However, WGS of probands from several melanoma families has identified a new gene, MITF, which harbors an intermediate germline risk variant, c.G1075A (p.E318K). The association between this risk variant and melanoma was further confirmed by additional case–control studies. Similarly, the successful research application of WES has also been demonstrated in the case of hereditary pheochromocytoma, a rare neural crest cell tumor in which germline mutations in MAX have been identified in three unrelated individuals with hereditary pheochromocytoma. The segregation of two MAX gene variants with hereditary pheochromocytoma in families from whom DNA from affected relatives was available further supports their causality [8▪▪].
The application of NGS to cancer diagnostics has proceeded apace. WGS has demonstrated its discovery and diagnostic potential in a patient originally characterized by an ambiguous diagnosis, that is, acute myeloid leukemia of unclear subtype [16▪▪]. WGS was performed on the original leukemic bone marrow and on a skin biopsy from the patient, identifying a novel insertional translocation on chromosome 17 that generated a pathogenic PML–RARA gene fusion, thereby confirming a diagnosis of acute promyelocytic leukemia. This molecular diagnosis carried important clinical implications for the treatment and management of the patient. Following the molecular diagnosis, the patient was considered eligible to receive treatment with retinoic acid, which significantly improves the overall prognosis of patients with acute promyelocytic leukemia, and the bone marrow transplantation treatment option was not considered further. The clinical significance was clear as the results of the analysis were used in clinical decision making in relation to the patient's therapy. WGS has also been employed to resolve the genetic basis of a suspected cancer susceptibility syndrome based upon the early onset of several primary tumors. WGS was performed on leukemic and skin cells derived from the patient and succeeded in identifying a novel heterozygous deletion of three exons in the TP53 gene, whereas the intact copy of TP53 had been lost in the leukemic cells due to uniparental disomy. This demonstrated the utility of WGS in a case with unexpected ‘genetic heterogeneity’. Although this did not affect subsequent clinical decision making, revealing the underlying genetic defect had important implications for the subsequent screening of family members. 
In addition, a ‘comprehensive genomic approach’ comprising WGS, WES and transcriptome sequencing has also been shown to be feasible for clinical use from the technical and cost perspectives, as well as in terms of data analysis and interpretation [23▪]. Figure 1 summarizes the workflow of an integrative approach using genomics, transcriptomics and epigenomics data generated from clinical samples in personalized patient management and treatment.
NGS has also been assessed for its applicability as a diagnostic tool to detect known germline mutations for hereditary cancers through a targeted sequencing approach. By leveraging the technological advances in genomic sequence enrichment and NGS, a targeted sequencing test was developed to capture and sequence 21 genes responsible for an inherited risk of breast and ovarian cancers. This NGS-based test was evaluated in 20 women diagnosed with breast or ovarian cancer who harbored a known mutation in one of the genes responsible for an inherited predisposition to these cancers. This study generated promising results, showing that all the known point mutations, small indel mutations (ranging from 1 to 19 bp) and large genomic duplications and deletions (ranging from 160 to 101 013 bp) could be detected in all the samples. The large deletions and duplications were detected using a read-depth strategy and were in complete agreement with the multiplex ligation-dependent probe amplification assay. More recently, similar success was replicated in hereditary colorectal cancer syndromes through the development of ColoSeq, which was designed to detect all known pathogenic mutations in the seven genes implicated in these cancer syndromes [13▪▪]. ColoSeq was tested on 26 samples (23 cancer patients and three colon cancer cell lines) carrying known germline mutations, and successfully identified all the mutations. These pathogenic lesions included nonsense, missense, frameshift, in-frame deletion and splice site mutations, as well as large deletions and duplications. Although these studies demonstrated clinical diagnostic feasibility using a high-throughput NGS platform, the production of several hundred gigabases of DNA sequence data in a single run might render such a platform less suitable for sequencing a panel of genes from a much smaller number of samples. 
It is commonplace for a single patient or a small number of samples to be encountered in a clinical diagnostic context, thereby rendering the barcoding or multiplexing of a large number of patient samples impractical on grounds of cost. Thus, the development of several bench-top NGS instruments has more than adequately filled a niche in the NGS market. Table 1 summarizes and compares the technological features of high-throughput and bench-top NGS platforms.
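The read-depth strategy mentioned above for detecting large deletions and duplications from targeted sequencing data can be sketched as follows. This is a simplified illustration and not the published method: it assumes per-exon read depths are already available, that the reference samples are copy-neutral, and that ratio cut-offs of 0.7 and 1.3 (hypothetical values) separate heterozygous losses and gains from normal dosage.

```python
from statistics import median

def call_copy_number(sample_depth, reference_depths, loss_ratio=0.7, gain_ratio=1.3):
    """Classify each exon as 'deletion', 'duplication' or 'normal'.

    sample_depth maps exon -> read depth in the test sample.
    reference_depths is a list of such dicts from control samples that are
    assumed to be copy-neutral. The ratio cut-offs are hypothetical.
    """
    def normalize(depths):
        # Express each exon as a fraction of total depth to correct for
        # differences in the overall sequencing yield between samples.
        total = sum(depths.values())
        return {exon: depth / total for exon, depth in depths.items()}

    sample = normalize(sample_depth)
    references = [normalize(ref) for ref in reference_depths]

    calls = {}
    for exon, fraction in sample.items():
        expected = median(ref[exon] for ref in references)
        ratio = fraction / expected
        if ratio < loss_ratio:
            calls[exon] = "deletion"
        elif ratio > gain_ratio:
            calls[exon] = "duplication"
        else:
            calls[exon] = "normal"
    return calls

# Hypothetical toy input: five exons, three controls sequenced at different depths.
exons = ["exon1", "exon2", "exon3", "exon4", "exon5"]
toy_references = [{e: d for e in exons} for d in (100, 200, 150)]
toy_sample = {"exon1": 100, "exon2": 100, "exon3": 50, "exon4": 100, "exon5": 150}
toy_calls = call_copy_number(toy_sample, toy_references)
print(toy_calls)
```

A heterozygous deletion is expected to halve the normalized depth and a duplication to raise it by about half, which is why cut-offs between these values and 1.0 are used here; real data are noisier and call for more robust normalization.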
The research and clinical applications of NGS in cancer have been amply demonstrated, but various challenges still remain. Although a variety of different types of genetic aberration have been identified in cancer genomes, the next crucial step (and a major challenge) will be to understand their biological and functional impact on genome structure, gene expression and signaling pathway networks. Such an understanding is essential to develop biomarkers and drugs with clinical robustness. The integrative analysis of different omics datasets is expected to be more informative, and hence ought to provide new and more detailed biological insights, than would be possible using individual datasets. However, challenges in analysis and interpretation remain. Although fascinating research into integrative genome analysis, predictive network models and sophisticated algorithmic approaches is underway, much innovative work will be required to correlate clinical data and phenotypes, such as the accurate prediction of therapeutic response, recurrence and survival, with the underlying relationships among mutations, genome dysfunction and deregulated signal transduction pathway networks. Progress in linking genomic discoveries with clinical phenotypes paves the way for the next generation of biomarkers and drugs. At the same time, NGS technology is rapidly changing the landscape of genetic testing and will offer new opportunities for molecular diagnostics through the efficient sequencing of panels of specific disease target genes or through WGS/WES. However, various technical, analytical, data interpretation and cost issues have to be addressed before the adoption of NGS in a clinical setting. 
In addition, serious attention should be paid to obtaining fully informed consent from the patients concerned in relation to the genomic tests and the risk of disclosing findings that might be considered incidental to the initial test (i.e., revealing hitherto unknown genetic causes of hereditary disorders). The research and clinical applications of cancer genome sequencing have progressed at an unprecedented pace over the past several years, and this is likely to accelerate with further developments in high-throughput NGS technologies and robust analytical tools, together with the careful collection of clinical samples and data.
Conflicts of interest
The authors declare no conflicts of interest.
REFERENCES AND RECOMMENDED READING
Papers of particular interest, published within the annual period of review, have been highlighted as:
- ▪ of special interest
- ▪▪ of outstanding interest
Additional references related to this topic can also be found in the Current World Literature section in this issue (pp. 81–82).
1. Stratton MR, Campbell PJ, Futreal PA. The cancer genome. Nature
2. Stratton MR. Exploring the genomes of cancer cells: progress and promise. Science
3. Bast RC Jr, Hennessy B, Mills GB. The biology of ovarian cancer: new opportunities for translation. Nat Rev Cancer
4. Metzker ML. Sequencing technologies: the next generation. Nat Rev Genet
5. Meyerson M, Gabriel S, Getz G. Advances in understanding cancer genomes through second-generation sequencing. Nat Rev Genet
6▪. Ku CS, Cooper DN, Wu M, et al. Gene discovery in familial cancer syndromes by exome sequencing: prospects for the elucidation of familial colorectal cancer type X. Mod Pathol
This review analyzed and discussed the opportunities offered by NGS technologies for studying the molecular genetics of familial or hereditary cancer syndromes of unknown etiology, together with the associated challenges.
7. Jones S, Hruban RH, Kamiyama M, et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science
8▪▪. Comino-Mendez I, Gracia-Aznarez FJ, Schiavi F, et al. Exome sequencing identifies MAX mutations as a cause of hereditary pheochromocytoma. Nat Genet
One of the earliest studies to demonstrate the feasibility of whole-exome sequencing for identifying novel causal mutations and genes for familial cancer.
9. Yokoyama S, Woods SL, Boyle GM, et al. A novel recurrent mutation in MITF predisposes to familial and sporadic melanoma. Nature
10▪▪. Kuhn E, Wu RC, Guan B, et al. Identification of molecular pathway aberrations in uterine serous carcinoma by genome-wide analyses. J Natl Cancer Inst 2012; 104:1503–1513.
This study applied whole-exome sequencing on 10 uterine serous carcinomas and identified frequent somatic mutations in TP53, PIK3CA, FBXW7 and PPP2R1A, further supporting that molecular genetic aberrations involving the p53, cyclin E-FBXW7 and PI3K pathways represent major mechanisms in the development of uterine serous carcinoma.
11▪▪. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 2011; 474:609–615.
This study applied whole-exome sequencing to delineate the somatic mutational profile of high-grade serous ovarian adenocarcinomas. In addition, microarray analyses were also performed to characterize mRNA expression, miRNA expression, DNA copy number and DNA promoter methylation for 489 tumors (of which 316 samples were subjected to whole-exome sequencing).
12. Stephens PJ, Tarpey PS, Davies H, et al. The landscape of cancer genes and mutational processes in breast cancer. Nature
13▪▪. Pritchard CC, Smith C, Salipante SJ, et al. ColoSeq provides comprehensive Lynch and polyposis syndrome mutational analysis using massively parallel sequencing. J Mol Diagn
This study demonstrated the feasibility of applying NGS technologies to identify all the known pathogenic mutations in clinical samples by sequencing a panel of genes causing hereditary cancer syndromes.
14. Walsh T, Lee MK, Casadei S, et al. Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing. Proc Natl Acad Sci U S A
15. Walsh T, Casadei S, Lee MK, et al. Mutations in 12 genes for inherited ovarian, fallopian tube, and peritoneal carcinoma identified by massively parallel sequencing. Proc Natl Acad Sci U S A
16▪▪. Welch JS, Westervelt P, Ding L, et al. Use of whole-genome sequencing to diagnose a cryptic fusion oncogene. JAMA
This study applied whole-genome sequencing to identify the genetic cause of a patient originally characterized by an ambiguous diagnosis, that is, acute myeloid leukemia of unclear subtype.
17. Ku CS, Wu M, Cooper DN, et al. Technological advances in DNA sequence enrichment and sequencing for germline genetic diagnosis. Expert Rev Mol Diagn
18. Ku CS, Cooper DN. Exome sequencing: a transient technology for molecular diagnostics? Expert Rev Mol Diagn
19. Wong KM, Hudson TJ, McPherson JD. Unraveling the genetics of cancer: genome sequencing and beyond. Annu Rev Genomics Hum Genet
20. Banerji S, Cibulskis K, Rangel-Escareno C, et al. Sequence analysis of mutations and translocations across breast cancer subtypes. Nature
21. Ellis MJ, Ding L, Shen D, et al. Whole-genome analysis informs breast cancer response to aromatase inhibition. Nature
22. Link DC, Schuettpelz LG, Shen D, et al. Identification of a novel TP53 cancer susceptibility mutation through whole-genome sequencing of a patient with therapy-related AML. JAMA
23▪. Roychowdhury S, Iyer MK, Robinson DR, et al. Personalized oncology through integrative high-throughput sequencing: a pilot study. Sci Transl Med
This study demonstrated the feasibility of a comprehensive genomic approach in clinical diagnostics.
24. Loman NJ, Misra RV, Dallman TJ, et al. Performance comparison of benchtop high-throughput sequencing platforms. Nat Biotechnol