For COEUR, 22 cases (1%) were excluded because these could not be assigned to one of the 5 major histotypes. The initial arbitration (biomarker-assisted review) reclassified the histotype in 72 cases. The most common revisions were original EC to HGSC and vice versa (N=29 and 8, respectively), original HGSC to LGSC and vice versa (N=13 and 3), and original HGSC to CCC and vice versa (N=7 and 1).
Cases in the 2 largest reclassified groups, EC to HGSC (N=29) and HGSC to EC (N=8), were subjected to targeted sequencing of genes that are known to be recurrently mutated in HGSC and EC 22,23. Three cases were excluded from our analysis because of poor DNA read quality. Of the 34 tumor samples with DNA of sufficient quality, there was an average of 980-fold coverage per amplicon, and 92.3% of the amplicons had a median coverage of at least 50-fold. In 6 cases, no mutations were detected, including 2 HGSC with abnormal TP53 IHC. In a previous study, we were able to detect TP53 mutations in all HGSC that showed abnormal TP53 IHC either by reanalysis or resequencing. Therefore, the absence of TP53 mutations with an abnormal TP53 IHC was not used as evidence against HGSC. In 3 cases of reclassified HGSC with either TP53 mutation or BRCA1 mutation, additional mutations were found that are rare in HGSC, but common in other types (RNF43, MSH2, RPL22). Taken together, in these 12 cases, sequencing did not provide sufficient evidence to change the IHC-supported review. The mutational profile in the remaining 25 cases suggested either an HGSC-like or an EC-like profile. An HGSC-like profile was defined as showing TP53 mutation with or without BRCA1 or BRCA2 mutation in the absence of EC-like mutations. EC-like mutations were defined as one or more mutations of CTNNB1, PIK3CA, ARID1A, KRAS, or PTEN with or without a concurrent TP53 mutation 22,24. Sequencing confirmed reclassification in 20 instances (16 EC to HGSC and 4 HGSC to EC), but refuted the reclassification from EC to HGSC in 5 instances, thus decreasing the number of reclassified diagnoses from original EC to HGSC to N=24 (Fig. 2).
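The profile definitions above can be written as a small rule. The following is a minimal sketch, not the pipeline used in the study: the function name and the "uninformative" label are hypothetical, and treating RNF43, MSH2, or RPL22 as grounds for an uninformative call is an assumption based on the 3 cases described above, which were in practice judged individually.

```python
# Illustrative sketch only; names and the "uninformative" category are assumptions.
EC_LIKE_GENES = {"CTNNB1", "PIK3CA", "ARID1A", "KRAS", "PTEN"}
# Genes reported above as rare in HGSC but common in other types (assumed handling).
AMBIGUOUS_GENES = {"RNF43", "MSH2", "RPL22"}


def mutational_profile(mutated_genes):
    """Return 'EC-like', 'HGSC-like', or 'uninformative' for a list of mutated genes."""
    genes = {g.upper() for g in mutated_genes}
    # One or more EC-like mutations, with or without concurrent TP53 mutation.
    if genes & EC_LIKE_GENES:
        return "EC-like"
    # Mutations that are hard to interpret: no call is made (assumption).
    if genes & AMBIGUOUS_GENES:
        return "uninformative"
    # TP53 mutation, with or without BRCA1/BRCA2, in the absence of EC-like mutations.
    if "TP53" in genes:
        return "HGSC-like"
    # No mutation detected, or nothing that fits either definition.
    return "uninformative"


if __name__ == "__main__":
    print(mutational_profile(["TP53", "BRCA1"]))
    print(mutational_profile(["TP53", "PTEN"]))
    print(mutational_profile([]))
```

Checking EC-like genes first mirrors the definition that an EC-like profile may include a concurrent TP53 mutation, whereas an HGSC-like profile requires the absence of EC-like mutations.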
The disease-specific survival of reclassified EC to HGSC was compared with the reference groups of EC and HGSC. Although there was a significant difference for disease-specific survival between HGSC and reclassified EC to HGSC (hazard ratio=1.81; 95% confidence interval, 1.10–3.25, P=0.017), the difference was larger between EC and reclassified EC to HGSC (hazard ratio=3.57, 95% confidence interval, 1.89–6.34, P=0.0002) (Fig. 3). The frequency of biomarker expression in reclassified EC to HGSC was similar to HGSC, with the exception of PGR, which, at 73%, was higher than expected for HGSC. Sixteen of the 24 (67%) reclassified EC to HGSC were Grade 3, 7 were Grade 2, and 1 was Grade 1. The reclassification decreased the proportion of Grade 3 EC from 29% in the original diagnosis to 20% after reclassification. The 8 cases reclassified from HGSC to EC were either Grade 2 or Grade 3.
After arbitration using a combination of biomarker-assisted review and next-generation sequencing, the histotype was reclassified in 67 (6%) cases (Table 2). Confirmation rates were the highest for CCC (96.9%), followed by MC (96.3%), and HGSC (94.2%), but lower for EC (76.7%) and LGSC (50%); however, there were only N=8 originally diagnosed LGSC in the COEUR cohort. Results for AOVT were highly similar [the histotype was reclassified in 18 cases (3%, 2 excluded)]. Although the confirmation rate for EC was higher at 95.6%, reclassification between HGSC and EC remained the most frequent (12/20, Supplemental Digital Content 2, http://links.lww.com/IJGP/A33).
Development of IHC Algorithms
With the new reclassified histotype as the endpoint, we developed several IHC-based prediction models using different numbers of markers as the input, applying 3 statistical methods. Data from all cohorts were combined. Only cases with complete data for 8 IHC markers (WT1, TP53, CDKN2A, PGR, TFF3, NAPSA, ARID1A, and VIM) were included. Further, cases with missing NAPSA data or duplicates across the cohorts were excluded (Fig. 1). During the study period, available TMAs were stained with NAPSA, a CCC marker previously shown to have a high sensitivity and specificity for CCC 25. Compared with HNF1B, NAPSA had a similar sensitivity for CCC (91.9% vs. 90.8%, respectively), but a better specificity, particularly with respect to MC (97.0% vs. 59.7%). Therefore, we replaced HNF1B with NAPSA.
First, we used a nominal logistic regression model, which required the 8-marker panel described above. A total of 1641 out of 1762 cases were classified correctly (confusion matrix in Figure, Supplemental Digital Content 3, http://links.lww.com/IJGP/A34). The overall accuracy was 93.1%. This logistic regression model represents a further refinement of COSP: COSPv3. Receiver operating characteristic values by the histotype of COSPv3 are shown in Supplemental Digital Content 5 (http://links.lww.com/IJGP/A36). In comparison, a recursive partitioning model with 8 splits classified 1601/1762 (90.9%) cases correctly, similar to a recursive partitioning model with 6 splits that classified 1600/1762 cases correctly (90.8%, Figure, Supplemental Digital Content 3, http://links.lww.com/IJGP/A34). The latter model requires a 6-marker panel consisting of WT1, TP53, NAPSA, CDKN2A, PGR, and TFF3. A recursive partitioning model with only 4 splits classified 1559/1762 cases correctly (88.5%). It requires a 4-marker panel consisting of WT1, TP53, NAPSA/Napsin A, and PGR/PR. To minimize cohort selection bias and prevent model overfitting, we also used split-sample validation, which randomly assigned roughly half of the cases to a new training set and the other half to a new test set, using an alternative statistical approach (the CHAID method). CHAID yielded exactly the same decision tree on the basis of the 4-marker panel consisting of WT1, TP53, NAPSA, and PGR (Fig. 4), with an overall accuracy of 88.5%. The overall accuracy for the new test set was 87.2% (Supplemental Digital Content 6, http://links.lww.com/IJGP/A37). Using this 4-marker panel, Figure 4 shows that IHC has a high specificity or sensitivity in certain scenarios. For example, a combination of WT1 expression with abnormal TP53 staining is highly specific for HGSC, with only 6 cases with other histotype diagnoses showing this profile, as seen in Node 6 in Figure 4. A combination of WT1 expression with wild-type TP53 staining is sensitive for LGSC, with only 1 LGSC showing an IHC profile outside Node 5 in Figure 4. However, LGSC barely represents the majority of the cases in Node 5, and this profile is therefore not very specific.
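The 4-marker decision tree can be illustrated as nested rules. This is a minimal sketch under stated assumptions: only the WT1-positive arm (abnormal TP53 to HGSC, wild-type TP53 to LGSC) is described in the text above; the WT1-negative arm shown here (NAPSA positive to CCC, then PGR positive to EC, otherwise MC) is an assumed reading of Figure 4 based on the roles of NAPSA and PGR described in this article. The function and argument names are hypothetical.

```python
# Illustrative sketch of a 4-marker tree (WT1, TP53, NAPSA, PGR).
# The WT1-negative arm is an assumption; see the lead-in above.
def predict_histotype(wt1_positive, tp53_abnormal, napsa_positive, pgr_positive):
    """Return a predicted histotype from four binary IHC results."""
    if wt1_positive:
        # WT1 positive with abnormal TP53: highly specific for HGSC (Node 6).
        # WT1 positive with wild-type TP53: sensitive, not specific, for LGSC (Node 5).
        return "HGSC" if tp53_abnormal else "LGSC"
    # Assumed WT1-negative arm: NAPSA separates CCC first.
    if napsa_positive:
        return "CCC"
    # Assumed final split: PGR separates EC from MC.
    return "EC" if pgr_positive else "MC"


if __name__ == "__main__":
    print(predict_histotype(True, True, False, False))
    print(predict_histotype(False, False, True, False))
    print(predict_histotype(False, True, False, False))
```

In this sketch TP53 is consulted only in the WT1-positive arm, so a WT1-negative tumor with abnormal TP53 is still routed by NAPSA and PGR; such cases are exactly the discordant profiles that need morphologic review.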
Because of the difference in histotypes in the stage distribution, the pretest likelihood of a certain histotype differs if the stage is known. We performed additional recursive partitioning stratified by Stages I/II against III/IV using a 6-marker input. The result for Stages III/IV is identical with that for all stages. In Stage I/II, NAPSA becomes the first split, followed by WT1 in the NAPSA-negative arm and PR in the NAPSA-positive arm (Figure, Supplemental Digital Content 3, http://links.lww.com/IJGP/A34).
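The overall accuracies reported in this section follow directly from the counts of correctly classified cases out of 1762. The short check below recomputes them; the dictionary labels and the helper name are only illustrative.

```python
# Recompute overall accuracy (percent) from the counts reported in the text.
TOTAL_CASES = 1762

CORRECT_BY_MODEL = {
    "nominal logistic regression, 8 markers (COSPv3)": 1641,
    "recursive partitioning, 8 splits": 1601,
    "recursive partitioning, 6 splits": 1600,
    "recursive partitioning, 4 splits": 1559,
}


def accuracy_percent(n_correct, n_total):
    """Overall accuracy as a percentage rounded to one decimal place."""
    return round(100.0 * n_correct / n_total, 1)


if __name__ == "__main__":
    for model, n_correct in CORRECT_BY_MODEL.items():
        print(f"{model}: {n_correct}/{TOTAL_CASES} = "
              f"{accuracy_percent(n_correct, TOTAL_CASES)}%")
```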
The Biomarker Expression Frequency
The frequency of biomarker expression by the reclassified histotype is shown in Table 3. Because of analytic and cohort selection differences between the COSPv2a-generating cohort and the COEUR and AOVT cohorts, we show the marker expression for each as well as for all cases combined. After adjusting for multiple testing, TP53, PGR, and VIM show significantly different frequencies within HGSC and VIM within EC, comparing the COSPv2a-generating and the COEUR and AOVT cohorts. As HGSC represents the group with the largest case numbers, smaller P-values can be expected and absolute differences are more important. The PGR expression for HGSC was higher by 9% in the COEUR and AOVT cohorts compared with the COSPv2a-generating cohort, and that for TP53 was higher by 5%. For 2 of the 4 markers that were stained on different platforms, we performed a head-to-head platform comparison in a subset of cases. Both platforms performed similarly for PGR and TP53 (Table, Supplemental Digital Content 7, http://links.lww.com/IJGP/A38), suggesting that the difference for PGR and TP53 staining between the COSPv2a-generating and the COEUR and AOVT cohorts is a reflection of preanalytic factors intrinsic to the cohorts rather than analytic factors related to the IHC platform. The most problematic biomarker is VIM, showing a variation of 8% in HGSC to 22% in EC across cohorts, which compares unfavorably with CDKN2A, which shows a 1% and 3% difference in HGSC and EC, respectively.
In previous studies, we identified the most useful IHC markers (COSP, COSPv2) for histologic typing 15,16; however, the requirement for up to 10 IHC markers, some of which used antibodies that were not widely available, precluded their implementation in daily pathology practice. Here, we present statistically validated IHC algorithms for typing of ovarian carcinomas in the form of a hierarchical decision tree, which is more relevant to human decision making. Most pathology laboratories have the minimal 4-marker IHC panel (WT1, TP53, NAPSA, and PGR) available. Alternatively, we also provide a new COSP prediction formula, a refined version 3 (COSPv3), which is now based on 8 IHC markers (WT1, TP53, CDKN2A, NAPSA, PGR, TFF3, ARID1A, VIM) and is also more feasible for cohort reclassification for research purposes.
The overall accuracy of the IHC-based classification ranges from 87% to 93% depending on the number of input markers (4–8, respectively) and the statistical model. A minimal panel of 4 markers already achieves 87% accuracy. A 6-marker panel increased the accuracy to 91% due to deeper splits, including CDKN2A and TFF3 to increase the identification of LGSC and MC, respectively. With an 8-marker panel and the nominal logistic regression model, there is only a slight increase by another 2%. Although the morphology and the expected IHC profile are concordant in the majority of the ovarian carcinomas, there is a subset of 7% to 13% of cases where this is not the case. This discrepancy can be caused by an IHC assay error, a morphology error, or a true aberrant IHC phenotype. IHC assay error can occur due to the use of TMAs, limiting the sensitivity when the antigen is expressed only focally. Although the use of a full section may increase the sensitivity, this may or may not come at a cost of reduced specificity. For example, the sensitivity of NAPSA for CCC in our TMA study was 92% compared with 100% in the study using full sections. However, the specificity for CCC against EC was slightly better on TMAs (92%) compared with that reported from full sections (90%), perhaps due to the larger number of cases in this study 25. The 3 most important IHC markers (WT1, TP53, NAPSA) were recently a part of the Canadian Interlaboratory Immunohistochemistry Quality Control run 42 26. Whereas WT1 and NAPSA performed very well with error rates of <1% and 4%, respectively, TP53 showed an error rate of 9%, indicating the need for further optimization across pathology laboratories.
Discrepancy can also result from errors in morphologic assessment, which is subjective. In the first part of the study, we reclassified the histotype using IHC information and then we compared the IHC-reclassified histotype with the IHC profile. There is a danger of circular reasoning. To address this, we subjected the largest reclassified group (EC to HGSC and vice versa) to targeted sequencing. However, in 32% of the cases, sequencing was not used in the final typing for reasons including poor DNA quality, no mutation detected, or mutation detected but difficult to interpret. This shows the limitations of current targeted sequencing panels for classification purposes if a significant subset remains uninformative. Although the mutational profile supported the reclassification in 80% of the informative cases, it refuted it in 20%, which suggests additional value of the mutational status in cases with a discordant morphology and IHC profile. As additional support for the validity of reclassifying a large group of EC to HGSC, we noticed that the outcome of reclassified EC to HGSC is very different from the reference EC. Although their outcome is worse than the reference EC, the reclassified cases also do not overlap with the reference HGSC. One explanation is that these represent HGSC with a higher PGR expression, which occur in a subset of HGSC associated with a favorable outcome 27, and this subset could be particularly prone to misclassification as EC. The overall confirmation rate of 92% for COEUR and 96% for AOVT compares favorably with prior reports 16,17. The difference is that the current cases were already contemporarily reviewed with the full slide set available for AOVT and 1 representative slide for COEUR. However, results show that even in reviewed series, IHC algorithms can refine the histotype in 4% to 8% of the cases and that the most challenging scenario remains HGSC against EC, particularly involving Grade 2/3 cases.
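The 32%, 80%, and 20% figures quoted above can be traced back to the case counts given in the Results. The following worked check uses only those counts; the variable names are illustrative.

```python
# Worked arithmetic for the sequencing yield, using counts from the Results.
sequenced = 29 + 8                  # EC to HGSC plus HGSC to EC cases
poor_dna_quality = 3
no_mutation_detected = 6
difficult_to_interpret = 3

uninformative = poor_dna_quality + no_mutation_detected + difficult_to_interpret
informative = sequenced - uninformative

confirmed = 16 + 4                  # 16 EC to HGSC and 4 HGSC to EC
refuted = 5                         # EC to HGSC not supported by sequencing

pct_uninformative = round(100 * uninformative / sequenced)
pct_confirmed = round(100 * confirmed / informative)
pct_refuted = round(100 * refuted / informative)

# Reclassified totals after sequencing.
ec_to_hgsc_final = 29 - refuted
coeur_reclassified_final = 72 - refuted

if __name__ == "__main__":
    print(f"uninformative: {uninformative}/{sequenced} = {pct_uninformative}%")
    print(f"confirmed: {confirmed}/{informative} = {pct_confirmed}%")
    print(f"refuted: {refuted}/{informative} = {pct_refuted}%")
    print(f"EC to HGSC after sequencing: {ec_to_hgsc_final}")
    print(f"COEUR reclassified after sequencing: {coeur_reclassified_final}")
```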
There are several considerations for practical purposes. The presence of WT1 and abnormal TP53 expression was detected in 906 out of 987 (91.7%) HGSC. Only 8 out of 912 cases (1%) showing this combination were histotypes other than HGSC, including 5 EC. A diagnosis of a carcinoma other than HGSC in the presence of WT1 positivity and abnormal TP53 expression seems unwise and has to stand on sound morphologic grounds (low-grade nuclear atypia in glandular or villoglandular architecture with squamous or mucinous differentiation) or be supported by other molecular evidence such as EC-like mutations. The WT1 negativity rate in HGSC (3%) has decreased remarkably from 20% before 2008 1, mostly due to the increased sensitivity of IHC. Although almost all CCC and MC remain WT1 negative, WT1 expression in EC has also increased slightly from 4% 1 to 10% currently, which is similar to another study 28. Notably, WT1-positive EC are TP53 wild type and should therefore not be confused with HGSC. Only 5% of the HGSC are TP53 wild type by IHC. However, we have previously shown in a different series that 5% of the HGSC with the TP53 wild-type IHC pattern still contain a TP53 mutation, because not all TP53 mutations alter the expression of the protein 29. These tumors are particularly challenging with respect to the differential diagnosis of LGSC, which share the same WT1-positive/TP53 wild-type IHC profile. A diagnosis of LGSC should be rethought if the tumor is WT1 negative or TP53 abnormal. Ninety-two percent of the CCC express NAPSA. Expression can be focal and there may be a small subset of NAPSA-negative CCC. The diagnosis of a NAPSA-negative CCC seems acceptable if the morphology is typical, and other evidence for HGSC (WT1 and TP53) or EC (PR) is lacking. MC usually do not cause problems on morphologic grounds when abundantly sampled and the associated precursor is present. However, there can be architectural and cytologic overlap with EC. PR (similarly, also ER, personal observations, M. Köbel, 2016) and TFF3, if available, discriminate most cases. Notably, MC show abnormal TP53 expression in 50% of the cases and higher-grade, “mucin-poor” MC may be confused with HGSC; however, WT1 is not expressed in MC, and ER is expressed only rarely (personal observations). In summary, cases with unusual IHC profiles are interesting groups to study for classification purposes, and perhaps additional molecular markers will help in these situations.
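The practical considerations above can be summarized as consistency checks between a morphologic diagnosis and its IHC profile. The following is a minimal sketch, not a validated tool: the function name, argument names, and message wording are hypothetical, and the rules are limited to those stated in the preceding paragraph.

```python
# Illustrative consistency checks between a diagnosis and its IHC profile.
# All names and messages are hypothetical; rules paraphrase the text above.
def review_flags(diagnosis, wt1_positive, tp53_abnormal, napsa_positive=None):
    """Return a list of reasons why a diagnosis may deserve a second look."""
    flags = []
    if diagnosis != "HGSC" and wt1_positive and tp53_abnormal:
        flags.append("WT1-positive/TP53-abnormal profile: a non-HGSC diagnosis "
                     "needs sound morphologic or molecular support")
    if diagnosis == "LGSC" and (not wt1_positive or tp53_abnormal):
        flags.append("LGSC with WT1-negative or TP53-abnormal staining: "
                     "reconsider the diagnosis")
    if diagnosis == "CCC" and napsa_positive is False:
        flags.append("NAPSA-negative CCC: acceptable only with typical "
                     "morphology and no evidence for HGSC or EC")
    if diagnosis == "MC" and wt1_positive:
        flags.append("WT1 expression is not expected in MC")
    return flags


if __name__ == "__main__":
    for flag in review_flags("EC", wt1_positive=True, tp53_abnormal=True):
        print(flag)
```

An empty list means only that none of these specific rules was triggered; it does not replace morphologic assessment.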
IHC is a robust adjunct tool for the subclassification of ovarian carcinomas. We observed a relatively narrow expression range across cohorts with some exceptions. These can be attributed to preanalytical issues with a variable tissue quality because the range in tissue age was 32 yr (oldest specimen from 1978) and/or to postanalytical factors (intraobserver reproducibility for VIM). Analytical factors do not play a major role in the differences observed.
The presented IHC algorithms may be of use to practicing pathologists and researchers. An error rate of approximately 10% does not allow the use of IHC as a stand-alone, but supports its continued use as an adjunct in daily practice. In conjunction with morphology, the IHC algorithm developed has the power to improve interobserver reproducibility of histotype diagnosis. It can also be used to reclassify retrospective cohorts. For example, studies on CCC may be required to show that they are WT1 negative and NAPSA positive. This tool could also be considered to help select patients for histotype-specific clinical trials.
The authors thank Jennifer M Koziak for AOVT study management and Mie Konno, Michelle Darago, Faye Chambers, and staff at the Tom Baker Cancer Centre Translational Laboratories for AOVT study tumor block retrieval and TMA construction. They thank Shuhong Liu and Christine Chow for immunohistochemical stains and Taryn Rutherford for study coordination. They also thank the anonymous reviewer for their constructive comments.
1. Köbel M, Kalloger SE, Boyd N, et al. Ovarian carcinoma subtypes are different diseases: implications for biomarker studies. PLoS Med 2008;5:e232.
2. Köbel M, Kalloger SE, Huntsman DG, et al. Differences in tumor type in low-stage versus high-stage ovarian carcinomas. Int J Gynecol Pathol 2010;29:203–11.
3. Katsumata N, Yasuda M, Isonishi S, et al. Long-term results of dose-dense paclitaxel and carboplatin versus conventional paclitaxel and carboplatin for treatment of advanced epithelial ovarian, fallopian tube, or primary peritoneal cancer (JGOG 3016): a randomised, controlled, open-label trial. Lancet Oncol 2013;14:1020–6.
4. Kelemen LE, Köbel M. Mucinous carcinomas of the ovary and colorectum: different organ, same dilemma. Lancet Oncol 2011;12:1071–80.
5. Anglesio MS, Carey MS, Köbel M, et al. Clear cell carcinoma of the ovary: a report from the first Ovarian Clear Cell Symposium, June 24th, 2010. Gynecol Oncol 2011;121:407–15.
6. Diaz-Padilla I, Malpica AL, Minig L, et al. Ovarian low-grade serous carcinoma: a comprehensive update. Gynecol Oncol 2012;126:279–85.
7. Köbel M, Kalloger SE, Baker PM, et al. Diagnosis of ovarian carcinoma cell type is highly reproducible: a transCanadian study. Am J Surg Pathol 2010;34:984–93.
8. Köbel M, Bak J, Bertelsen BI, et al. Ovarian carcinoma histotype determination is highly reproducible, and is improved through the use of immunohistochemistry. Histopathology 2013;64:1004–13.
9. Soslow RA. Histologic subtypes of ovarian carcinoma: an overview. Int J Gynecol Pathol 2008;27:161–74.
10. McCluggage WG. My approach to and thoughts on the typing of ovarian carcinomas. J Clin Pathol 2008;61:152–63.
11. Soslow RA. DNA repair mutations and outcomes in ovarian cancer—letter. Clin Cancer Res 2015;21:658.
12. Köbel M, Kalloger SE, Carrick J, et al. A limited panel of immunomarkers can reliably distinguish between clear cell and high-grade serous carcinoma of the ovary. Am J Surg Pathol 2009;33:14–21.
13. Madore J, Ren F, Filali-Mouhim A, et al. Characterization of the molecular differences between ovarian endometrioid carcinoma and ovarian serous carcinoma. J Pathol 2010;220:392–400.
14. Altman AD, Nelson GS, Ghatage P, et al. The diagnostic utility of TP53 and CDKN2A to distinguish ovarian high-grade serous carcinoma from low-grade serous ovarian tumors. Mod Pathol 2013;26:1255–63.
15. Kalloger SE, Köbel M, Leung S, et al. Calculator for ovarian carcinoma subtype prediction. Mod Pathol 2011;24:512–21.
16. Köbel M, Kalloger SE, Lee S, et al. Biomarker-based ovarian carcinoma typing: a histologic investigation in the ovarian tumor tissue analysis consortium. Cancer Epidemiol Biomarkers Prev 2013;22:1677–86.
17. Kommoss S, Gilks CB, Kommoss F, et al. Accelerating type-specific ovarian carcinoma research: Calculator for Ovarian Subtype Prediction (COSP) is a reliable high-throughput tool for case review. Histopathology 2013;63:704–12.
18. Hoang LN, Zachara S, Soma A, et al. Diagnosis of ovarian carcinoma histotype based on limited sampling: a prospective study comparing cytology, frozen section, and core biopsies to full pathologic examination. Int J Gynecol Pathol 2015;34:517–27.
19. Le Page C, Köbel M, de Ladurantaye M, et al. Specimen quality evaluation in Canadian biobanks participating in the COEUR repository. Biopreserv Biobank 2013;11:83–93.
20. Kelemen LE, Warren GB, Koziak JM, et al. Smoking may modify the association between neoadjuvant chemotherapy and survival from ovarian cancer
21. Mackenzie R, Talhouk A, Eshragh S, et al. Morphologic and molecular characteristics of mixed epithelial ovarian cancers. Am J Surg Pathol 2015;39:1548–57.
22. Bell D, Berchuck A, Birrer M, et al. Integrated genomic analyses of ovarian carcinoma. Nature 2011;474:609–15.
23. Kandoth C, Schultz N, Cherniack AD, et al. Integrated genomic characterization of endometrial carcinoma. Nature 2013;497:67–73.
24. McConechy MK, Ding J, Senz J, et al. Ovarian and endometrial endometrioid carcinomas have distinct CTNNB1 and PTEN mutation profiles. Mod Pathol 2013;27:128–34.
25. Kandalaft PL, Gown AM, Isacson C. The lung-restricted marker napsin A is highly expressed in clear cell carcinomas of the ovary. Am J Clin Pathol 2014;142:830–6.
26. Lee S, Piskorz A, Le Page C, et al. Calibration and optimization of p53, WT1, and Napsin A immunohistochemistry ancillary tests for histotyping of ovarian carcinoma: Canadian Immunohistochemistry Quality Control (CIQC) experience. Int J Gynecol Pathol. Nov 23 2015 [Epub ahead of print].
27. Sieh W, Köbel M, Longacre TA, et al. Hormone-receptor expression and ovarian cancer survival: an Ovarian Tumor Tissue Analysis consortium study. Lancet Oncol 2013;14:853–62.
28. Stewart CJR, Brennan BA, Chan T, et al. WT1 expression in endometrioid ovarian carcinoma with and without associated endometriosis. Pathology 2008;40:592–9.
29. Köbel M, Piskorz A, Li S, et al. Immunohistochemistry predicts presence and type of TP53 mutation in high-grade serous carcinoma. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9 San Diego, CA. Philadelphia (PA): AACR. Cancer Res 2014;74(suppl). Abstract no. 1535.
Ovarian cancer; Histotype; Immunohistochemistry; Next-generation sequencing
© 2016 International Society of Gynecological Pathologists