Non-metastatic breast cancer describes a group of heterogeneous diseases with varying survival outcomes according to their different clinicopathological features. Physicians select different treatments based on these features. Therefore, accurately predicting overall survival (OS) for early-stage breast cancer patients may facilitate clinical decision-making and help physicians escalate or de-escalate treatment plans. Several survival prediction models, such as Adjuvant! Online (www.adjuvantonline.com), CancerMath (www.CancerMath.com), and PREDICT (www.predict.nhs.uk), have been developed and validated. These models were developed based on data from female Western populations. Females from Eastern populations, however, may have different ethnographic features, clinicopathological characteristics, disease profiles, and healthcare environments. Whether the current models are valid in Eastern populations remains unclear. Additionally, China accounts for 12.2% of global breast cancer cases and 9.6% of cancer-related deaths, with more than 1.6 million people diagnosed and 1.2 million people dying from breast cancer every year. Therefore, the development of a prediction model based on a Chinese population is essential.
In this retrospective multicenter study, we collected data from 3 tertiary teaching hospitals in Guangdong Province, China, and developed the Sun Yat-sen (SYS) model, a nomogram for survival prediction. We validated the SYS model and compared it with the CancerMath model.
Subjects and methods
We included eligible non-metastatic breast cancer patients who received breast-conserving surgery or mastectomy at 3 tertiary teaching hospitals (Sun Yat-sen Memorial Hospital [SYSMH], the First People's Hospital of Foshan [FPHF], and Sun Yat-sen University Cancer Center [SYSUCC]) in Guangdong Province, China between January 1, 2009 and December 31, 2011. The eligibility criteria used to screen the patients were as follows:
- Stage I to III breast cancer in accordance with the criteria set by the American Joint Committee on Cancer
- Aged 18 years or above
- Capable of providing responses during follow-up
- Able to understand the nature of the study and willing to give informed consent
The exclusion criteria were as follows:
- Metastatic or de novo stage IV breast cancer
- Phyllodes tumor of the breast
- Unknown estrogen receptor (ER) status, human epidermal growth factor receptor 2 (HER2) status, T stage, or lymph node status
In eligible patients, the following clinicopathological features were collected: tumor T stage (assessed through ultrasound, physical, and pathological examinations), postoperative nodal status, histology type, age, ER status, HER2 status, tumor grade, surgery type, neo-adjuvant chemotherapy, and postoperative chemotherapy. The choice between breast-conserving surgery and mastectomy was made by surgeons based on patient condition and preference, and all specimens were submitted for pathological examination. ER positivity was defined as >0% positive tumor cells with nuclear staining. For HER2 status, we used the HercepTest method. HER2 was considered positive if the immunohistochemistry score was 3+, or if the score was 2+ with HER2 amplification confirmed by fluorescence in situ hybridization (FISH). HercepTest scores of 0 and 1, as well as a score of 2+ without HER2 amplification on FISH, were considered HER2-negative. This was a retrospective study with de-identified data collected from databases; therefore, ethical approval and patient consent were not required based on our institutional policy.
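The receptor definitions above amount to a small decision rule. The following is a minimal Python sketch of that rule, not code from the study; the function names `er_positive` and `her2_positive` and their parameters are hypothetical and chosen only for illustration.

```python
from typing import Optional


def er_positive(percent_stained_nuclei: float) -> bool:
    """ER is positive when any tumor cell nuclei stain (>0%), as defined in the text."""
    return percent_stained_nuclei > 0


def her2_positive(ihc_score: int, fish_amplified: Optional[bool] = None) -> bool:
    """Classify HER2 status from a HercepTest IHC score and an optional FISH result.

    3+ is positive, 0 and 1+ are negative, and 2+ follows the FISH result.
    """
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2, or 3")
    if ihc_score == 3:
        return True
    if ihc_score in (0, 1):
        return False
    # Score 2+ is equivocal: the FISH amplification result decides.
    if fish_amplified is None:
        raise ValueError("an IHC score of 2+ needs a FISH result")
    return fish_amplified


print(her2_positive(3))         # IHC 3+
print(her2_positive(2, True))   # IHC 2+ with FISH amplification
print(her2_positive(2, False))  # IHC 2+ without FISH amplification
```

Raising an error for a 2+ score without a FISH result mirrors the study design, in which patients with unknown HER2 status were excluded instead of being assigned a default.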
Descriptive analyses of baseline clinicopathological features were conducted. Continuous variables were reported using the median and range, and categorical variables were reported as percentages. OS was calculated as the interval from the date of surgery to the date of death or the last follow-up. We used univariate Cox proportional hazards regression analysis to screen for predictors. Significant predictors were included in the multivariate Cox proportional hazards regression model. Significant predictors from the multivariate analysis were included in model development. We used the population from SYSMH as the training cohort, and the populations of FPHF and SYSUCC as the validation cohorts. We used receiver operating characteristic curves (R package: survivalROC) and calibration plots to assess the models’ discriminative ability and accuracy, respectively. Calibration plots were used to measure the agreement between the actual and predicted probabilities. P-values of <.05 were considered statistically significant. Data analyses were performed using Stata version 13.1 (StataCorp, College Station, TX), and the nomogram was developed using R (version 3.2.4; R Foundation for Statistical Computing, Vienna, Austria).
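To make the two evaluation ideas above concrete, the sketch below computes a rank-based AUC (discrimination) and a grouped comparison of predicted versus observed survival (calibration). It is a simplified illustration and not the study's analysis: it treats 5-year status as a fully observed binary outcome and ignores censoring, whereas the study used time-dependent ROC curves from the survivalROC R package. The function names and the six-patient data set are hypothetical.

```python
from typing import List, Sequence, Tuple


def auc(risk: Sequence[float], died: Sequence[bool]) -> float:
    """Probability that a patient who died had a higher predicted risk
    than a patient who survived (ties count as 0.5)."""
    events = [r for r, d in zip(risk, died) if d]
    survivors = [r for r, d in zip(risk, died) if not d]
    if not events or not survivors:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for e in events:
        for s in survivors:
            if e > s:
                score += 1.0
            elif e == s:
                score += 0.5
    return score / (len(events) * len(survivors))


def calibration_groups(pred_survival: Sequence[float],
                       survived: Sequence[bool],
                       n_groups: int = 3) -> List[Tuple[float, float]]:
    """Sort patients by predicted survival, split them into groups, and
    return (mean predicted survival, observed survival fraction) per group."""
    pairs = sorted(zip(pred_survival, survived))
    n = len(pairs)
    table = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        mean_pred = sum(p for p, _ in group) / len(group)
        observed = sum(1 for _, s in group if s) / len(group)
        table.append((mean_pred, observed))
    return table


# Hypothetical predictions and outcomes for six patients.
pred_survival = [0.95, 0.90, 0.80, 0.60, 0.40, 0.20]
survived = [True, True, False, True, False, False]

risk = [1.0 - p for p in pred_survival]
died = [not s for s in survived]
print("AUC:", auc(risk, died))
for mean_pred, observed in calibration_groups(pred_survival, survived):
    print(f"predicted {mean_pred:.2f} vs observed {observed:.2f}")
```

A well-calibrated model gives groups whose observed fraction is close to the mean prediction, which corresponds to points lying near the diagonal of a calibration plot.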
A total of 1844 patients were included in this study. At median follow-ups of 65.9, 68.6, and 66.2 months, the 5-year OS rates were 93.0% (95% confidence interval [CI]: 91.0%–94.5%), 86.7% (95% CI: 82.3%–90.0%), and 91.0% (95% CI: 88.4%–93.1%) in the SYSMH (n = 868), FPHF (n = 316), and SYSUCC (n = 660) cohorts, respectively (Fig. 1).
Clinicopathological characteristics and treatments are summarized in Table 1. The median age of the study population was 49 years (range, 20–91 years). In the training cohort (SYSMH), 52.3% of the patients underwent breast-conserving surgery (BCS), whereas <10% of the patients underwent BCS in the validation cohorts (FPHF and SYSUCC). In the FPHF cohort, approximately 50% of the patients received neo-adjuvant chemotherapy (NAC), compared with only approximately 10% of the patients in the SYSMH and SYSUCC cohorts. In addition, T stage, N stage, ER status, and histology differed significantly among the cohorts.
Univariate and multivariate analyses (Table 2) of the training cohort suggested that age (hazard ratio [HR] = 1.03, 95% CI: 1.01–1.06, P = .002), T4 stage (vs T1, HR = 3.6, 95% CI: 1.45–8.96, P = .006), positive lymph node status (N1 vs N0, HR = 2.56, 95% CI: 1.31–4.99, P = .006; N2 vs N0, HR = 5.23, 95% CI: 2.26–10.85, P < .001; N3 vs N0, HR = 10.55, 95% CI: 5.17–21.48, P < .001), negative ER status (vs positive ER status, HR = 1.99, 95% CI: 1.17–3.40, P = .011), and HER2 positivity (vs HER2 negativity, HR = 1.77, 95% CI: 1.08–2.89, P = .024) were significantly associated with poor 5-year OS.
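The hazard ratios, confidence intervals, and P-values reported above are linked through the Cox model coefficient. The sketch below is a rough consistency check, assuming the intervals are Wald intervals on the log-hazard scale (an assumption, since the text does not state how they were computed). The function name `wald_from_hr` is hypothetical, and because the inputs are rounded published values, the recovered P-value is only approximate.

```python
import math


def wald_from_hr(hr: float, ci_low: float, ci_high: float):
    """Recover the log-hazard coefficient, its standard error, the z statistic,
    and the two-sided P-value from a hazard ratio and its 95% CI."""
    z95 = 1.959964  # standard normal quantile for a two-sided 95% interval
    beta = math.log(hr)  # HR = exp(beta)
    # The CI spans beta +/- z95 * SE on the log scale, so its width gives the SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z95)
    z = beta / se
    # Two-sided P-value under the normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Negative ER status from the text: HR = 1.99, 95% CI 1.17-3.40, reported P = .011.
beta, se, z, p = wald_from_hr(1.99, 1.17, 3.40)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.3f}")
```

For the ER example the recovered P-value is about .011, in line with the reported value; small discrepancies for other predictors would be expected from rounding alone.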
A nomogram (the SYS model) was developed using the significant predictors (Fig. 2). The area under the receiver operating characteristic curve (AUC) was 0.815 in the training cohort (SYSMH), and the AUCs were 0.74 and 0.77 in the FPHF and SYSUCC validation cohorts, respectively (Table 3). In contrast, the AUCs of CancerMath were 0.725 and 0.693 in the FPHF and SYSUCC cohorts, respectively. In addition, the calibration plots showed good agreement between the predicted and actual OS rates in the validation cohorts (FPHF and SYSUCC) for both the SYS model and the CancerMath model (Fig. 3). In the FPHF cohort, the performance of our SYS model was slightly better than that of the CancerMath model. Furthermore, 156 patients showed a ≥10% difference in predicted survival between the SYS model and the CancerMath model. We generated calibration plots of the 2 models for these patients and found that the SYS model accurately predicted the actual probability, whereas the CancerMath model significantly overestimated it (Fig. 4). Compared with patients whose predicted survival was similar under the 2 models (<10% difference), these patients in the FPHF and SYSUCC cohorts were more likely to have significantly higher T and N stages (Table 4).
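The subgroup with a ≥10% difference in predicted survival can be identified with a simple filter. The sketch below assumes that "difference" means the absolute difference in predicted 5-year survival probability, in percentage points; the text does not spell this out. The function name and the four example patients are hypothetical.

```python
from typing import List, Sequence


def discordant_patients(sys_pred: Sequence[float],
                        cancermath_pred: Sequence[float],
                        threshold: float = 0.10) -> List[int]:
    """Return the indices of patients whose two predicted survival
    probabilities differ by at least `threshold` (absolute difference)."""
    if len(sys_pred) != len(cancermath_pred):
        raise ValueError("both models must score the same patients")
    # Rounding guards against floating-point error at exactly the threshold.
    return [
        i for i, (a, b) in enumerate(zip(sys_pred, cancermath_pred))
        if round(abs(a - b), 6) >= threshold
    ]


# Hypothetical predicted 5-year survival for four patients under each model.
sys_pred = [0.92, 0.70, 0.85, 0.55]
cancermath_pred = [0.94, 0.85, 0.80, 0.75]
print(discordant_patients(sys_pred, cancermath_pred))
```

Calibration can then be assessed separately within the returned subgroup, which is the comparison described for Fig. 4.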
Breast cancer is the most common cancer among females in the Chinese population, with an age-standardized rate (ASR) of 27 cases per 100,000 (US, 67 per 100,000). GLOBOCAN also reported that breast cancer is the sixth leading cause of cancer death in China, with an ASR of 6.2 deaths per 100,000 (US, 14 per 100,000). An accurate prediction of OS may facilitate individualization of the clinical decision-making process. Several prediction models (Adjuvant! Online, CancerMath, and PREDICT) have been developed in Western countries to predict survival probability. However, whether these models are valid in the Chinese population remains unclear because the disease profiles, treatment patterns, medical insurance coverage, and prognosis of breast cancer patients in China differ significantly from those of patients in Western countries. For example, breast cancer patients in China are more often younger[7,8] and hormone receptor-negative, and have more advanced disease at diagnosis because of the lack of a national screening program. In addition, some new drugs, such as T-DM1, pertuzumab, and palbociclib, are not imported into China in a timely manner, limiting the treatment options for patients. Breast cancer survival in China has been reported to be 74%, which is significantly lower than that in the United States (84%). Therefore, we need a survival prediction model based on the Chinese population.
In this study, we developed the SYS model to predict the 5-year OS rate among breast cancer patients and validated it externally in 2 different cohorts. We surmised that the SYS model would benefit our daily clinical practice because patients often ask about their estimated survival when first diagnosed with breast cancer. The current TNM staging system cannot provide individualized estimations. The SYS model was designed as an easy-to-use nomogram for physicians and patients. In addition, with a more accurate prediction of OS, physicians may be able to provide more accurate recommendations to breast cancer patients. For example, the value of anthracycline in adjuvant therapy for early-stage breast cancer patients remains controversial. In comparisons of taxane-only and taxane-anthracycline regimens, the West German Study Group (WSG) PlanB trial and the ABC trials indicated that anthracycline added to taxane may provide a survival benefit for high-risk patients, although the results of these studies were inconsistent. Therefore, the successful prediction of survival may help determine the tumor burden and the risk of relapse, and guide physicians in selecting appropriate chemotherapy regimens. Furthermore, whether different surveillance plans should be implemented after surgery for patients with different risks remains unclear. Several studies have shown that an intensive surveillance plan was equivalent to a simplified plan, and the latter was adopted in the current NCCN guidelines. Using a survival prediction model, we can identify high-risk patients and recommend a more intense surveillance plan after surgery.
Several survival prediction models for non-metastatic breast cancer patients have been developed based on Western populations. Adjuvant! Online was developed based on data from the SEER registry in the United States and has been externally validated in different countries.[15–17] However, the website has been under maintenance for the past several years, and we consequently could not validate the model in our population. PREDICT is an online calculator that was developed based on the UK female breast cancer population and was shown to be equivalent to Adjuvant! Online. However, its primary endpoint is breast cancer-specific survival, which was not available in our study. Therefore, our inability to validate PREDICT and compare it with the SYS model represents a limitation of this study. CancerMath is an online calculator used to predict the probability of OS; it was developed mathematically and was tested in the SEER database and a single-center database. In this study, we validated CancerMath in our study population and found that its discriminative ability and accuracy were inferior to those of the SYS model. A total of 156 patients exhibited a ≥10% difference in predicted survival between the 2 models, and the SYS model was accurate in these patients, whereas CancerMath significantly overestimated survival. These patients were more likely to have more advanced T stages and N stages than those with a <10% difference in predicted survival. Therefore, we suggest that the SYS model is superior to CancerMath, especially in patients with a higher tumor burden. One likely reason is that CancerMath uses tumor size rather than T stage to predict survival. For example, with CancerMath, a patient with a 1-cm tumor and skin involvement (T4) is predicted to have superior survival compared with another patient with a 3-cm tumor without skin involvement (T2). We suggest that this is an inherent limitation of the CancerMath model that limits its use in the Chinese population.
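The contrast between tumor size and T stage in the example above can be shown with a simplified staging rule. The sketch below assumes the standard AJCC size cut-offs (T1 up to 2 cm, T2 over 2 and up to 5 cm, T3 over 5 cm) and assigns T4 for skin or chest wall involvement at any size; it deliberately omits substages and inflammatory carcinoma. The function name `t_stage` is hypothetical.

```python
def t_stage(size_cm: float, skin_or_chest_wall_involved: bool = False) -> str:
    """Simplified AJCC T stage from tumor size and local extension."""
    if size_cm <= 0:
        raise ValueError("tumor size must be positive")
    # Direct extension to skin or chest wall is T4 regardless of size.
    if skin_or_chest_wall_involved:
        return "T4"
    if size_cm <= 2.0:
        return "T1"
    if size_cm <= 5.0:
        return "T2"
    return "T3"


# The two patients from the example in the text.
small_with_skin = t_stage(1.0, skin_or_chest_wall_involved=True)
larger_without_skin = t_stage(3.0)
print(small_with_skin, larger_without_skin)
```

A model driven by size alone ranks the 1-cm tumor as lower risk, whereas a model driven by T stage ranks it as higher risk, which is the discrepancy described for patients with a higher tumor burden.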
Several limitations exist in our study. First, this was a retrospective study. Different institutions may have different protocols for pathological examinations and different preferences for local or systemic therapies and surveillance plans, which may lead to biased data. Second, the patients received systemic therapy primarily in the clinic; consequently, detailed systemic therapy data were unavailable. Different systemic therapies may significantly influence survival, and whether incorporating systemic therapy into the SYS model would improve its performance is unknown. Third, new drugs, such as ado-trastuzumab emtansine (T-DM1), pertuzumab, and palbociclib, which have been used in Western countries for several years, have not been available in China; whether the SYS model would remain valid among Chinese patients whose prognosis improves as such drugs become available is uncertain. Fourth, based on a meta-analysis by the Early Breast Cancer Trialists’ Collaborative Group (EBCTCG) among patients without chemotherapy or endocrine therapy, ER-positive breast cancer can relapse 5 years or more after surgery. Therefore, a longer follow-up is needed, and we may update the SYS model in several years when longer follow-up data are available. Finally, we used OS rather than cancer-specific survival, which was unavailable in the databases. This limitation can only be addressed by collecting data prospectively in the future.
We developed the SYS model, a nomogram based on a Chinese breast cancer population, to predict 5-year survival probabilities. The new nomogram performed better than CancerMath in this Chinese population.
ZHP, PXC, QL, FTL, LLZ, SRL, and MP contributed to the conception of the study, data acquisition and design of the study. ZHP and KC drafted the article. KC, QL, and FXS revised the paper for important intellectual content. KC, QL, FXS, GLY, MSZ, and EWS provided final approval of the version to be submitted.
This study was supported by the National Natural Science Foundation of China (grant nos. 81402201, 81672619), National Natural Science Foundation of Guangdong Province (grant no. 2014A030310070), and Grant 163 from the Key Laboratory of Malignant Tumor Molecular Mechanism and Translational Medicine of Guangzhou Bureau of Science and Information Technology.
Institutional review board statement
Because this is a retrospective study and we used data from a database, we do not need ethics approval or consent from the patients. This article does not contain any studies with human participants or animals performed by any of the authors.
Conflicts of interest
The authors declared that they have no conflict of interest.
1. Ravdin PM. A computer program to assist in making breast cancer adjuvant therapy decisions. Semin Oncol 1996; 23 (1 suppl 2):43–50.
2. Chen LL, Nolan ME, Silverstein MJ, et al. The impact of primary tumor size, lymph node status, and other prognostic factors on the risk of cancer death. Cancer
3. Wishart GC, Azzato EM, Greenberg DC, et al. PREDICT: a new UK prognostic model that predicts survival following surgery for invasive breast cancer. Breast Cancer Res
4. Fan L, Strasser-Weippl K, Li JJ, et al. Breast cancer in China. Lancet Oncol
5. Jacobs TW, Gown AM, Yaziji H, et al. Specificity of HercepTest in determining HER-2/neu status of breast cancers using the United States Food and Drug Administration-approved scoring system. J Clin Oncol
6. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer
7. Fan L, Zheng Y, Yu KD, et al. Breast cancer in a transitional society over 18 years: trends and present status in Shanghai, China. Breast Cancer Res Treat
8. Li J, Zhang BN, Fan JH, et al. A nation-wide multicenter 10-year (1999–2008) retrospective clinical epidemiological study of female breast cancer in China. BMC Cancer
9. Zheng S, Bai JQ, Li J, et al. The pathologic characteristics of breast cancer in China and its shift during 1999-2008: a national-wide multicenter cross-sectional image over 10 years. Int J Cancer
10. Chen W, Zheng R, Baade PD, et al. Cancer statistics in China, 2015. CA Cancer J Clin
11. Siegel RL, Miller KD, Jemal A. Cancer Statistics, 2017. CA Cancer J Clin
12. Nitz U, Gluz O, Christgen M, et al. Reducing chemotherapy use in clinically high-risk, genomically low-risk pN0 and pN1 early breast cancer patients: five-year data from the prospective, randomised phase 3 West German Study Group (WSG) PlanB trial. Breast Cancer Res Treat
13. Blum JL, Flynn PJ, Yothers G, et al. Anthracyclines in early breast cancer: the ABC trials-USOR 06-090, NSABP B-46-I/USOR 07132, and NSABP B-49 (NRG Oncology). J Clin Oncol
15. Paridaens RJ, Gelber S, Cole BF, et al. Adjuvant! Online estimation of chemotherapy effectiveness when added to ovarian function suppression plus tamoxifen for premenopausal women with estrogen-receptor-positive breast cancer. Breast Cancer Res Treat
16. Yao-Lung K, Dar-Ren C, Tsai-Wang C. Accuracy validation of adjuvant! online in Taiwanese breast cancer patients–a 10-year analysis. BMC Med Inform Decis Mak
17. Schmidt M, Victor A, Bratzel D, et al. Long-term outcome prediction by clinicopathological risk classification algorithms in node-negative breast cancer–comparison between Adjuvant!, St Gallen, and a novel risk algorithm used in the prospective randomized Node-Negative-Breast Cancer-3 (NNBC-3) trial. Ann Oncol
18. Engelhardt EG, van den Broek AJ, Linn SC, et al. Accuracy of the online prognostication tools PREDICT and Adjuvant! for early-stage breast cancer patients younger than 50 years. Eur J Cancer
19. Michaelson JS, Chen LL, Bush D, et al. Improved web-based calculators for predicting breast carcinoma outcomes. Breast Cancer Res Treat
20. Early Breast Cancer Trialists’ Collaborative Group (EBCTCG). Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet
Keywords: breast cancer; nomogram; overall survival; prediction
Copyright © 2018 The Chinese Medical Association. Published by Wolters Kluwer Health, Inc.