Predilection site and risk factor of second primary cancer: A pan-cancer analysis based on the SEER database : Chinese Medical Journal

Xiong, Shan1; Liang, Hengrui1; Liang, Peng1; Cai, Xiuyu2; Li, Caichen1; Zhong, Ran1; Li, Jianfu1; Cheng, Bo1; Zhu, Feng3; Ou, Limin4; Chen, Zisheng1,5; Zhao, Yi1; Deng, Hongsheng1; Chen, Zhuxing1; Liu, Zhichao6; Xie, Zhanhong1; Li, Feng1; He, Jianxing1; Liang, Wenhua1

Editor(s): Ni, Jing

Chinese Medical Journal, April 27, 2023. | DOI: 10.1097/CM9.0000000000002681

To the Editor: Over the past few years, the number of cancer survivors has continued to increase, primarily driven by the growing and aging population and by improvements in cancer detection and treatment. More than 16.9 million Americans with prior cancers were alive on January 1, 2019, and the number of cancer survivors was projected to reach more than 22.1 million by January 1, 2030.[1] With such a large number of cancer survivors, the lifetime risk of an individual second primary cancer (SPC) has been confirmed.[2] Definitions of specific sites are shown in the supplementary material.

Although previous studies suggested a higher risk of SPC among cancer survivors than in the general population, significant unmet needs remain in providing precise follow-up suggestions and strategies for individuals. The pathogenesis of SPC is complicated and not thoroughly explained. Previous studies have shown that some carcinogens that contribute to the occurrence of specific primary cancers can also increase the risk of SPC; for example, a relatively small proportion of second solid cancers are related to the late toxic effects of radiotherapy in adult cancer survivors.[3] No previous studies have explored genes or single-nucleotide polymorphisms (SNPs) associated with SPC.

However, real-world SPC data are scarce because SPC is a rare event and requires long-term follow-up. The Surveillance, Epidemiology, and End Results (SEER) database provides U.S. cancer statistics that enable long-term tracking of cancer survivors and relatively rare events since 1973. Thus, the primary aim of this study was to depict an incidence heat map of pan-SPC, including predilection sites, risk factors, and onset timings, based on the SEER database, in order to formulate a high-quality surveillance program for SPC. The secondary aim was to explore the genetic correlation between primary and secondary cancer.

We extracted data from the SEER 9 registries (1975–2017) and identified patients diagnosed with cancer between 1988 and 2015 who had complete clinical and demographic data, using SEER*Stat software version 8.3.6 (National Cancer Institute, Bethesda, MD, USA). The definition of SPC was based on established coding rules in SEER and previous literature.[2,4] Cancer sites were classified according to the Site Recode of the International Classification of Diseases for Oncology, 3rd edition (ICD-O-3), which identified 25 types of cancer.

Cumulative incidence ratio (CIR) and standardized incidence ratio (SIR) were calculated for each type of SPC to depict the incidence heat map. We applied a subdistribution hazards regression model to estimate the cumulative incidence function (CIF) and hazard ratios (HR), in order to reduce bias in the presence of competing risks.[5] Subgroup analysis stratified by primary tumor site was performed. Two-sample Mendelian randomization (MR) was used to estimate the association of different cancers with lung cancer risk, with summary data obtained from the MR-Base platform.[6] Inverse-variance weighted (IVW) analysis was carried out to assess the effect of genetically determined factors on lung cancer risk. MR-Egger and the weighted median estimator were used for sensitivity analysis. All statistical analyses were conducted using R version 3.6.2 software (Institute for Statistics and Mathematics, Vienna, Austria).
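
As an illustration of the IVW step, the following is a minimal Python sketch of a fixed-effect IVW estimate computed from per-SNP summary statistics. It is not the code used in this study (the analyses were run in R with MR-Base data); the function names `ivw_estimate` and `to_odds_ratio` are hypothetical, and the weights assume the common first-order approximation that uses only the standard error of the SNP-outcome association.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure-outcome effect (log-odds scale).

    Each SNP contributes a ratio estimate beta_outcome / beta_exposure, weighted
    by beta_exposure**2 / se_outcome**2 (first-order weights, an assumption).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one SNP is required")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("SNP-exposure effects must not all be zero")
    beta = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return beta, standard_error


def to_odds_ratio(beta, standard_error, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a Wald-type interval."""
    return (
        math.exp(beta),
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )
```

Calling `to_odds_ratio(*ivw_estimate(bx, by, se))` on harmonized summary statistics would give an OR with an approximate 95% CI. MR-Egger and the weighted median estimator, used here for sensitivity analysis, are not covered by this sketch.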

Overall, data on 5,552,170 patients with a primary diagnosis of cancer were analyzed, and 213,388 (3.8%) developed SPCs. The baseline characteristics of the study group are shown in Supplementary Table 1, and the inclusion and exclusion criteria are shown in Supplementary Figure 1. Figure 1A shows the incidence heat map of 25 types of SPC using the CIR and SIR, respectively [Supplementary Figures 2 and 3, Tables 2 and 3]. For 8 of the 25 primary cancer sites, second primary lung cancer (SPLC) was the most frequent SPC (CIR: 0.09–2.48%). Patients with cancers of the breast, oral and pharynx, lung, colorectal, brain, and other male genital organs had the highest frequency of SPC at the primary site itself (CIR: 0.20–1.85%). Second primary bladder cancer was the most frequent SPC among patients with primary cancers of the other urinary system (CIR: 9.56%).

Most SIRs of SPC were >1, indicating that the risk of cancer among cancer survivors was higher than that of the general population. Although second primary pancreas cancers (SPPC) accounted for only a minority of SPCs, SPPC had the highest SIR for 16 of the 25 primary cancer sites (SIR: 13.56–108.74). Among all types of SPC, the highest relative risk was observed for primary cancers of other urinary organs developing an SPC at the same site (SIR: 723.76), followed by bladder cancer developing an SPC in other urinary organs (SIR: 468.29). For cancers of the oral and pharynx, liver, pancreas, other male genital organs, other digestive organs, other urinary system organs, eyes, bone and soft tissue, and brain, the highest SIR was observed at the primary site itself. Moreover, subgroup heat maps by sex are available in Supplementary Figures 4 and 5.
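
To make the interpretation of the SIR concrete, the sketch below computes it as observed cases divided by expected cases, where the expected count applies general-population reference rates to the person-years at risk in each stratum. The function names are hypothetical, and the confidence interval uses Byar's approximation to the exact Poisson limits, which is an assumption for illustration; the letter does not state which interval method was used.

```python
import math


def expected_cases(person_years, reference_rates):
    """Expected count: stratum person-years multiplied by reference rates, summed."""
    if len(person_years) != len(reference_rates):
        raise ValueError("strata must align")
    return sum(py * rate for py, rate in zip(person_years, reference_rates))


def sir_with_byar_ci(observed, expected, z=1.96):
    """Standardized incidence ratio with Byar's approximate confidence limits."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if observed < 0:
        raise ValueError("observed count must be non-negative")
    sir = observed / expected
    if observed == 0:
        lower = 0.0
    else:
        lower = (
            observed
            * (1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3
            / expected
        )
    shifted = observed + 1  # Byar's upper limit uses observed + 1
    upper = (
        shifted
        * (1 - 1 / (9 * shifted) + z / (3 * math.sqrt(shifted))) ** 3
        / expected
    )
    return sir, lower, upper
```

An SIR whose lower limit exceeds 1 indicates more cases among survivors than the reference rates would predict for the same person-years.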

The CIFs of SPC for the top three and seven other common solid cancers are shown in Supplementary Figure 6. Moreover, we calculated the time-dependent number of SPCs since the diagnosis of primary cancer [Figure 1B]. The largest number of newly diagnosed SPCs was seen for prostate cancer, followed by breast and colon cancer. The majority of incident cases occurred within half a year after the diagnosis of primary cancer. However, the peak time for developing SPC after primary prostate cancer was one and a half years.
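
For readers unfamiliar with the CIF, the following Python sketch shows a nonparametric cumulative incidence estimate in the presence of competing risks (the Aalen-Johansen form). It is an illustration only: the study itself fitted a subdistribution hazards regression model in R, which this sketch does not reproduce. The event coding (0 = censored, 1 = SPC, 2 = competing event such as death) and the function name are assumptions.

```python
def cumulative_incidence(times, events, cause=1):
    """Nonparametric CIF for one cause when competing events are present.

    times  : follow-up time for each patient
    events : 0 = censored, otherwise an integer event code
    Returns a list of (time, cumulative incidence) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n = len(data)
    event_free = 1.0  # probability of no event of any kind before current time
    cif = 0.0
    curve = []
    i = 0
    while i < n:
        t = data[i][0]
        at_risk = n - i  # everyone with follow-up time >= t
        d_cause = 0
        d_any = 0
        j = i
        while j < n and data[j][0] == t:
            if data[j][1] != 0:
                d_any += 1
                if data[j][1] == cause:
                    d_cause += 1
            j += 1
        if d_any > 0:
            # Increment uses event-free probability just before time t.
            cif += event_free * d_cause / at_risk
            event_free *= 1 - d_any / at_risk
            curve.append((t, cif))
        i = j
    return curve
```

Unlike one minus the Kaplan-Meier estimate with competing events treated as censored, this estimate does not let patients who died remain hypothetically at risk of SPC, so it does not overstate the cumulative incidence.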

Figure 1:
(A) Heat map of cumulative incidence and standardized incidence of different sites of SPC by first primary cancer. The X-axis represents the first primary cancer site, and the Y-axis represents the SPC site. The size of the circle represents the LN-converted CIR, and the color depth represents the LN-converted SIR. From left to right and top to bottom, the cancer sites are: breast, esophagus, oral and pharynx, lung, liver, kidney, pancreas, small intestine, stomach, colon, bladder, prostate, female genital organ, other male genital organs, gallbladder and other biliary, other digestive organs, other respiratory organs, other urinary organs, lymphoma, leukemia, eye, brain, endocrine, bone and soft tissue, others. (B) The number of SPCs diagnosed within every half year after diagnosis of the 10 most incident solid cancers. (C) Risk factors for developing an SPC estimated by the competing risk model. Grade 1: well differentiated, Grade 2: moderately differentiated, Grade 3: poorly differentiated, Grade 4: undifferentiated, reference: unknown; age, reference: 0–39; race, reference: white; marital, reference: married; node, reference: positive node = 0; size, reference: micro; chemotherapy, reference: no; radiation, reference: no; sex, reference: male. (D) The genetic correlation risk of lung cancer with other cancers in Mendelian randomization and the risk of developing different SPCs in lung cancer survivors. CIR: Cumulative incidence ratio; SIR: Standardized incidence ratio; SPC: Second primary cancer.

Based on the competing risk model, radiotherapy (HR = 1.16, 95%CI 1.15–1.17, P <0.01) and surgery (HR = 1.33, 95%CI 1.31–1.34, P <0.01) were significantly associated with the SPC risks of all sites [Figure 1C]. Chemotherapy was not a significant factor for SPC risk in the model of all cancer types (HR = 1.00, 95%CI 0.99–1.01, P = 0.98), but it was a protective factor in five of eight primary cancer sites in separate analyses. In contrast, chemotherapy was a significant risk factor for SPC occurrence in bladder and prostate cancer survivors [Supplementary Tables 4 and 5]. Moreover, when stratified by age at primary cancer diagnosis, the risk of developing SPC increased with age (except in the age group >85 years).

No difference was observed in the risk of developing SPC between tumors with and without lymph node metastasis. When stratified by ethnicity, white Americans were at higher risk of developing SPC, whereas African Americans and Asian Americans were less likely to develop SPC (HR = 0.98, 95%CI 0.97–1.00, P = 0.05; HR = 0.88, 95%CI 0.86–0.90, P <0.01). Compared with males, females were less likely to develop SPC (HR = 0.76, 95%CI 0.75–0.77, P <0.01). We found no difference between unmarried and married patients overall. Interestingly, married breast cancer survivors were at greater risk for SPC compared with unmarried patients (HR = 1.06, 95%CI 1.04–1.08, P <0.01).

To evaluate the inherent correlation among different cancers, we used MR to measure the similarity in genetic susceptibility. We separately calculated the shared SNPs of lung cancer and seven other cancers based on 831,488 individuals [Supplementary Figures S7–14 and Table S6]. Except for stomach cancer, there were several shared SNPs between lung cancer and the other six cancers, including 13 with breast cancer, 9 with colon and rectum cancer, 3 with prostate cancer, 3 with bladder cancer, 1 with pancreas cancer, and 1 with oral and pharynx cancer. The rs401681 SNP in the CLPTM1L gene was associated with lung cancer, bladder cancer, prostate cancer, and pancreas cancer. Similarly, the rs907611 SNP in the LSP1 gene was associated with lung cancer, bladder cancer, and prostate cancer.

Moreover, we found that oral and pharynx cancer was significantly associated with an increased risk of second primary lung cancer (OR = 1.30, 95%CI 1.04–1.44). Although the differences were not statistically significant, the other cancer types, except pancreas cancer, were also associated with a higher risk of lung cancer [Supplementary Table 7]. Scatter plots were based on the causality between lung cancer and the other seven cancers in MR and the risk of developing SPLC in cancer survivors [Figure 1D]. The risk of developing SPLC was positively correlated with the SNP-predicted risk according to primary cancer (R² = 0.62, P <0.05).
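
The correlation reported above can be read as the squared Pearson correlation between two values per primary cancer: the SNP-predicted risk and the observed SPLC risk. The sketch below shows that calculation under this assumption; the letter does not specify the exact regression used, the function names are hypothetical, and no study data are reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 3:
        raise ValueError("at least three paired values are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x."""
    return pearson_r(x, y) ** 2
```

With only seven primary cancers contributing points, such an estimate rests on very few observations and should be interpreted cautiously.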

Overall, the higher risk of developing cancer among cancer survivors calls for a timely and appropriate screening strategy. Clinical monitoring and periodic follow-up in cancer survivors generally focus on the primary site, so SPCs at other sites are easily missed. The SPC incidence heat maps, risk factors, and onset times can play a vital role in guiding this strategy. For example, according to the heat maps, stomach cancer survivors who underwent surgery could be recommended a chest low-dose computed tomography scan to detect SPLC, in addition to periodic reexamination of the primary site. Survivors of cancers at different sites can thus have personalized follow-up strategies based on our research. In addition, we analyzed the common SNPs of lung cancer and seven other cancers based on MR, which indicated that the risk of developing a specific SPC was positively correlated with the SNP-predicted risk according to primary cancer. This suggests that inherent correlation, especially genetic susceptibility, underlies the site preference of SPC, which may explain the different predilection sites of SPC in cancer survivors.


This work was supported by grants from the China National Science Foundation (No. 81871893) and the Key Project of Guangzhou Scientific Research Project (No. 201804020030).

Conflicts of interest



1. Miller KD, Nogueira L, Mariotto AB, Rowland JH, Yabroff KR, Alfano CM, et al. Cancer treatment and survivorship statistics, 2019. CA Cancer J Clin 2019;69:363–385. doi: 10.3322/caac.21565.
2. Sung H, Hyun N, Leach CR, Yabroff KR, Jemal A. Association of first primary cancer with risk of subsequent primary cancer among survivors of adult-onset cancers in the United States. JAMA 2020;324:2521–2535. doi: 10.1001/jama.2020.23130.
3. Berrington de Gonzalez A, Curtis RE, Kry SF, Gilbert E, Lamart S, Berg CD, et al. Proportion of second cancers attributable to radiotherapy treatment in adults: A cohort study in the US SEER cancer registries. Lancet Oncol 2011;12:353–360. doi: 10.1016/s1470-2045(11)70061-4.
4. Han SS, Rivera GA, Tammemägi MC, Plevritis SK, Gomez SL, Cheng I, et al. Risk stratification for second primary lung cancer. J Clin Oncol 2017;35:2893–2899. doi: 10.1200/JCO.2017.72.4203.
5. Kim HT. Cumulative incidence in competing risks data and competing risks regression analysis. Clin Cancer Res 2007;13(2 Pt 1):559–565. doi: 10.1158/1078-0432.CCR-06-1210.
6. Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 2018;7:e34408. doi: 10.7554/eLife.34408.

Copyright © 2023 The Chinese Medical Association, produced by Wolters Kluwer, Inc. under the CC-BY-NC-ND license.