Stroke prevention with oral anticoagulants (OAC) is one of the most important therapeutic pillars in the management of atrial fibrillation (AF). Bleeding is a major concern in patients using OAC, with patients on warfarin having an annual risk of 2%–5% and 0.5%–1.0% for major and fatal bleeding, respectively. Although the bleeding risk is lower with non-vitamin K antagonist oral anticoagulants, it remains an important concern in high-risk patients. Hence, there is a critical need for better balancing benefit and harm by targeting stroke prevention efforts more precisely.
Several stroke risk stratification schemes in AF patients have been developed.[4–8] The CHA2DS2-VASc score is the most commonly recommended scheme for assessing thromboembolic risk in patients with AF.[9,10] It assigns 1 point each for a history of heart failure, hypertension, diabetes mellitus, or vascular disease, for age 65 to 74 years, and for female sex, and 2 points each for age 75 years or older and for a history of prior stroke/transient ischemic attack. All patients except those classified as low-risk (a CHA2DS2-VASc score of 0 in men and 1 in women) are indicated for OAC. In previous study cohorts, the low-risk group based on the CHA2DS2-VASc score accounted for only <10% of the AF population. That is, >90% of AF patients are indicated for anticoagulation therapy, which suggests that the scheme has limited clinical value for stratifying risk.
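The point assignment described above can be written out as a short function. This is a minimal sketch in Python; the `Patient` class and its field names are illustrative assumptions and do not come from the registry or from any guideline software.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Baseline risk factors (hypothetical field names, for illustration only)."""
    age: int
    female: bool
    heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    vascular_disease: bool = False
    prior_stroke_tia: bool = False


def cha2ds2_vasc(p: Patient) -> int:
    """CHA2DS2-VASc points as described in the text."""
    # One point each for the single-point comorbidities and for female sex.
    score = sum([p.heart_failure, p.hypertension, p.diabetes,
                 p.vascular_disease, p.female])
    # Age: 2 points if 75 years or older, 1 point if 65 to 74 years.
    if p.age >= 75:
        score += 2
    elif p.age >= 65:
        score += 1
    # Prior stroke or transient ischemic attack: 2 points.
    if p.prior_stroke_tia:
        score += 2
    return score


def is_low_risk(p: Patient) -> bool:
    """Low risk: score of 0 in men, or 1 in women (the sex point alone)."""
    return cha2ds2_vasc(p) == (1 if p.female else 0)
```

For example, a 70-year-old woman with hypertension scores 3 (age 1, sex 1, hypertension 1) and is therefore not in the low-risk group.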
New techniques in data analysis provide the opportunity for increased prediction precision. Based on the China Atrial Fibrillation (China-AF) Registry study, we aimed to identify a larger proportion of patients who can safely avoid unnecessary anticoagulant therapy by developing a risk model using state-of-the-art machine learning techniques.
This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of Beijing Anzhen Hospital (No. D11110700300000). Written informed consent was obtained from all study participants.
The China-AF Registry is a prospective, multicenter, and ongoing study of AF patients in Beijing, China. The rationale and design of the study were previously published. From August 2011 to December 2017, a total of 23,108 patients were recruited from outpatient clinics and cardiac wards of 31 tertiary and non-tertiary hospitals located in Beijing. In this analysis, we excluded patients who received OAC treatment (n = 15,667) or underwent catheter ablation at baseline (n = 330), and patients without follow-up information (n = 510). Finally, 6601 AF patients were included in this study [Supplementary Figure 1, http://links.lww.com/CM9/A550].
Data collection and follow-up
Data on the patients' demographic characteristics, lifestyle factors, medical history, and treatment were collected at baseline. Each patient was followed up every 6 months at the outpatient clinic or by telephone. Major adverse events, including death, non-fatal stroke, hospitalization, and major bleeding, were recorded at each follow-up. As female sex was not considered a risk factor for stroke by current AF guidelines,[9,10,13] a sexless CHA2DS2-VASc score, abbreviated as the CHA2DS2-VA score, was calculated by excluding female sex from the CHA2DS2-VASc score. Person-time was censored at the time of OAC initiation, catheter ablation, first ischemic stroke or systemic embolism, death, or 1 year after enrollment, whichever came first.
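The censoring rule can be illustrated with a small helper that returns the analysis time and event indicator for one patient. This is only a sketch: the function name, the day-based time unit, and the choice of 365 days for "1 year" are assumptions made here, not details reported by the study.

```python
from typing import Optional, Tuple

ONE_YEAR_DAYS = 365.0  # assumption: "1 year after enrollment" taken as 365 days


def censored_follow_up(te_day: Optional[float],
                       oac_start_day: Optional[float] = None,
                       ablation_day: Optional[float] = None,
                       death_day: Optional[float] = None) -> Tuple[float, bool]:
    """Return (follow-up time in days, whether a thromboembolic event counts).

    Each argument is days since enrollment, or None if it never happened.
    Follow-up ends at the earliest of the listed occurrences or at 1 year.
    """
    candidates = [ONE_YEAR_DAYS]
    for day in (te_day, oac_start_day, ablation_day, death_day):
        if day is not None:
            candidates.append(day)
    end = min(candidates)
    # The event is counted only if it occurs no later than the end of follow-up.
    event = te_day is not None and te_day <= end
    return end, event
```

A patient who starts OAC on day 150 and has a stroke on day 200 contributes 150 days of follow-up and no event.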
The primary outcome was the time to the first occurrence of a thromboembolic event (TE), including ischemic stroke and systemic embolism, whichever came first. Transient ischemic attack was not included in the outcome events because it is notoriously difficult to diagnose. Patient-reported TEs were adjudicated independently by two neurologists. Disagreements on the diagnosis were resolved by discussion or by a third neurologist.
Continuous variables were expressed as mean ± standard deviation, and categorical variables were expressed as counts and percentages. The methodology of model derivation and validation in our analysis is shown in Supplementary Figure 2, http://links.lww.com/CM9/A550.
A total of 44 variables [Supplementary Table 1, http://links.lww.com/CM9/A550] were included as candidate predictors. Extreme gradient boosting (XGBoost), a state-of-the-art machine learning technique that combines weak prediction models (typically decision trees) into a stronger classifier, was used to select important features, and the result was validated by ten-fold cross-validation to reduce overfitting. The XGBoost algorithm can handle missing data automatically and estimate the relative contribution of each variable, thereby allowing feature importance ranking and feature selection. We then constructed a Cox proportional hazards model based on the variables selected by the XGBoost model. The risk score was derived from the coefficients of the three selected variables in the Cox regression model.
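For reference, a Cox proportional hazards model with three binary predictors has the standard form below. The symbols (h_0, β_j, x_j, and the scaling constant k) are notation introduced here for illustration, and the proportional point assignment in the last expression is one common convention for turning coefficients into an integer score, not a rule stated by the study.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3\right),
\qquad
\mathrm{HR}_j = e^{\beta_j},
\qquad
\mathrm{points}_j \approx \mathrm{round}\!\left(\beta_j / k\right)
```

Here h_0(t) is the unspecified baseline hazard, x_j is 1 when risk factor j is present and 0 otherwise, and HR_j is the hazard ratio reported for that factor.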
The novel risk score was internally validated using bootstrapping with 1000 replicates. We assessed the model's discrimination ability using the C-statistic (area under the receiver operating characteristic curve) and compared the C-statistic of our model with that of the CHA2DS2-VA score. We also calculated the net reclassification improvement (NRI) of our risk prediction model as compared with the CHA2DS2-VA score.
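As an illustration of the discrimination measure, the area under the receiver operating characteristic curve equals the probability that a randomly chosen patient with an event has a higher score than a randomly chosen patient without one. The sketch below computes it by direct pairwise comparison in Python; the function name is an assumption, and this is not the R code used in the analysis.

```python
from typing import Sequence


def c_statistic(scores: Sequence[float], events: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison; tied scores count 0.5."""
    event_scores = [s for s, e in zip(scores, events) if e == 1]
    nonevent_scores = [s for s, e in zip(scores, events) if e == 0]
    if not event_scores or not nonevent_scores:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for s_event in event_scores:
        for s_non in nonevent_scores:
            if s_event > s_non:
                concordant += 1.0   # event patient ranked higher: concordant pair
            elif s_event == s_non:
                concordant += 0.5   # tie: half credit
    return concordant / (len(event_scores) * len(nonevent_scores))
```

With an integer score that takes only a few values, ties are frequent, so the half-credit rule for ties matters. The pairwise loop is quadratic in cohort size and is meant for clarity rather than speed.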
This report followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement. All statistical analyses were performed using R version 3.5.0 (R Foundation for Statistical Computing, Vienna, Austria). All P values were two-tailed, and a P value < 0.05 was considered statistically significant.
We included 6601 AF patients who were not on OAC at baseline in this analysis. The baseline characteristics of the patients are shown in Table 1. During the 1-year follow-up, 163 TEs (147 ischemic strokes and 16 systemic embolisms) occurred.
Table 1 - Demographics and baseline characteristics in the China-AF cohorts (n = 6601).
||67.12 ± 12.92
||25.11 ± 3.64
||129.15 ± 18.31
|Heart rate, beats/min
||81.42 ± 21.68
|eGFR <60 mL·min−1·1.73 m−2
||6.07 ± 1.92
||2.54 ± 0.85
||4.32 ± 1.04
||1.44 ± 1.04
||135.45 ± 19.66
|Anteroposterior left atrial diameter, mm
||40.60 ± 7.40
|Left ventricular posterior wall, mm
||9.42 ± 1.46
|AF type, persistent or permanent
|Education, completed high school
Data are shown as mean ± standard deviation or n/N (%). AF: Atrial fibrillation; BMI: Body mass index; China-AF: China atrial fibrillation; eGFR: Estimated glomerular filtration rate; FBG: Fasting blood glucose; LDL-C: Low-density lipoprotein cholesterol; LVEF: Left ventricular ejection fraction; SBP: Systolic blood pressure; TC: Total cholesterol; TG: Total triglyceride.
Derivation of the CAS risk score
The ten most important variables measured by the XGBoost importance score are shown in Figure 1. The top three most important variables were prior stroke, age, and history of heart failure or left ventricular ejection fraction (LVEF) <55%. These three variables accounted for 73.1% of the prognostic information provided by all clinical variables. The other variables did not add significant incremental information to the prediction model [Supplementary Figure 3, http://links.lww.com/CM9/A550]. Of note, hypertension, diabetes mellitus, and vascular disease were not among the top ten most important variables, nor were their combinations.
We developed a novel stroke risk model with the three selected variables (congestive heart failure or LVEF <55%, age, and prior stroke, abbreviated as the CAS model). According to the coefficients in the Cox regression model, we assigned 1 point each for congestive heart failure or LVEF <55% and for age >65 years, and 2 points for a prior stroke [Table 2]. The points corresponding to these three variables were added together to obtain the CAS score, which predicts the patient's 1-year stroke risk.
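The CAS point assignment can be sketched as follows. The function and parameter names are illustrative assumptions; the point values, the strict thresholds (LVEF <55%, age >65 years), and the low-risk and high-risk definitions (score 0 vs. ≥1) are those given in this report. Treating a missing LVEF as "not known to be reduced" is an assumption of this sketch.

```python
from typing import Optional


def cas_score(age: float,
              heart_failure: bool,
              lvef_percent: Optional[float],
              prior_stroke: bool) -> int:
    """CAS score: C (heart failure or LVEF <55%), A (age >65), S (prior stroke)."""
    score = 0
    # C: congestive heart failure or LVEF <55% gives 1 point.
    # Assumption: an unmeasured LVEF (None) does not contribute a point.
    if heart_failure or (lvef_percent is not None and lvef_percent < 55):
        score += 1
    # A: age strictly above 65 years gives 1 point.
    if age > 65:
        score += 1
    # S: prior stroke gives 2 points.
    if prior_stroke:
        score += 2
    return score


def cas_risk_group(score: int) -> str:
    """Score 0 is the low-risk group; score of 1 or more is the high-risk group."""
    return "low" if score == 0 else "high"
```

A 70-year-old patient with an LVEF of 50% and a prior stroke reaches the maximum score of 4, whereas a 60-year-old with none of the three factors scores 0 and falls in the low-risk group.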
Table 2 - Regression coefficients and derived score for the CAS stroke risk score.
||HR (95% CI)
|Congestive heart failure or left ventricular dysfunction (LVEF <55%)
|Age >65 years
CI: Confidence interval; HR: Hazard ratio; LVEF: Left ventricular ejection fraction.
Validation of the CAS risk score
A CAS risk score of 0 classified 30.8% (2033/6601) of the patients as low-risk, whereas a CHA2DS2-VA risk score of 0 classified 15.2% (1002/6601) of the patients as low-risk. The 1-year risk for TEs, with the 95% confidence interval (CI) estimated from 1000 bootstrap replicates, was 0.81% (95% CI: 0.41%–1.19%) in patients with a CAS risk score of 0, as compared with 1.01% (95% CI: 0.36%–1.64%) in patients with a CHA2DS2-VA score of 0 [Table 3]. The Kaplan-Meier curves for survival free from TEs by CAS and CHA2DS2-VA risk groups during follow-up are shown in Figure 2.
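A bootstrap interval for an event rate can be obtained by resampling patients with replacement and reading off percentiles of the resampled rates. The sketch below uses the percentile method on a simple event proportion; the report does not state which bootstrap interval was used or how censoring entered the rate, so both choices are assumptions, and the input in the usage note is made-up toy data.

```python
import random
from typing import List, Tuple


def bootstrap_rate_ci(events: List[int], n_boot: int = 1000,
                      alpha: float = 0.05,
                      seed: int = 0) -> Tuple[float, float, float]:
    """Return (observed rate, lower bound, upper bound) by percentile bootstrap.

    `events` holds one 0/1 indicator per patient in the risk group.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(events)
    rates = []
    for _ in range(n_boot):
        resample = rng.choices(events, k=n)  # draw n patients with replacement
        rates.append(sum(resample) / n)
    rates.sort()
    # Percentile indices, clamped to the valid range of the sorted list.
    lo_idx = max(0, int(round((alpha / 2) * n_boot)))
    hi_idx = min(n_boot - 1, int(round((1 - alpha / 2) * n_boot)) - 1)
    return sum(events) / n, rates[lo_idx], rates[hi_idx]
```

For a toy group of 1000 patients with 10 events, `bootstrap_rate_ci([1] * 10 + [0] * 990)` returns the observed rate of 0.01 together with an interval that brackets it.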
Table 3 - Distribution of patients and event rates with 95% CI using bootstrap (n = 1000) for the CAS and CHA2DS2-VA scores in the China-AF cohort.
||Proportion in the study population(n = 6601)
||Event rates (95% CI)
AF: Atrial fibrillation; CAS: Congestive heart failure or left ventricular dysfunction, age, and prior stroke; China-AF: China atrial fibrillation; CI: Confidence interval.
Comparison of the CAS and CHA2DS2-VA risk scores
The C-statistic of the CAS model was 0.69 (95% CI: 0.65–0.73), significantly higher than that of the CHA2DS2-VA score (0.66, 95% CI: 0.62–0.70, Z = 2.01, P = 0.045) [Figure 3]. We defined a CAS score of 0 as the low-risk group and a CAS score ≥1 as the high-risk group. The NRI was 12.2% (95% CI: 8.7%–15.7%) when a CHA2DS2-VA score ≥1 was categorized as the high-risk group.
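For two risk categories, the NRI is conventionally the net proportion of patients with events who move up to the high-risk group plus the net proportion of patients without events who move down to the low-risk group. The sketch below implements that standard categorical definition; the report does not spell out its formula, so this is an assumption, and the function and argument names are hypothetical.

```python
from typing import Sequence


def two_category_nri(old_high: Sequence[bool],
                     new_high: Sequence[bool],
                     events: Sequence[int]) -> float:
    """Categorical NRI of a new high/low classification versus an old one."""
    up_e = down_e = n_e = 0   # movements among patients with an event
    up_n = down_n = n_n = 0   # movements among patients without an event
    for old, new, ev in zip(old_high, new_high, events):
        moved_up = bool(new) and not bool(old)
        moved_down = bool(old) and not bool(new)
        if ev:
            n_e += 1
            up_e += moved_up
            down_e += moved_down
        else:
            n_n += 1
            up_n += moved_up
            down_n += moved_down
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    # Upward moves are correct for events; downward moves are correct for non-events.
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n
```

Here `old_high` would be the indicator CHA2DS2-VA score ≥1 and `new_high` the indicator CAS score ≥1 for each patient.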
When a given proportion of patients was classified as high-risk, the CAS score consistently identified a higher proportion of the patients who went on to experience TEs than the CHA2DS2-VA risk score in our cohort [Supplementary Table 2, http://links.lww.com/CM9/A550]. Conversely, to capture a given proportion of the patients who would eventually experience TEs, so that they could be treated with OAC therapy, the CAS score consistently needed to classify a lower proportion of patients as high-risk than the CHA2DS2-VA risk score [Supplementary Table 3, http://links.lww.com/CM9/A550].
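The trade-off described above, between the share of the cohort flagged as high-risk and the share of eventual TEs that fall inside the flagged group, can be computed for any score and cut-off. The following is a minimal sketch with hypothetical names; the numbers in the usage note are toy values, not study data.

```python
from typing import Sequence, Tuple


def flagged_and_captured(scores: Sequence[float],
                         events: Sequence[int],
                         threshold: float) -> Tuple[float, float]:
    """Return (fraction flagged high-risk, fraction of events captured).

    A patient is flagged when the score is at or above `threshold`.
    """
    if len(scores) != len(events) or len(scores) == 0:
        raise ValueError("scores and events must be non-empty and equal in length")
    n_events = sum(events)
    if n_events == 0:
        raise ValueError("need at least one event")
    flagged = [s >= threshold for s in scores]
    captured = sum(1 for f, e in zip(flagged, events) if f and e)
    return sum(flagged) / len(scores), captured / n_events
```

In a toy cohort with scores `[0, 0, 1, 1, 2, 3, 4, 0, 1, 2]` and events `[0, 0, 0, 1, 0, 1, 1, 0, 0, 0]`, a threshold of 1 flags 70% of patients and captures all three events. Comparing two scores at the same flagged fraction, or at the same captured fraction, gives the kind of comparison reported in the supplementary tables.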
Based on a large prospective cohort of anticoagulation-naive Chinese AF patients, we have developed and internally validated a simplified CAS risk model for predicting TEs in AF patients. The CAS stroke score can be easily implemented in clinical practice, as it encompasses only three variables (congestive heart failure or LVEF <55%, age >65 years, and prior stroke). It showed better discrimination in predicting 1-year TE risk than the CHA2DS2-VA score, the sexless form of the guideline-recommended CHA2DS2-VASc score.
Prior stroke, older age, and heart failure were the dominant predictors in our CAS risk model. This is in line with other stroke prediction schemes.[4–7] Previous studies showed that heart failure or left ventricular dysfunction was a powerful driver of stroke risk even in young AF patients.[17–19] Heart failure is associated with a hypercoagulable state, which facilitates thrombus formation and cerebral embolism.[20,21] Female sex was not an independent risk factor for thromboembolism in our previous report. Hypertension, diabetes mellitus, and vascular disease, or their combinations, did not add significant incremental information to the risk score. In the CHA2DS2-VASc score, each of these factors is assigned 1 point despite its limited contribution to the risk of stroke. Several prior studies reported that vascular disease was not associated with increased stroke risk.[5,6,22] Other studies reported that blood pressure and glycemic control appeared to be more important than a history of hypertension or diabetes in predicting thromboembolism risk in patients with AF.[23,24] These findings were also supported by studies reporting that well-controlled risk factors were associated with improved clinical outcomes in AF patients. Other clinical risk factors, such as obstructive sleep apnea, may also affect the stroke risk in patients with AF. However, we did not collect these data at baseline.
The main advantage of the CAS risk score is its ability to identify as many as 30% of patients as being at truly low risk of stroke. By anticoagulating only 70% of patients in the AF population, we can capture 90% of those who will experience thromboembolism if left untreated. The CAS scheme yielded a C-index of 0.69, outperforming the CHA2DS2-VA score, the sexless form of the guideline-recommended CHA2DS2-VASc score, in discrimination and stratification. Another advantage of the CAS risk scheme is that it clearly separates AF patients into low-risk and high-risk groups, which facilitates clinical decision making. With a CAS score of 0, the risk of thromboembolism is even lower than that of patients with a CHA2DS2-VA score of 0 (0.81% [0.41%–1.19%] vs. 1.01% [0.36%–1.64%]). This means a more precise targeting of high-risk patients. The CAS model was derived to predict 1-year stroke risk. Dynamic (annual) re-evaluation of AF patients to identify incident stroke risk factors is recommended by current guidelines, as changes in risk factors may have a great impact on the risk of stroke.[27,28]
This study has several limitations. First, the CAS risk prediction scheme was derived and internally validated in a Chinese AF patient cohort. Therefore, external validation in other datasets is warranted before our findings can be generalized. Nonetheless, the relative simplicity of our model may reduce the risk that over-fitting degrades its performance on external validation. Second, the calibration of our model was not assessed with a split-sample approach because of the limited number of events. However, we used 1000 bootstrap replicates to estimate the 95% CI of the event rate. Finally, we did not incorporate biomarkers, left atrial morphology and function, AF burden, or other clinical factors in our risk prediction model. These factors may be useful for incremental risk prediction, as suggested by other studies.[29–31]
The CAS model outperformed the current widely-used CHA2DS2-VA score, especially in identifying a large proportion of patients with low risk for thromboembolism. The model can be easily applied as a risk stratification tool to inform clinical decision-making on anticoagulant use.
This work was supported by the National Key Research and Development Program of China (Nos. 2017YFC0908803, 2018YFC1312501, and 2020YFC2004803) and a grant from the Beijing Municipal Commission of Science and Technology (No. D171100006817001). The construction of the China Atrial Fibrillation Registry was also supported by grants from Bristol-Myers Squibb, Pfizer, Johnson & Johnson, Boehringer-Ingelheim, and Bayer.
Conflicts of interest
Dr. Jian-Zeng Dong received honoraria from Johnson & Johnson for giving lectures. Dr. Chang-Sheng Ma received honoraria from Bristol-Myers Squibb, Pfizer, Johnson & Johnson, Boehringer-Ingelheim, and Bayer for giving lectures. The other authors report no conflicts of interest.
1. Marini C, De Santis F, Sacco S, Russo T, Olivieri L, Totaro R, et al. Contribution of atrial fibrillation to incidence and outcome of ischemic stroke: results from a population-based study. Stroke 2005; 36:1115–1119. doi: 10.1161/01.STR.0000166053.83476.4a.
2. Lip GYH, Andreotti F, Fauchier L, Huber K, Hylek E, Knight E, et al. Bleeding risk assessment and management in atrial fibrillation patients: a position document from the European Heart Rhythm Association, endorsed by the European Society of Cardiology Working Group on Thrombosis. Europace 2011; 13:723–746. doi: 10.1093/europace/eur126.
3. Chai-Adisaksopha C, Hillis C, Isayama T, Lim W, Iorio A, Crowther M. Mortality outcomes in patients receiving direct oral anticoagulants: a systematic review and meta-analysis of randomized controlled trials. J Thromb Haemost 2015; 13:2012–2020. doi: 10.1111/jth.13139.
4. Gage BF, Waterman AD, Shannon W, Boechler M, Rich MW, Radford MJ. Validation of clinical classification schemes for predicting stroke: results from the National Registry of Atrial Fibrillation. JAMA 2001; 285:2864–2870. doi: 10.1001/jama.285.22.2864.
5. Lip GYH, Nieuwlaat R, Pisters R, Lane DA, Crijns HJGM. Refining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the euro heart survey on atrial fibrillation. Chest 2010; 137:263–272. doi: 10.1378/chest.09-1584.
6. Singer DE, Chang Y, Borowsky LH, Fang MC, Pomernacki NK, Udaltsova N, et al. A new risk scheme to predict ischemic stroke and other thromboembolism in atrial fibrillation: the ATRIA study stroke risk score. J Am Heart Assoc 2013; 2:e000250. doi: 10.1161/JAHA.113.000250.
7. Hijazi Z, Lindback J, Alexander JH, Hanna M, Held C, Hylek EM, et al. The ABC (age, biomarkers, clinical history) stroke risk score: a biomarker-based risk score for predicting stroke in atrial fibrillation. Eur Heart J 2016; 37:1582–1590. doi: 10.1093/eurheartj/ehw054.
8. Fox KAA, Lucas JE, Pieper KS, Bassand JP, Camm AJ, Fitzmaurice DA, et al. Improved risk stratification of patients with atrial fibrillation: an integrated GARFIELD-AF tool for the prediction of mortality, stroke and bleed in patients with and without anticoagulation. BMJ Open 2017; 7:e017157. doi: 10.1136/bmjopen-2017-017157.
9. Kirchhof P, Benussi S, Kotecha D, Ahlsson A, Atar D, Casadei B, et al. 2016 ESC guidelines for the management of atrial fibrillation developed in collaboration with EACTS. Eur Heart J 2016; 37:2893–2962. doi: 10.1093/eurheartj/ehw210.
10. January CT, Wann LS, Calkins H, Chen LY, Cigarroa JE, Cleveland JC Jr, et al. 2019 AHA/ACC/HRS focused update of the 2014 AHA/ACC/HRS guideline for the management of patients with atrial fibrillation: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Rhythm Society in Collaboration with the Society of Thoracic Surgeons. Circulation 2019; 140:e125–e151. doi: 10.1161/CIR.0000000000000665.
11. Steinberg BA, Gao H, Shrader P, Pieper K, Thomas L, Camm AJ, et al. International trends in clinical characteristics and oral anticoagulation treatment for patients with atrial fibrillation: results from the GARFIELD-AF, ORBIT-AF I, and ORBIT-AF II registries. Am Heart J 2017; 194:132–140. doi: 10.1016/j.ahj.2017.08.011.
12. Du X, Ma C, Wu J, Li S, Ning M, Tang R, et al. Rationale and design of the Chinese Atrial Fibrillation Registry Study. BMC Cardiovasc Disord 2016; 16:130. doi: 10.1186/s12872-016-0308-1.
13. Lan DH, Jiang C, Du X, He L, Guo XY, Zuo S, et al. Female sex as a risk factor for ischemic stroke and systemic embolism in Chinese patients with atrial fibrillation: a report from the China-AF Study. J Am Heart Assoc 2018; 7:e009391. doi: 10.1161/JAHA.118.009391.
14. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016; San Francisco, California, USA: ACM, 785–794.
15. Li X, Sun Z, Du X, Liu H, Hu G, Xie G. Bootstrap-based feature selection to balance model discrimination and predictor significance: a Study of Stroke Prediction in Atrial Fibrillation. AMIA Annu Symp Proc
16. Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med 2015; 162:W1–W73. doi: 10.7326/M14-0698.
17. Chao TF, Lip GYH, Lin YJ, Chang SL, Lo LW, Hu YF, et al. Age threshold for the use of non-vitamin K antagonist oral anticoagulants for stroke prevention in patients with atrial fibrillation: insights into the optimal assessment of age and incident comorbidities. Eur Heart J 2019; 40:1504–1514. doi: 10.1093/eurheartj/ehy837.
18. Fauchier L, Lecoq C, Clementy N, Bernard A, Angoulvant D, Ivanes F, et al. Oral anticoagulation and the risk of stroke or death in patients with atrial fibrillation and one additional stroke risk factor: the Loire Valley Atrial Fibrillation Project. Chest 2016; 149:960–968. doi: 10.1378/chest.15-1622.
19. Agarwal M, Apostolakis S, Lane DA, Lip GYH. The impact of heart failure and left ventricular dysfunction in predicting stroke, thromboembolism, and mortality in atrial fibrillation patients: a systematic review. Clin Ther 2014; 36:1135–1144. doi: 10.1016/j.clinthera.2014.07.015.
20. Lip GY, Gibbs CR. Does heart failure confer a hypercoagulable state? Virchow's triad revisited. J Am Coll Cardiol 1999; 33:1424–1426. doi: 10.1016/s0735-1097(99)00033-9.
21. Abdul-Rahim AH, Perez AC, Fulton RL, Jhund PS, Latini R, Tognoni G, et al. Risk of stroke in chronic heart failure patients without atrial fibrillation: analysis of the Controlled Rosuvastatin in Multinational Trial Heart Failure (CORONA) and the Gruppo Italiano per lo Studio della Sopravvivenza nell’Insufficienza Cardiaca-Heart Failure (GISSI-HF) Trials. Circulation 2015; 131:1486–1494. doi: 10.1161/CIRCULATIONAHA.114.013760.
22. Hart RG, Pearce LA. Current status of stroke risk stratification in patients with atrial fibrillation. Stroke 2009; 40:2607–2610. doi: 10.1161/STROKEAHA.109.549428.
23. Fangel MV, Nielsen PB, Kristensen JK, Larsen TB, Overvad TF, Lip GYH, et al. Glycemic status and thromboembolic risk in patients with atrial fibrillation and type 2 diabetes mellitus. Circ Arrhythm Electrophysiol 2019; 12:e007030. doi: 10.1161/CIRCEP.118.007030.
24. Kodani E, Atarashi H, Inoue H, Okumura K, Yamashita T, Otsuka T, et al. Impact of blood pressure control on thromboembolism and major hemorrhage in patients with nonvalvular atrial fibrillation: a subanalysis of the J-RHYTHM registry. J Am Heart Assoc 2016; 5:e004075. doi: 10.1161/JAHA.116.004075.
25. Jiang C, Lan DH, Du X, Geng YP, Chang SS, Zheng D, et al. Prevalence of modifiable risk factors and relation to stroke and death in patients with atrial fibrillation: a report from the China atrial fibrillation registry study. J Cardiovasc Electrophysiol 2019; 30:2759–2766. doi: 10.1111/jce.14231.
26. Yaranov DM, Smyrlis A, Usatii N, Butler A, Petrini JR, Mendez J, et al. Effect of obstructive sleep apnea on frequency of stroke in patients with atrial fibrillation. Am J Cardiol 2015; 115:461–465. doi: 10.1016/j.amjcard.2014.11.027.
27. Chao TF, Lip GYH, Liu CJ, Lin YJ, Chang SL, Lo LW, et al. Relationship of aging and incident comorbidities to stroke risk in patients with atrial fibrillation. J Am Coll Cardiol 2018; 71:122–132. doi: 10.1016/j.jacc.2017.10.085.
28. Yoon M, Yang PS, Jang E, Yu HT, Kim TH, Uhm JS, et al. Dynamic changes of CHA2DS2-VASc score and the risk of ischaemic stroke in Asian patients with atrial fibrillation: a Nationwide Cohort Study. Thromb Haemost 2018; 118:1296–1304. doi: 10.1055/s-0038-1651482.
29. Killu AM, Granger CB, Gersh BJ. Risk stratification for stroke in atrial fibrillation: a critique. Eur Heart J 2019; 40:1294–1302. doi: 10.1093/eurheartj/ehy731.
30. Kaplan RM, Koehler J, Ziegler PD, Sarkar S, Zweibel S, Passman RS. Stroke risk as a function of atrial fibrillation duration and CHA2DS2-VASc score. Circulation 2019; 140:1639–1646. doi: 10.1161/CIRCULATIONAHA.119.041303.
31. Alkhouli M, Friedman PA. Ischemic stroke risk in patients with nonvalvular atrial fibrillation: JACC review topic of the week. J Am Coll Cardiol 2019; 74:3050–3065. doi: 10.1016/j.jacc.2019.10.040.