Aortic stenosis (AS) is the most common heart valvular abnormality in the United States, and with the aging of the United States population, the prevalence will undoubtedly increase. According to the 2006 American College of Cardiology/American Heart Association Practice Guidelines for the Management of Patients with Valvular Heart Disease, aortic valve replacement (AVR) is still the gold standard treatment for critical AS.1
Surgical AVR is a significant operation with considerable risk, as it usually requires a median sternotomy and cardiopulmonary bypass. Several reports have established that increasing age and other comorbid conditions such as heart failure, pulmonary hypertension, peripheral vascular disease, frailty, and renal insufficiency increase the risk of AVR. Traditionally, risk predictor models were used to estimate perioperative mortality in large cohorts undergoing cardiac surgery. Subsequently, risk models were either adapted or created for patients specifically undergoing AVR.2–7 Currently, multiple risk models are available to calculate the operative risk for AVR. The European system for cardiac operative risk evaluation (EuroSCORE) is widely used because it was the first model shown to have good discrimination and calibration.8 However, there is accumulating evidence that EuroSCORE overestimates operative mortality in the highest risk patients.2–5,9
With advancing technology, there are numerous modalities in use and under study for AVR, including stentless bioprostheses and transapically and transcutaneously implanted valves. Many of these newer modalities select patients who are deemed high risk for conventional AVR operations by virtue of the logistic EuroSCORE system.4,5,9 Given the advancing evidence of the inaccuracy of the EuroSCORE system, it would be inappropriate and poor care to deny a patient optimal treatment if the calculated or perceived risk of surgery is incorrect. Several reports have further stated that the Society of Thoracic Surgeons Predicted Risk of Mortality (STS-PROM) model is superior to the EuroSCORE in accurately estimating perioperative risk, especially among high-risk patients undergoing AVR.6,8,10
The purpose of this study was to compare the performance of four major risk models, including the EuroSCORE, in predicting perioperative and 1-year survival in patients undergoing isolated AVR, with the aim of identifying the optimal risk calculator.
This study was approved by the Institutional Review Board for Inova Heart and Vascular Institute. A total of 394 consecutive patients who underwent isolated AVR for critical AS from January 1, 2001 to July 1, 2007, at a tertiary care center were analyzed using the STS database. Subjects were derived from a computerized database combining elements of the STS National Adult Cardiac Surgery Database and internal data collected for outcomes management. Data were collected prospectively by independent monitors on the basis of prespecified standardized definitions regarding preoperative clinical characteristics, intraoperative variables, and perioperative complications (www.sts.org).
Vital status was determined from the National Death Index (NDI) and cross referenced with the Social Security Death Index. The NDI is a database of death record information compiled from the 50 states, the District of Columbia, Puerto Rico, and the Virgin Islands. The Social Security Death Index is slightly less sensitive but more specific than the NDI. Follow-up for mortality data was right censored on January 15, 2008.
Each patient had risk scores calculated for all four models. The EuroSCORE risk algorithm was calculated according to the published guidelines (www.EuroSCORE.org). The logistic EuroSCORE uses 17 variables divided into three categories (patient-related factors, cardiac-related factors, and operative factors), which are weighted to arrive at a predicted risk. The STS-PROM was calculated for each patient when entered into the STS database. The STS-PROM uses 24 variables for its risk model. The Providence Health System model uses 13 risk factors on the basis of 11 variables and was calculated for each individual patient.11 The Ambler risk score assigns integral points in 13 categories of perioperative variables; the point total is then converted to a predicted risk.12 Patients were stratified into tertiles according to operative surgical risk calculated by the four risk models. Patients were grouped as low risk (0%–4.9%), medium risk (5.0%–9.9%), and high risk (>10.0%).
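As a rough illustration of how a weighted, logistic-type model turns risk factors into a predicted mortality, the sketch below applies the standard logistic transform to a weighted sum of factors. The intercept, weights, and factor names are hypothetical placeholders chosen for illustration; they are not the published coefficients of the EuroSCORE or of any other model discussed here.

```python
import math

# Hypothetical values for illustration only; these are NOT the published
# weights of the EuroSCORE, STS-PROM, Providence, or Ambler models.
HYPOTHETICAL_INTERCEPT = -4.0
HYPOTHETICAL_WEIGHTS = {
    "years_over_60": 0.07,            # assumed coding: years of age above 60
    "female": 0.3,                    # 1 if female, else 0
    "previous_cardiac_surgery": 1.0,  # 1 if present, else 0
}


def predicted_mortality(factors,
                        intercept=HYPOTHETICAL_INTERCEPT,
                        weights=HYPOTHETICAL_WEIGHTS):
    """Return a predicted mortality between 0 and 1 for one patient.

    `factors` maps factor names to values; factors not listed count as 0.
    """
    linear = intercept + sum(weights[name] * value
                             for name, value in factors.items())
    # Logistic transform of the weighted sum.
    return 1.0 / (1.0 + math.exp(-linear))


# Example: a patient with one previous cardiac operation.
risk = predicted_mortality({"previous_cardiac_surgery": 1})
```

Multiplying the result by 100 gives the percentage scale used for the risk groups.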
Definitions were matched between the STS database and the EuroSCORE model whenever possible. In EuroSCORE, extra cardiac arteriopathy was defined as claudication, carotid occlusion or >50% stenosis, and previous or planned intervention on the abdominal aorta, limb arteries, or carotids. For our study, we defined extra cardiac arteriopathy to consist of peripheral vascular disease including claudication, amputation for arterial insufficiency, aortoiliac occlusive disease reconstruction, peripheral vascular bypass surgery, peripheral vascular angioplasty or stent, documented abdominal aneurysm or repair/stent, documented positive noninvasive testing, or history of cerebrovascular disease. In EuroSCORE, neurologic dysfunction was defined as disease severely affecting ambulation or day-to-day functioning. For our study, we defined neurologic dysfunction as history of cerebrovascular disease. Previous cardiac surgery requiring opening of the pericardium included only those having surgery requiring cardiopulmonary bypass. Those patients in cardiogenic shock were defined as having a critical preoperative state. All patients in our cohort had isolated AVR.8
Major adverse cardiac events were defined as a composite end point of operative death, perioperative myocardial infarction (MI), postoperative intra-aortic balloon pump, or postoperative cardiogenic shock. A perioperative MI within 24 hours was diagnosed by the following criteria: creatine kinase (CK)-MB more than five times normal or new Q waves present in two or more contiguous electrocardiogram (EKG) leads. A perioperative MI after 24 hours postoperation was diagnosed by evolutionary ST-segment elevations, development of new Q waves in two or more contiguous leads, new left bundle branch block, or CK-MB elevations more than three times normal.
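The perioperative MI criteria above can be written as a simple rule, shown here as a minimal sketch. The function name, parameter names, and the decision to treat exactly 24 hours as "within 24 hours" are assumptions made for illustration, not part of the study protocol.

```python
def perioperative_mi(hours_after_surgery,
                     ckmb_times_normal=0.0,
                     new_q_wave_leads=0,
                     st_elevation=False,
                     new_lbbb=False):
    """Apply the perioperative MI criteria described in the text.

    ckmb_times_normal: CK-MB as a multiple of the upper limit of normal.
    new_q_wave_leads:  number of contiguous EKG leads with new Q waves.
    st_elevation:      evolutionary ST-segment elevations present.
    new_lbbb:          new left bundle branch block present.
    """
    new_q_waves = new_q_wave_leads >= 2
    # Assumption: exactly 24 hours is treated as "within 24 hours".
    if hours_after_surgery <= 24:
        return ckmb_times_normal > 5 or new_q_waves
    return (st_elevation or new_q_waves or new_lbbb
            or ckmb_times_normal > 3)
```

For example, a CK-MB of four times normal does not meet the criteria at 12 hours but does at 36 hours, because the enzyme threshold is lower after the first day.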
Operative mortality included all deaths occurring during the hospitalization in which the operation was performed, even if after 30 days, and included those deaths occurring after the discharge from the hospital but within 30 days of the procedure unless the cause of death was clearly unrelated to the operation. Death from any cause was defined as any death during follow-up. Death from cardiac causes was defined as death from any cardiac cause (eg, lethal arrhythmia, MI, or circulatory failure).
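The operative mortality definition combines an in-hospital rule with a 30-day rule, which the following sketch makes explicit. The function and argument names are hypothetical, and treating a death on the discharge date as an in-hospital death is an assumption.

```python
from datetime import date


def is_operative_death(surgery, death, discharge=None,
                       clearly_unrelated=False):
    """Classify a death as operative mortality per the definition above.

    surgery, death, discharge: datetime.date values; death is None for
    survivors, and discharge is None if the patient was never discharged.
    """
    if death is None:
        return False
    # Death during the index hospitalization counts even after 30 days.
    # Assumption: a death on the discharge date is treated as in-hospital.
    if discharge is None or death <= discharge:
        return True
    # After discharge: counts if within 30 days of the procedure, unless
    # the cause of death was clearly unrelated to the operation.
    within_30_days = (death - surgery).days <= 30
    return within_30_days and not clearly_unrelated
```

A patient who dies in hospital 45 days after surgery is therefore counted, whereas a patient discharged on day 7 who dies on day 60 is not.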
Risk of Mortality Models
Each of the four risk prediction models was scaled from 0% to 100%, with 100% indicating a 100% predicted chance of mortality. Patients were then stratified into one of three risk groups on the basis of the predicted operative mortality calculated by each of the four risk models: low risk (0%–4.9%), medium risk (5%–9.9%), and high risk (>10%).
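The grouping rule can be sketched as a small function. Because the stated ranges leave the values between 4.9% and 5% and exactly 10% unassigned, the cut points used here (below 5% is low, below 10% is medium, 10% and above is high) are an assumption about how such boundary cases would be handled.

```python
def risk_group(predicted_percent):
    """Assign a predicted operative mortality (0-100%) to a risk group."""
    if not 0 <= predicted_percent <= 100:
        raise ValueError("predicted risk must be between 0 and 100 percent")
    # Assumed boundary handling: [0, 5) low, [5, 10) medium, [10, 100] high.
    if predicted_percent < 5.0:
        return "low"
    if predicted_percent < 10.0:
        return "medium"
    return "high"


# Each patient receives one group per model, so the same patient may fall
# into different groups under different risk models.
groups = [risk_group(p) for p in (2.4, 6.9, 15.8)]
```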
Continuous data are presented as mean ± standard deviation. Categorical data are presented as frequency and percent. Survival estimates among patients discharged alive were generated using the Kaplan-Meier method with statistical comparisons accomplished through the log-rank test. Area under the curve (AUC) statistics were generated to assess the discriminatory power of the four models. A P value <0.05 was considered statistically significant.
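For readers unfamiliar with the Kaplan-Meier method, a minimal product-limit estimator is sketched below. It is a generic textbook implementation, not the statistical software used in this study, and the function name and input layout are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one death.

    times:  follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if right censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Censored patients contribute to the number at risk until their follow-up ends, which is how right censoring at the end of follow-up is accommodated.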
Of the 394 patients in our cohort, the average age was 66.3 ± 12.5 years, 247 (62.7%) were men, 193 (49%) were in New York Heart Association class III or IV heart failure, 34 (8.6%) had a history of MI, four (1%) had triple vessel coronary artery disease, and one (0.3%) had left main coronary artery disease. In addition, seven (1.8%) patients were supported by a preoperative intra-aortic balloon pump and four (1.0%) underwent surgery emergently; 58 (14.7%) patients were older than 80 years (Table 1).
The 30-day mortality for the cohort was 3.3% (n = 13), whereas the 1-year mortality was 5.8% (n = 23). Significant perioperative complications included 65 (16.5%) patients with atrial fibrillation, 29 (7.4%) with prolonged ventilator support, and 18 (4.6%) with renal failure. Fifty-seven (14.5%) patients were readmitted within 30 days of being discharged (Table 2).
On the basis of predicted operative mortality calculated by each of the four risk models, we stratified patients into risk tertiles: low (0%–4.9%), medium (5%–9.9%), and high (>10%). Table 3 demonstrates actual operative and 1-year mortality on the basis of each model's tertiles. The mean predicted risk of death for the low-, medium-, and high-risk groups was 2.4% ± 1.1%, 6.9% ± 1.4%, and 15.8% ± 7.6% (P < 0.001) with respect to the STS-PROM model. Observed operative mortality for all three risk groups fell within predicted risk for both STS-PROM and Ambler risk models. However, EuroSCORE overpredicted operative mortality for the medium-risk group (1.2% actual), whereas the Providence risk model underpredicted operative mortality for the medium-risk group (22.2% actual).
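The comparison of observed mortality with each group's predicted range can be expressed as a short check, sketched here. The band boundaries follow the risk groups defined in the text, with exactly 10% assigned to the high-risk band as an assumption; the function names are hypothetical.

```python
# Predicted-risk bands in percent, treated as [lower, upper).
# Assumption: exactly 10% belongs to the high-risk band.
BANDS = {
    "low": (0.0, 5.0),
    "medium": (5.0, 10.0),
    "high": (10.0, float("inf")),
}


def observed_mortality_percent(deaths, patients):
    """Observed mortality in a group, as a percentage."""
    if patients <= 0:
        raise ValueError("group must contain at least one patient")
    return 100.0 * deaths / patients


def calibration_flag(group, observed_percent):
    """Compare observed mortality with the group's predicted band."""
    lower, upper = BANDS[group]
    if observed_percent < lower:
        return "overpredicted"   # the model predicted more deaths than seen
    if observed_percent >= upper:
        return "underpredicted"  # the model predicted fewer deaths than seen
    return "within range"
```

Applied to the medium-risk groups reported above, an observed mortality of 1.2% is flagged as overpredicted and 22.2% as underpredicted.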
Furthermore, using these same tertiles, STS-PROM and Ambler continued to demonstrate good discriminatory ability for 1-year survival. The EuroSCORE's low- (2.1% actual) and medium-risk (3.5% actual) curves are difficult to differentiate at 1 year, while the Providence's medium- (22.2% actual) and high-risk (30.0% actual) groups are difficult to differentiate as well. The Kaplan-Meier analyses for 1-year survival (Fig. 1) found significant differences between the risk tertiles for all four risk models (P values <0.001).
Receiver operating characteristic curve analysis demonstrated that the best performing model for predicting 1-year survival was the Ambler risk score, with an AUC of 0.799, followed by the STS-PROM, with an AUC of 0.782. The Providence risk score had an AUC of 0.775, whereas the EuroSCORE had an AUC of 0.706.
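The AUC can be read as the probability that a randomly chosen patient who died had a higher predicted risk than a randomly chosen survivor. The sketch below computes it directly from that pairwise definition; it is a generic illustration with hypothetical names, not the software used for the analysis reported here.

```python
def auc(scores, died):
    """Area under the ROC curve by pairwise comparison.

    scores: predicted risk for each patient.
    died:   truthy if the patient died within the period of interest.
    """
    dead = [s for s, d in zip(scores, died) if d]
    alive = [s for s, d in zip(scores, died) if not d]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")
    # A pair is concordant when the patient who died had the higher score;
    # tied scores count as half.
    concordant = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in dead
        for y in alive
    )
    return concordant / (len(dead) * len(alive))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of patients who died from those who survived.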
After Toumpoulis et al8 asserted in 2005 that EuroSCORE can predict both in-hospital and long-term mortality after heart valve surgery, EuroSCORE became the most widely used risk calculator for surgical AVR. However, the EuroSCORE's accuracy has recently been scrutinized in patients undergoing valve surgery. Grossi et al3 showed that the logistic EuroSCORE overestimates operative mortality in patients undergoing AVR with a EuroSCORE >7. In older patients, the EuroSCORE has been shown to be inferior to the STS-PROM in prediction of perioperative mortality.5 Numerous other studies also have reported that the EuroSCORE overestimates mortality of patients undergoing conventional AVR.4–6 Several reports have shown that in patients with isolated AVR surgery, the STS-PROM has been superior to the EuroSCORE, and the EuroSCORE consistently overestimates risk, especially in those patients who are deemed high risk.6,9 This is an important point because studies of many alternative procedures for AVR, including stentless aortic valves and transapically and percutaneously inserted valves, tend to select patients who are deemed at higher risk on the basis of the EuroSCORE. No previous report has compared the STS-PROM alongside the Ambler, Providence, and EuroSCORE models for accuracy in estimating the predicted risk of AVR.
Our study shows that the Ambler and STS-PROM risk calculators are superior to the Providence and EuroSCORE risk calculators for surgical AVR for predicting both operative mortality and 1-year mortality. In our tertile analysis, we found that Ambler and STS-PROM most accurately predicted 1-year mortality. On the other hand, the EuroSCORE overpredicted mortality in the medium-risk group, and Providence underpredicted mortality in the medium-risk group.
When we used the same tertile analysis for 1-year mortality, the Ambler and STS-PROM both continued to have good discriminatory ability, whereas the EuroSCORE's low- and medium-risk groups were hard to differentiate. Providence's medium- and high-risk groups were difficult to differentiate for 1-year mortality as well. In our receiver operating characteristic curve analysis for 1-year mortality, the Ambler and STS-PROM also proved to have superior AUCs to Providence and the EuroSCORE.
There are multiple reasons why Ambler and STS-PROM have superior predictive value, particularly when compared with the EuroSCORE. The EuroSCORE was originally formulated to predict operative mortality in patients undergoing coronary artery bypass grafting procedures and was only later applied to other cardiac procedures such as valve replacements. Among the cohort for the EuroSCORE, only 17% of the patients underwent AVR. Therefore, unlike Ambler, it was not originally formulated specifically for valvular surgical risk assessment. Although Ambler considers fewer factors in its risk assessment (13 compared with EuroSCORE's 18), these factors were originally formulated for valvular surgery and may be more pertinent. Some factors considered in Ambler but not in the EuroSCORE include body mass index, diabetes, dialysis-dependent renal failure, ejection fraction, and previous mitral/aortic valve surgery.13
Meanwhile, the STS-PROM's superiority to EuroSCORE is likely because it was formulated on the basis of a much larger cohort than the EuroSCORE—332,604 patients for STS-PROM1 compared with 13,302 patients for EuroSCORE. In addition, the STS-PROM is based on a more modern cohort. AVR operative techniques have evolved in the past decade, and the EuroSCORE's original cohort was based on mortality data from 1995. Finally, STS-PROM considers more than twice as many variables as the EuroSCORE, 41 to the EuroSCORE's 18. Some variables considered in STS-PROM but not in the EuroSCORE include previous valve surgery, diabetes, and dialysis-dependent renal failure.13
Why is it important to accurately predict risk for patients who need AVR? At present, these patients have two options: surgical AVR, which has long been the gold standard, or percutaneous AVR, a new and promising catheter-based procedure. Currently, there are a number of trials involving percutaneous AVR versus surgical AVR. Patients are often selected for trials of catheter-based and transapically implanted valves because they are deemed “high risk”. Therefore, it is important to have a well-validated clinical risk predictor model to stratify these patients; otherwise, patients will be improperly enrolled in these trials and will be unnecessarily exposed to the risks of any clinical trial seeking long-term data. Even among patients undergoing newer and more experimental AVR techniques, EuroSCORE has been shown to overestimate the risk of the procedure.4,5,10
The major limitation of this study is that it is not a prospective, randomized trial. In addition, it is a single-center study, so the results may not be generalizable to other institutions. Prognostic models need to be reevaluated and recalibrated with respect to location and new patient populations before being able to provide empiric information across many different centers and clinical settings.14 There also may have been selection bias. The highest risk patients may not have been offered surgery and, therefore, would not be included in this analysis. Finally, the cause of death for many of these patients is unknown. There are also factors that STS-PROM and Ambler do not take into consideration, such as frailty, that Sündermann et al7 address. With the aging population, a comprehensive scoring system to assess a patient's biologic age, which takes into account different variables such as nutrition level and functional status, is important. Despite these limitations, it is important to note that our findings are in close agreement with the most recent articles about assessing AVR surgical risk.
In conclusion, our study shows that although all four of these risk calculators have predictive value, the Ambler and STS-PROM risk calculators should preferentially be used over the EuroSCORE and Providence risk calculators when trying to predict both operative mortality and 1-year mortality after AVR surgery. Proper risk stratification is particularly necessary at this time, as many patients are enrolled in randomized clinical trials for percutaneous AVR on the basis of these risk scores.
1. Bonow RO, Carabello BA, Chatterjee K, et al. ACC/AHA 2006 practice guidelines for the management of patients with valvular heart disease: executive summary. J Am Coll Cardiol.
2. Leontyev S, Walther T, Borger MA, et al. Aortic valve replacement in octogenarians: utility of risk stratification with EuroSCORE. Ann Thorac Surg.
3. Grossi EA, Schwartz CF, Yu PJ, et al. High-risk aortic valve replacement: are the outcomes as bad as predicted? Ann Thorac Surg.
4. Osswald BR, Gegouskov V, Badowski-Zyla D, et al. Overestimation of aortic valve replacement risk by EuroSCORE: implications for percutaneous valve replacement. Eur Heart J.
5. Schenk S, Fritzsche D, Atoui R, et al. EuroSCORE-predicted mortality and surgical judgment for interventional aortic valve replacement. J Heart Valve Dis.
6. Dewey TM, Brown D, Ryan WH, et al. Reliability of risk algorithms in predicting early and late operative outcomes in high-risk patients undergoing aortic valve replacement. J Thorac Cardiovasc Surg.
7. Sündermann S, Dademasch A, Praetorius J, et al. Comprehensive assessment of frailty for elderly high-risk patients undergoing cardiac surgery. Eur J Cardiothorac Surg.
8. Toumpoulis IK, Anagnostopoulos CE, Toumpoulis SK, et al. EuroSCORE predicts long-term mortality after heart valve surgery. Ann Thorac Surg.
9. Wendt D, Osswald BR, Kayser K, et al. Society of Thoracic Surgeons Score is superior to the EuroSCORE determining mortality in high risk patients undergoing isolated aortic valve replacement. Ann Thorac Surg.
10. Piazza N, Wenaweser P, van Gameren M, et al. Relationship between the logistic EuroSCORE and the Society of Thoracic Surgeons Predicted Risk of Mortality score in patients implanted with the CoreValve ReValving System—A Bern-Rotterdam Study. Am Heart J.
11. Jin R, Grunkemeier GL, Starr A; Providence Health System Cardiovascular Study Group. Validation and refinement of mortality risk models for heart valve surgery. Ann Thorac Surg.
12. Ambler G, Omar RZ, Royston P, et al. Generic, simple risk stratification model for heart valve surgery. Circulation.
13. Brown ML, Schaff HV, Sarano ME, et al. Is the European system for cardiac operative risk evaluation model valid for estimating the operative risk of patients considered for percutaneous aortic valve replacement? J Thorac Cardiovasc Surg.
14. Altman DG, Royston P. What do we mean by validating a prognostic model? Stat Med.
The objective of this paper was to evaluate the performance of 4 different risk algorithms, the Society of Thoracic Surgeons Predicted Risk of Mortality (STS-PROM), EuroSCORE, Ambler, and Providence models, in accurately identifying patients at high risk for aortic valve replacement. It was found that the STS-PROM and Ambler scores were superior to the Providence and EuroSCORE for predicting both operative and one-year mortality. In the analysis, the Ambler and STS-PROM most accurately predicted one-year mortality. The EuroSCORE overpredicted mortality in the medium-risk group, and the Providence risk model underpredicted mortality in this group.
While numerous other studies have compared the EuroSCORE to the STS-PROM, this study looked at 4 models. This study has a number of limitations, including the fact that it is a single center, retrospective experience and had relatively few high-risk patients. However, the authors performed a comprehensive analysis and have presented useful information for cardiac surgeons. The ability to accurately predict operative mortality is particularly necessary in current practice as patients are being enrolled in pivotal randomized clinical trials comparing open to transcatheter aortic valve replacement based on these risk scores. This paper adds to a growing body of evidence examining these risk models in patients undergoing aortic valve replacement.