- Question: For what proportion of surgical patients would current guidelines recommend preoperative stress testing, and what direct medical spending would result?
- Findings: Two currently recommended risk prediction tools (Revised Cardiac Risk Index and Myocardial Infarction or Cardiac Arrest) would lead to testing of large, largely nonoverlapping populations, with poor concordance across the 1% risk threshold currently recommended for discriminating between “low” and “elevated” risk patients.
- Meaning: Preoperative stress testing among patients above the 1% risk threshold is likely a much larger source of medical expenditures than testing among low-risk patients and is of uncertain value.
Approximately 1.3% of patients die within 30 days of major noncardiac surgery.1 Cardiac complications cause considerable morbidity and account for at least one-third of perioperative deaths, as well as prolonging the duration of hospitalization and increasing the cost of health care.2–5 Current guidelines recommend that patients have preoperative assessment of cardiac risk and functional status, and that patients at “elevated” cardiac risk with poor or unknown functional status be referred for preoperative stress testing if such testing would impact management.6
Current guidelines include 3 risk prediction tools for use in risk classification. Depending on the agreement of those tools across the 1% risk threshold used to discriminate between “low” and “elevated” risk patients, they may lead clinicians to refer different patients, and different fractions of the population, for preoperative stress testing. Rates and costs of preoperative stress testing may therefore vary based on the risk prediction tool used and the reliability of the functional status assessment.
Furthermore, the benefit of preoperative stress testing remains unclear. Preoperative stress testing of low-risk patients is clearly a low-value service.7 But stress testing may not offer compelling value for patients at elevated risk either. Perioperative cardiac risk is clearly correlated with probability of obstructive coronary artery disease, but patients are not specifically selected based on pretest probability of coronary artery disease. For patients at low pretest probability, the test characteristics of cardiac stress testing would lead to more false-positive than true-positive results.8,9 Such limited diagnostic accuracy may not be adequate to justify further testing or intervention. Meanwhile, therapeutic interventions that would be undertaken based on preoperative stress testing have not demonstrated benefit in previous trials.10,11 Stress testing does provide prognostic information, but biochemical testing may provide equivalent prognostic information at lower expense.12–14 Until the benefit of stress testing is proven, the value remains uncertain, regardless of predicted risk. A further limitation of preoperative cardiac testing is the implicit assumption that higher risk patients will be managed differently, and that such management will improve outcome. There is little evidence to support either claim.
Our goal was therefore to estimate the expected rates of preoperative stress testing and resultant costs if physicians in the United States were to follow current guidelines. We also compared the rates and costs of preoperative stress testing consequent to 2 risk prediction methods included in current guidelines.
Starting with a cohort of patients representative of the population having surgery in the United States, we predicted cardiac risk using 2 risk prediction tools that are acceptable in current guidelines, and dichotomized risk at the 1% threshold currently recommended. We then imputed functional status and estimated the proportion of surgical patients whose predicted risk exceeds 1% and whose functional status is poor or unknown. To estimate costs, we assumed that modalities of stress testing, and the costs of each, were identical between the preoperative population and the broader US population undergoing stress testing. Based on the rates and direct costs within our cohort stratified by patient age and surgical location, we extrapolated to the US population. Because this study uses previously collected, deidentified data, the requirement for written informed consent was waived by the Cleveland Clinic’s Institutional Review Board. All analyses were performed in Stata (version 14; StataCorp LLC, College Station, TX).
The American College of Surgeons’ National Surgical Quality Improvement Program (ACS-NSQIP) samples patients undergoing surgery at participating hospitals and collects standardized clinical data on preoperative risk factors and postoperative complications.15 NSQIP is the largest and most complete prospective clinical database of surgical patients.15–17 Limited data are available for public use. Hospitals are principally in the United States, with a small number in Canada, the United Kingdom, and Australia. We acquired public use data from the 2009 NSQIP cohort, which included over 336,000 surgical cases from 237 hospitals. We included the entire cohort in our sample because it is not possible to identify which hospitals are located outside the United States. The denominator is cases, and patients may be represented more than once.
Current guidelines designate 3 acceptable risk prediction tools. Two of these, the Revised Cardiac Risk Index (RCRI) and the Myocardial Infarction or Cardiac Arrest (MICA) tool, can be applied to large data sets, while a third tool, the ACS-NSQIP web calculator, can only be applied to 1 patient at a time.16–19 For practical purposes, the ACS-NSQIP web calculator was not considered in this analysis.
Both RCRI and MICA rely on anatomical categorizations of surgical site.17,18 We therefore categorized each Current Procedural Terminology (CPT) code included in our cohort according to each tool’s categorization schemes; our categorization is included in Supplemental Digital Content, Appendix, http://links.lww.com/AA/C553. Because they are not included in the relevant American College of Cardiology/American Heart Association guideline, cardiac surgeries were excluded from further analysis. A small number of surgeries (0.27%) were excluded from further analysis due to CPT codes that were either uncategorizable or experimental. After these exclusions, 332,677 cases remained in our cohort.
We then predicted patient-specific risk using RCRI and MICA. All predictor variables required for MICA, other than the above categorization, were available in our cohort (MICA was derived from previous NSQIP public use data). MICA predicts a probability of complications, which we dichotomized at the 1% threshold. In contrast, RCRI is ordinal. RCRI scores of ≥2 imply perioperative cardiac risk >1%.18 Accordingly, patients with an RCRI of ≥2 were considered to be at “elevated” cardiac risk by that scoring system.
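The dichotomization described above, combined with the guideline's functional status criterion, can be sketched in a few lines. This is a minimal illustration, not the Stata code used in the study: the field names, the helper names, and the treatment of a MICA probability exactly equal to 1% as "elevated" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

RISK_THRESHOLD = 0.01   # 1% threshold separating "low" from "elevated" risk
RCRI_ELEVATED_MIN = 2   # RCRI scores >= 2 are taken to imply risk > 1%


@dataclass
class Case:
    # Field names are hypothetical; they do not come from the NSQIP data dictionary.
    rcri_score: int
    mica_probability: float


def elevated_by_rcri(case: Case) -> bool:
    """RCRI is ordinal, so it is dichotomized on the score itself."""
    return case.rcri_score >= RCRI_ELEVATED_MIN


def elevated_by_mica(case: Case) -> bool:
    """MICA yields a probability; the boundary handling (>=) is an assumption."""
    return case.mica_probability >= RISK_THRESHOLD


def classify(case: Case) -> tuple:
    """Return (elevated by RCRI, elevated by MICA) for one case."""
    return elevated_by_rcri(case), elevated_by_mica(case)


def meets_testing_criteria(elevated: bool, can_achieve_4_mets: Optional[bool]) -> bool:
    """Elevated risk plus poor (False) or unknown (None) functional status."""
    return elevated and can_achieve_4_mets is not True
```

A case with an RCRI of 2 and a MICA probability of 0.6% would be "elevated" under one tool and "low" under the other, which is the kind of discordance the agreement analysis quantifies.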
The NSQIP data set does not include patients’ metabolic equivalents (METs), which are used to estimate functional status in current guidelines.6 However, NSQIP shares a number of variables with the National Health and Nutrition Examination Survey (NHANES), which measured functional status in METs in a subset of patients.20 We created a logistic regression in NHANES to predict, based on variables common to both NSQIP and NHANES, whether a patient could achieve ≥4 METs. (Those variables included age, weight, diagnoses of congestive heart failure [CHF], coronary artery disease, prior stroke, and diabetes mellitus with or without the use of insulin.) We applied that regression to the NSQIP cohort and dichotomized, thereby imputing whether a patient in our surgical cohort could perform ≥4 METs. Further detail regarding this imputation step is available in Supplemental Digital Content, Appendix, http://links.lww.com/AA/C553.
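The mechanics of applying a fitted logistic regression to a second cohort and dichotomizing the result can be sketched as follows. The actual coefficients and cutoff are reported in the supplement and are not reproduced here; the function names, the 0.5 cutoff, and every numeric value in the usage note are placeholders chosen only to make the example runnable.

```python
import math


def predicted_probability(intercept, coefficients, features):
    """Logistic model: P(>= 4 METs) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


def imputed_can_achieve_4_mets(intercept, coefficients, features, cutoff=0.5):
    """Dichotomize the predicted probability; the 0.5 cutoff is an assumption."""
    return predicted_probability(intercept, coefficients, features) >= cutoff


# Placeholder coefficients for illustration only (NOT the fitted NHANES model).
example_intercept = 4.0
example_coefficients = {"age_years": -0.05, "chf": -1.0}
```

With these placeholder values, `imputed_can_achieve_4_mets(example_intercept, example_coefficients, {"age_years": 40, "chf": 1})` evaluates the model at a linear predictor of 1, and the same patient at age 80 at a linear predictor of -1, so the two fall on opposite sides of the cutoff.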
Observed and Expected Agreement
To estimate agreement between RCRI and MICA beyond that expected by chance, we calculated kappa statistics both for the predicted probability alone and after inclusion of functional status.21 To estimate the precision of our kappa estimates, we calculated 95% confidence intervals (CIs) using a bootstrapping approach. We also calculated a Spearman correlation coefficient for the risk predicted by the 2 tools.
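For readers unfamiliar with the statistic, a kappa for 2 binary classifications and a bootstrap CI can be sketched as below. The text states only that a bootstrapping approach was used; the percentile method, the number of resamples, and the function names here are assumptions, and the analysis itself was run in Stata rather than Python.

```python
import random


def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of 0/1 classifications."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    share_a = sum(a) / n
    share_b = sum(b) / n
    # Agreement expected by chance if the two classifications were independent.
    expected = share_a * share_b + (1 - share_a) * (1 - share_b)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (observed - expected) / (1 - expected)


def bootstrap_kappa_ci(a, b, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling cases (pairs) with replacement."""
    rng = random.Random(seed)
    n = len(a)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(cohens_kappa([a[i] for i in idx], [b[i] for i in idx]))
        except ValueError:
            continue  # skip degenerate resamples where kappa is undefined
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    m = len(estimates)
    lower = estimates[int((alpha / 2) * m)]
    upper = estimates[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper
```

Resampling whole cases, rather than each tool's classifications separately, preserves the pairing between the 2 tools that kappa is meant to measure.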
Cost Estimates and Extrapolation
We used previously published literature to estimate the proportions of cardiac stress testing ascribed to different modalities (exercise electrocardiogram, stress echocardiography, nuclear single-photon emission computed tomography, and computed tomography angiography) and the direct costs of each.22,23 We assumed that the proportions of preoperative stress tests ordered, and the cost of each, mirror the broader US population undergoing stress testing, and that all other infrequent modalities (such as positron emission tomography scanning or cardiac magnetic resonance imaging) had costs equivalent to computed tomography angiography. We did not consider any downstream costs that may result from stress testing, such as coronary angiography, nor did we consider indirect or intangible costs of testing such as loss of time and productivity or emotional costs. We inflated costs to 2017 dollars.
To account for potential demographic differences between the population sampled by NSQIP and the broader US surgical population, we extrapolated to nationwide estimates in 2 stages. First, we extrapolated to the 29 states included in the State Inpatient Databases, stratified by patient age group and location of surgery (inpatient versus ambulatory).24 We then performed a simple extrapolation from the population of those 29 states to the entire US population. Finally, we estimated differences in spending when using different risk prediction systems using a bootstrapping approach.
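The 2-stage extrapolation can be illustrated with a short sketch. All numbers below are placeholders invented for the example (they are not study inputs or results), the stratum labels and function names are hypothetical, and scaling by the ratio of population sizes is an assumption about what "simple extrapolation" means.

```python
def expected_tests(referral_rates, surgery_counts):
    """Stage 1: sum stratum-specific referral rate x surgery count over strata."""
    return sum(
        referral_rates[stratum] * count for stratum, count in surgery_counts.items()
    )


def scale_by_population(total, sampled_population, national_population):
    """Stage 2: scale a total from the sampled states to the whole country."""
    return total * national_population / sampled_population


# Placeholder strata: (age group, location of surgery).
referral_rates = {
    ("18-64", "ambulatory"): 0.02,
    ("18-64", "inpatient"): 0.05,
    ("65+", "ambulatory"): 0.06,
    ("65+", "inpatient"): 0.12,
}
surgery_counts = {
    ("18-64", "ambulatory"): 5000,
    ("18-64", "inpatient"): 2000,
    ("65+", "ambulatory"): 1500,
    ("65+", "inpatient"): 1000,
}

state_tests = expected_tests(referral_rates, surgery_counts)
national_tests = scale_by_population(state_tests, 150, 300)  # placeholder populations
national_cost = national_tests * 500.0  # placeholder mean cost per test
```

Stratifying before scaling matters because referral rates differ by age and setting; applying a single cohort-wide rate would carry over the NSQIP case mix rather than the population's.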
Our model estimated 18.6 million surgeries among adults annually, of which 52% are ambulatory. The mean patient age was approximately 56, and 58% of surgical patients were female. The data set included 2301 different CPT codes. The most common surgical procedures were laparoscopic appendectomy, laparoscopic cholecystectomy, hernia repair, gastric bypass, and thromboendarterectomy. The mean surgical patient had a predicted probability of cardiac complications of 0.5% (MICA) and an RCRI of 0.76. Our imputation predicted that 71% of patients could achieve ≥4 METs. Please see Supplemental Digital Content, Appendix, http://links.lww.com/AA/C553, for intermediate results detailing our imputation of functional status.
If the preoperative risk assessment was performed using RCRI, approximately 12.6% of surgical patients would be at “elevated” risk, while risk assessment using MICA would classify approximately 15.1% of surgical patients at “elevated” risk. After assessment of functional status, up to 5.0% of patients assessed with RCRI could be referred for guideline-concordant preoperative stress testing, while up to 6.1% of patients assessed using MICA could be referred for guideline-concordant preoperative stress testing. These results are shown in Figure 1.
Spearman correlation between RCRI and MICA was 0.39 (P < .001). Observed agreement between RCRI and MICA across the 1% risk threshold was 84.3%, compared to expected chance agreement of 76.1% (κ = 0.34; 95% CI, 0.34–0.35). After inclusion of functional status, observed agreement was 94.0%, compared to expected chance agreement of 89.4% (κ = 0.43; 95% CI, 0.42–0.44). Observed agreement between the 2 risk prediction systems is shown in Tables 1 and 2.
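As a check on the reported values, kappa can be recomputed from the observed agreement p_o and the expected chance agreement p_e using the standard definition; the symbols p_o and p_e are introduced here for illustration.

```latex
\begin{align*}
\kappa &= \frac{p_o - p_e}{1 - p_e} \\
\kappa_{\text{risk only}} &= \frac{0.843 - 0.761}{1 - 0.761} = \frac{0.082}{0.239} \approx 0.34 \\
\kappa_{\text{risk and functional status}} &= \frac{0.940 - 0.894}{1 - 0.894} = \frac{0.046}{0.106} \approx 0.43
\end{align*}
```

The high raw agreement after inclusion of functional status (94.0%) largely reflects the fact that most patients are not referred under either tool, which is why expected chance agreement is also high (89.4%).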
The 2 risk prediction systems lead to testing of largely nonoverlapping populations (Table 2). Among patients at elevated risk using RCRI, a majority would be classified as low risk using MICA and vice versa. Meanwhile, over 21% of patients have a predicted risk of ≥1% using at least 1 tool.
If the preoperative risk assessment was performed using RCRI, annual direct costs of preoperative stress testing would total $641 million (95% CI, $632–$651 million) if functional status is as carefully assessed as it is in NHANES, and $1.65 billion (95% CI, $1.64–$1.66 billion) without functional status assessment. Preoperative risk assessments using MICA would result in annual direct costs of $717 million (95% CI, $709–$726 million) with careful functional status assessment or $1.80 billion (95% CI, $1.79–$1.81 billion) without functional status assessment.
Compared with RCRI, risk assessments with MICA would result in an additional $76 million in annual spending (95% CI, $63–$89 million) when functional status is assessed as carefully as in NHANES, or an additional $147 million (95% CI, $129–$166 million) without functional status assessment (means and CIs here use a bootstrapping approach). Rates and estimated costs are summarized in Table 3. Annual expenditures using each scoring system are also shown in Figure 2, with previous estimates of Medicare spending on low-risk patients for comparison.
We estimate that if physicians in the United States followed guideline recommendations for preoperative stress testing, spending on such testing would be considerable. Moreover, the most popular tools used to identify high-risk patients have poor concordance, with 21.7% of patients identified as high risk by either tool, but only 6.0% of patients identified as high risk by both tools. The choice of tool carries medical and financial ramifications because choosing 1 prediction tool over the other would lead to different populations being tested and large differences in spending.
Physicians following the current guidelines could refer an estimated 5.0%–6.1% of patients for preoperative stress testing. Of course, actual rates of testing could be higher or lower than these estimates. For example, patients whose management would not change due to cardiac stress testing are not recommended for preoperative stress testing, even if they were to meet the risk and functional status selection criteria used in our analysis.6
Our rate estimates are broadly in line with previous estimated rates of stress testing. An analysis of Medicare data found an overall rate of preoperative stress testing of 6.4% in 2007, with a marked increase over the previous decade.26 Data from the National Ambulatory Medical Care Survey and National Hospital Ambulatory Medical Care Survey also support increasing rates of preoperative stress testing over time (in spite of guidelines weakening recommendations) while suggesting a lower overall rate of preoperative stress testing (2.0%) compared with analyses of Medicare data.27 Meanwhile, an Ontario-based cohort of patients undergoing intermediate- to high-risk surgeries used administrative data to estimate that 8.9% of patients underwent preoperative stress testing.5 Whether the differences between the rates in those cohorts and our predicted guideline-concordant rates are due to differences in patient cohorts, differences in statistical methods, or differences between guidelines and clinical practice is unclear.
No matter the underlying summary rates of testing, our results can offer insight. If the overall rate of preoperative stress testing is 6.4%, and the rate of preoperative stress testing in patients with no indications is 3.75%, rates of testing among patients with indications must be >6.4% (the population average).26 If between 5% and 6% of patients meet the risk and functional status cutoffs for preoperative stress testing (as we estimate), rates of testing among patients with indications would be (of algebraic necessity) between 48% and 57%. Whether stress testing changes management in such a large proportion of operative patients remains to be demonstrated. On the other hand, if the overall rate of preoperative stress testing is 2.0%, and the rate of stress testing in patients with no indications is 3.75%, rates of testing among patients with indications must be <2.0%, suggesting preferential testing of low-risk patients.27
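The "algebraic necessity" above follows from writing the overall testing rate as a weighted average of the rates in patients with and without indications and solving for the former. A minimal sketch, with function and parameter names chosen for illustration:

```python
def rate_among_indicated(overall_rate, rate_without_indication, share_indicated):
    """Solve overall = share * r + (1 - share) * rate_without_indication for r."""
    if not 0 < share_indicated <= 1:
        raise ValueError("share_indicated must be in (0, 1]")
    return (
        overall_rate - (1 - share_indicated) * rate_without_indication
    ) / share_indicated


# Overall rate 6.4%, rate without indications 3.75%, 5%-6% of patients indicated.
upper_bound = rate_among_indicated(0.064, 0.0375, 0.05)
lower_bound = rate_among_indicated(0.064, 0.0375, 0.06)
```

With the figures quoted in the text, the 2 results round to 57% and 48%, matching the range given above; a smaller share of indicated patients implies a higher testing rate within that group.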
Our data also suggest that annual spending on guideline-concordant preoperative stress tests could exceed $1.7 billion—an order of magnitude larger than estimates of low-value preoperative stress testing.7,25,26 Preoperative stress testing of low-risk patients is already considered a low-value service.7 But as described above, preoperative stress testing may not offer compelling value for patients at elevated risk either. The limited diagnostic accuracy of stress testing in this population may not be adequate to justify further interventions.8,9 And while stress testing offers prognostic information, it may not represent compelling value for purely prognostic use.12 Without convincing evidence that preoperative stress testing improves postoperative outcomes, all spending on preoperative stress testing is of uncertain value.
Our estimates of spending are likely underestimates. First, we have not included indirect or intangible costs. Second, we have not considered rates and costs of downstream testing and intervention, such as coronary angiography or coronary bypass surgery. As described above, testing in this population would likely lead to substantial numbers of false-positive tests and unhelpful downstream testing. And third, we conservatively assumed that modalities of stress testing mirror the rates among the nonsurgical population. If preoperative patients are preferentially referred for more expensive modalities, spending would exceed our estimates. But even if actual spending is in line with our estimates, this work suggests that guideline-concordant spending represents a greater opportunity to improve value than testing of low-risk patients. For comparison, the differences in nationwide spending when the preoperative risk assessment is performed using RCRI or MICA ($76 million in our conservative estimate and $146 million in our more inclusive estimate) are in line with previous estimates of the total Medicare spending on low-risk patients ($80–$181 million).25
Our analysis also suggests that there is poor agreement beyond that expected by chance between 2 tools used to estimate preoperative cardiac risk, which would lead to different populations being referred for stress testing. As shown in Table 2, around half of patients (52%) referred for stress testing using RCRI would also be referred using MICA, while around 42% of patients referred for stress testing using MICA would be referred using RCRI.
Many factors could contribute to discordant predictions. First, RCRI and MICA included different end points in their respective composite outcomes. Both tools included myocardial infarction and cardiac arrest, but RCRI included pulmonary edema and complete heart block in its definition of major cardiac complications, neither of which is included in MICA.17,18 Second, the definitions of myocardial infarction differ between the 2 tools, reflecting the predominant cardiac biomarkers used when each tool was derived. Third, the calibration and discrimination of the 2 tools may differ across the 1% threshold. If risk of a composite end point is to be used in preoperative decision making, agreement regarding what constitutes a major cardiac event is required, and attention to the tools’ calibration and discrimination across the guideline-recommended threshold is warranted.
Equal attention to the reliability of functional status assessment is warranted. Our estimates assume that whether a patient can perform ≥4 METs is as carefully assessed as it is in NHANES. If the methods used to assess functional status in clinical practice are less reliable than those used in NHANES, substantial differences in testing rates could result, as shown here.
Current guidelines enable considerable spending on preoperative stress testing, the value of which remains unclear. Guideline-recommended spending would differ substantially depending on the risk prediction tool used and the reliability of the functional status assessment. Total spending on preoperative stress testing is likely an order of magnitude greater than Medicare spending on testing of low-risk patients alone, even before considering indirect or intangible costs. Understanding the expected benefit of preoperative stress testing is therefore imperative. Guidelines also recommend use of competing risk assessment tools (RCRI and MICA) that have poor agreement beyond chance across a 1% threshold, with important implications for patient management and national expenditures. In particular, unreliable or limited functional status assessment could lead to large differences in rates and costs of preoperative stress testing. Future work should focus on estimating the reliability of the preoperative functional status assessment and understanding the value of preoperative cardiac stress testing given preoperative cardiac risk. In the interim, guideline committees should examine risk thresholds used for further testing and offer clearer guidance regarding which tool should be used for routine risk assessment.
Name: Matthew A. Pappas, MD, MPH.
Contribution: This author helped conceive the study, collect the data, perform the analysis, and write the initial draft of the manuscript.
Name: Daniel I. Sessler, MD.
Contribution: This author helped improve the analysis and revise the manuscript.
Name: Michael B. Rothberg, MD, MPH.
Contribution: This author helped improve the analysis and revise the manuscript.
This manuscript was handled by: Tong J. Gan, MD.
1. Semel ME, Lipsitz SR, Funk LM, Bader AM, Weiser TG, Gawande AA. Rates and patterns of death after surgery in the United States, 1996 and 2006. Surgery. 2012;151:171–182.
2. Devereaux PJ, Xavier D, Pogue J, et al. POISE (PeriOperative ISchemic Evaluation) Investigators. Characteristics and short-term prognosis of perioperative myocardial infarction in patients undergoing noncardiac surgery: a cohort study. Ann Intern Med. 2011;154:523–528.
3. Udeh BL, Dalton JE, Hata JS, Udeh CI, Sessler DI. Economic trends from 2003 to 2010 for perioperative myocardial infarction: a retrospective, cohort study. Anesthesiology. 2014;121:36–45.
4. van Waes JA, Nathoe HM, de Graaff JC, et al. Cardiac Health After Surgery (CHASE) Investigators. Myocardial injury after noncardiac surgery and its association with short-term mortality. Circulation. 2013;127:2264–2271.
5. Wijeysundera DN, Beattie WS, Austin PC, Hux JE, Laupacis A. Non-invasive cardiac stress testing before elective major non-cardiac surgery: population based cohort study. BMJ. 2010;340:b5526.
6. Fleisher LA, Fleischmann KE, Auerbach AD, et al. American College of Cardiology; American Heart Association. 2014 ACC/AHA guideline on perioperative cardiovascular evaluation and management of patients undergoing noncardiac surgery: a report of the American College of Cardiology/American Heart Association Task Force on practice guidelines. J Am Coll Cardiol. 2014;64:e77–e137.
7. Reed SJ, Pearson S. Choosing Wisely® Recommendation Analysis: Prioritizing Opportunities for Reducing Inappropriate Care. Available at: http://www.choosingwisely.org/wp-content/uploads/2015/05/ICER_Preoperative-Stress-Testing.pdf. Accessed December 7, 2016.
8. Froelicher VF, Lehmann KG, Thomas R. The electrocardiographic exercise test in a population with reduced workup bias: diagnostic performance, computerized interpretation, and multivariable prediction. Veterans Affairs Cooperative Study in Health Services #016 (QUEXTA) Study Group. Quantitative Exercise Testing and Angiography. Ann Intern Med. 1998;128:965–974.
9. Fihn SD, Blankenship JC, Alexander KP. 2014 ACC/AHA/AATS/PCNA/SCAI/STS focused update of the guideline for the diagnosis and management of patients with stable ischemic heart disease: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines, and the American Association for Thoracic Surgery, Preventive Cardiovascular Nurses Association, Society for Cardiovascular Angiography and Interventions, and Society of Thoracic Surgeons. J Am Coll Cardiol. 2014;64:1929–1949.
10. Wijeysundera DN, Duncan D, Nkonde-Price C. Perioperative beta blockade in noncardiac surgery: a systematic review for the 2014 ACC/AHA guideline on perioperative cardiovascular evaluation and management of patients undergoing noncardiac surgery: a report of the American College of Cardiology/American Heart Association Task Force on practice guidelines. J Am Coll Cardiol. 2014;64:2406–2425.
11. McFalls EO, Ward HB, Moritz TE. Coronary-artery revascularization before elective major vascular surgery. N Engl J Med. 2004;351:2795–2804.
12. Devereaux PJ, Sessler DI. Cardiac complications in patients undergoing major noncardiac surgery. N Engl J Med. 2015;373:2258–2269.
13. Rodseth RN, Biccard BM, Le Manach Y. The prognostic value of pre-operative and post-operative B-type natriuretic peptides in patients undergoing noncardiac surgery: B-type natriuretic peptide and N-terminal fragment of pro-B-type natriuretic peptide: a systematic review and individual patient data meta-analysis. J Am Coll Cardiol. 2014;63:170–180.
14. Nagele P, Brown F, Gage BF. High-sensitivity cardiac troponin T in prediction and diagnosis of myocardial infarction and long-term mortality after noncardiac surgery. Am Heart J. 2013;166:325–332.e1.
15. Cohen ME, Ko CY, Bilimoria KY. Optimizing ACS NSQIP modeling for evaluation of surgical quality and risk: patient risk adjustment, procedure mix adjustment, shrinkage adjustment, and surgical focus. J Am Coll Surg. 2013;217:336–346.e1.
16. Bilimoria KY, Liu Y, Paruch JL. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. J Am Coll Surg. 2013;217:833–842.e1.
17. Gupta PK, Gupta H, Sundaram A. Development and validation of a risk calculator for prediction of cardiac risk after surgery. Circulation. 2011;124:381–387.
18. Lee TH, Marcantonio ER, Mangione CM. Derivation and prospective validation of a simple index for prediction of cardiac risk of major noncardiac surgery. Circulation. 1999;100:1043–1049.
19. American College of Surgeons National Surgical Quality Improvement Program. ACS-NSQIP Surgical Risk Calculator. Available at: http://riskcalculator.facs.org/RiskCalculator/. Accessed June 1, 2017.
20. Centers for Disease Control and Prevention (CDC). National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. 2011–2012. Hyattsville, MD: US Department of Health and Human Services, Centers for Disease Control and Prevention; Available at: https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/2011/. Accessed May 8, 2015.
21. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174.
22. Kini V, McCarthy FH, Dayoub E. Cardiac stress test trends among US patients younger than 65 years, 2005-2012. JAMA Cardiol. 2016;1:1038–1042.
23. Mark DB, Federspiel JJ, Cowper PA, et al. PROMISE Investigators. Economic outcomes with anatomical versus functional diagnostic testing for coronary artery disease. Ann Intern Med. 2016;165:94–102.
24. HCUP Databases. Healthcare Cost and Utilization Project (HCUP). 2012. Rockville, MD: Agency for Healthcare Research and Quality; Available at: https://www.hcup-us.ahrq.gov/databases.jsp. Accessed June 7, 2017.
25. Schwartz AL, Landon BE, Elshaug AG, Chernew ME, McWilliams JM. Measuring low-value care in Medicare. JAMA Intern Med. 2014;174:1067.
26. Sheffield KM, McAdams PS, Benarroch-Gampel J. Overuse of preoperative cardiac stress testing in Medicare patients undergoing elective noncardiac surgery. Ann Surg. 2013;257:73–80.
27. Sigmund AE, Stevens ER, Blitz JD, Ladapo JA. Use of preoperative testing and physicians’ response to professional society guidance. JAMA Intern Med. 2015;175:1352–1359.