Feasibility of Report Cards for Measuring Anesthesiologist Quality for Cardiac Surgery

Glance, Laurent G. MD; Hannan, Edward L. PhD; Fleisher, Lee A. MD; Eaton, Michael P. MD; Dutton, Richard P. MD; Lustik, Stewart J. MD; Li, Yue PhD; Dick, Andrew W. PhD

doi: 10.1213/ANE.0000000000001252
Economics, Education, and Policy: Research Report

BACKGROUND: In creating the Merit-Based Incentive Payment System, Congress has mandated pay-for-performance (P4P) for all physicians, including anesthesiologists. There are currently no National Quality Forum–endorsed risk-adjusted outcome metrics for anesthesiologists to use as the basis for P4P.

METHODS: Using clinical data from the New York State Cardiac Surgery Reporting System, we conducted a retrospective observational study of 55,436 patients undergoing cardiac surgery between 2009 and 2012. Hierarchical logistic regression modeling was used to examine the variation in in-hospital mortality or major complications (Q-wave myocardial infarction, renal failure, stroke, and respiratory failure) among anesthesiologists, controlling for patient demographics, severity of disease, comorbidities, and hospital quality.

RESULTS: Although the variation in performance among anesthesiologists was statistically significant (P = 0.025), none of the anesthesiologists in the sample was classified as a high- or low-performance outlier. The contribution of anesthesiologists to outcomes represented 0.51% of the overall variability in patient outcomes (intraclass correlation coefficient [ICC] = 0.0051; 95% confidence interval [CI], 0.002–0.014), whereas the contribution of hospitals to patient outcomes was 2.90% (ICC = 0.029; 95% CI, 0.017–0.050). The anesthesiologist median odds ratio (MOR) was 1.13 (95% CI, 1.08–1.24), suggesting that the variation between anesthesiologists was modest, whereas the hospital MOR was 1.35 (95% CI, 1.25–1.48). In a separate analysis, the contribution of surgeons to overall outcomes represented 1.76% of the overall variability in patient outcomes (ICC = 0.018; 95% CI, 0.010–0.031), and the surgeon MOR was 1.26 (95% CI, 1.19–1.37). Twelve of the surgeons were identified as performance outliers.

CONCLUSIONS: The impact of anesthesiologists on the total variability in cardiac surgical outcomes was probably about one-fourth as large as the surgeons’ contribution. None of the anesthesiologists caring for cardiac surgical patients in New York State over a 3.5-year period was identified as a performance outlier. The use of a performance metric based on death or major complications for P4P may not be feasible for cardiac anesthesiologists.

From the Department of Anesthesiology, University of Rochester School of Medicine, Rochester, New York; the Department of Health Policy, Management and Behavior, School of Public Health, University at Albany, Albany, New York; the Department of Anesthesiology, University of Pennsylvania Health System, Philadelphia, Pennsylvania; U.S. Anesthesia Partners; the Department of Public Health Sciences, University of Rochester School of Medicine, Rochester, New York; and RAND Health, Boston, Massachusetts.

Accepted for publication January 28, 2016.

Funding: This project was supported with funding from the Department of Anesthesiology at the University of Rochester School of Medicine.

The authors declare no conflicts of interest.

Supplemental digital content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal’s website.

Reprints will not be available from the authors.

Address correspondence to Laurent G. Glance, MD, University of Rochester Medical Center, 601 Elmwood Ave., Box 604, Rochester, NY 14642. Address e-mail to laurent_glance@urmc.rochester.edu.

Performance reporting is at the center of efforts to redesign the health care system to achieve better health care outcomes at lower cost. In theory, performance reporting is a potentially disruptive technology. Public reporting promotes transparency and allows patients and referring physicians to make informed choices. Public reporting also incentivizes hospitals and physicians to improve quality to improve their reputations and increase market share. Physicians and hospitals can use benchmarking information to target deficiencies and measure the impact of local changes in health care delivery.1 Early studies reported substantial reductions in mortality and major complications with outcomes reporting for noncardiac surgery in Veterans Affairs hospitals,2 cardiac surgery in New York State (NYS),3 and cardiac surgery in Northern New England.4 Expansion of the National Surgical Quality Improvement Program (NSQIP) benchmarking initiative to the private sector also led to significant improvements in care and fewer complications after hospitals joined NSQIP.5 However, more recent studies comparing NSQIP to non-NSQIP hospitals suggest that NSQIP participation does not lead to better surgical outcomes.6,7 Outcomes reporting without financial incentives, and without public reporting, may not be sufficient to improve outcomes.

The Affordable Care Act not only mandates public reporting, but it also creates strong financial incentives for high-quality care by linking payments to hospital and physician performance.8,9 The Department of Health and Human Services intends to expand value-based purchasing and link 90% of Medicare payments to performance or value by 2018.10 However, despite the logic of redesigning payment structures to incentivize high-quality care, there is little evidence that pay-for-performance (P4P) improves patient outcomes.11–13 Nevertheless, the Affordable Care Act expanded quality reporting beyond hospitals to physicians by creating Physician Compare14 to allow the public to compare individual physicians. Under the Physician Quality Reporting System (PQRS), physicians currently have the option to participate in a voluntary pay-for-reporting program.15 With the passage of recent legislation to repeal the Sustainable Growth Rate formula, Congress is replacing PQRS with the Merit-Based Incentive Payment System and transitioning physicians to value-based compensation, with performance adjustments of ±5% in 2020, rising to ±9% in 2022.16

Until now, physicians were paid to participate in PQRS, and the validity of physician performance measures was not an issue because physician compensation was not linked to performance. The Centers for Medicare & Medicaid Services (CMS) created a mechanism that allows physicians to meet the pay-for-reporting requirement under PQRS by submitting data on outcome measures that have not undergone rigorous validation.17 In the future, it is likely that anesthesiologists will need to submit risk-adjusted outcome metrics to fully participate in the Merit-Based Incentive Payment System. At this time, there are no validated risk-adjusted outcome metrics that assess the performance of individual anesthesiologists. According to the National Quality Forum criteria for measure validation, there must be evidence of a gap in performance (i.e., variability in provider performance) before a measure can receive National Quality Forum endorsement.18

In a recent research publication, we did find significant variation across anesthesiologists in the rate of death or major complications among patients undergoing isolated coronary artery bypass graft surgery (CABG) in NYS.19 We have since discovered, however, that our choice of variance estimator led to overestimation of the precision of individual estimates of anesthesiologist performance and of the variability across anesthesiologists. Our goal in the current article is to report the variability in performance across anesthesiologists caring for cardiac surgical patients based on our revised analysis using hierarchical modeling, the statistical methodology used by CMS. The results of our analysis will help our physician leadership and policy makers determine the feasibility of creating risk-adjusted outcome metrics based on death and major postoperative complications to use as the basis for merit-based payments for cardiac anesthesiologists.

METHODS

Data Source

This study was based on population-based data from the NYS Cardiac Surgery Reporting System for patients undergoing cardiac surgery (CABG, aortic valve replacement, mitral valve repair and replacement, and tricuspid valve repair and replacement) in NYS between 2009 and 2012 (anesthesiologist identifiers were first available for the second half of 2009). The database includes comprehensive clinical information on patient demographics; encrypted anesthesiologist, surgeon, and hospital identifiers; preoperative risk factors; and in-hospital mortality and major postoperative complications (stroke; Q-wave myocardial infarction [MI]; deep sternal wound infection; bleeding requiring reoperation; sepsis or endocarditis; gastrointestinal bleeding, perforation, or infarction; renal failure; respiratory failure; unplanned cardiac reoperation; or interventional procedure).20 This database does not include any information on physician characteristics (e.g., board certification and fellowship training) or hospital structural variables (e.g., teaching status and nurse staffing) and cannot be linked to outside data sets to obtain such information. These clinical data were collected prospectively by clinical data collectors and were submitted to the NYS Department of Health.21 Comprehensive audit mechanisms are in place to ensure the accuracy and validity of the data.21 Hospitals with a high reported prevalence of cardiac risk factors compared with the state average (e.g., a hospital reporting a large percentage of patients requiring emergency surgery) were subject to auditing.21 Our study was approved by the IRB at the University of Rochester and by the NYS Department of Health. The requirement for informed consent was waived by the IRB at the University of Rochester.

Study Sample

We identified 55,488 patients who underwent one of the following cardiac surgical procedures: CABG, aortic valve replacement, mitral valve repair or replacement, and tricuspid valve repair or replacement. Two hundred twenty-three patients were missing hematocrit, ejection fraction, or serum creatinine values. Because patients with missing data represented <1% of the patient cohort, these observations were not included in our analysis.6 For cases in which the attending anesthesiologist who started the case did not complete it, responsibility for the case was attributed to the first anesthesiologist. The final study cohort consisted of 55,436 cases from 40 hospitals, 185 surgeons, and 357 anesthesiologists.

Analysis

We defined a composite outcome of in-hospital mortality or major in-hospital complication (Q-wave MI [new Q-waves occurring within 48 hours after surgery], renal failure [the need for temporary or permanent dialysis], stroke [permanent new neurologic deficit], or respiratory failure [intubation for ≥72 hours after surgery]). In our baseline analysis, we examined the variation in anesthesiologist performance on the composite outcome with hierarchical logistic regression. We fit a 3-level random-intercept model in which anesthesiologists and hospitals were included as random effects, with anesthesiologists nested within hospitals. We assumed that anesthesiologists were randomly assigned to work with surgeons within their hospital and that the correlation between surgeon quality and anesthesiologist quality was small. Our approach for estimating anesthesiologist performance differs from the conventional approach for calculating the risk-adjusted outcomes of individual surgeons because the latter does not control for hospital quality22 and therefore assumes that surgeon performance is the principal determinant of patient outcomes (other than severity of disease). Because of the hierarchical structure of the data, anesthesiologists who worked at >1 hospital had their performance estimated separately at each hospital. A priori, we attributed outcomes to the primary attending anesthesiologist who started the case.
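
As an illustration only, a 3-level random-intercept model of this form could be fit in Stata (the software used for this study) roughly as follows; the variable names are hypothetical and only a subset of the covariates is shown (the full risk-adjustment variables appear in Table 1):

    * 3-level random-intercept logistic model: patients nested within
    * anesthesiologists, nested within hospitals (illustrative covariates)
    melogit composite c.age i.female i.emergency c.ef c.hematocrit ///
        || hospital: || anesthesiologist: , or

    * store the fit for the likelihood ratio test described below
    estimates store full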

A priori, we included patient risk factors thought to be associated with death or major complications. To minimize omitted variable bias, we created a nonparsimonious model for the composite outcome and retained some risk factors that did not achieve statistical significance but were judged to be clinically important. The predictor variables we included were age, sex, race, payer, obesity (class I, body mass index [BMI] 30–34.9; class II, BMI 35–39.9; and class III, BMI ≥ 40), underweight (BMI ≤18.5), severity of disease (ejection fraction, emergency, unstable [requires pharmacologic or mechanical support to maintain blood pressure or cardiac index], congestive heart failure, previous MI, calcified aorta, and previous open heart surgery), and comorbidities (valvular disease, renal failure, hepatic failure, cerebrovascular disease, peripheral vascular disease, chronic obstructive pulmonary disease, diabetes mellitus, stent thrombosis, any previous organ transplant, active endocarditis, hematocrit, surgical procedure, and number of cardiac procedures) (Table 1). Fractional polynomials were used to determine the optimal specification of continuous predictor variables to ensure that the model was linear in the logit.23 The discrimination of the regression model was assessed using the C statistic.
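
The two modeling tools named in this paragraph can be sketched in Stata as follows; this is not the authors' exact workflow, and the variable names are illustrative. The first command uses multivariable fractional polynomials to select transformations of the continuous covariates; the second block refits a single-level logit and reports the C statistic (area under the ROC curve):

    * Select fractional-polynomial transformations of continuous covariates
    mfp: logit composite age ef hematocrit

    * C statistic (area under the ROC curve) for a fitted logit model
    logit composite age ef hematocrit
    lroc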

Table 1

Table 1

We used the likelihood ratio test to examine whether anesthesiologists had a significant impact on the composite outcome. By comparing the log likelihood of our baseline model, which included both hospital and anesthesiologist random effects, with a nested model that did not include a random-intercept term for anesthesiologists, we tested whether the variance of the anesthesiologist random-intercept term was significantly different from zero. We quantified the size of the contribution of the anesthesiologists to outcome using the intraclass correlation coefficient (ICC) and the median odds ratio (MOR). The ICC24 is the percentage of the variability in patient outcomes that is caused by differences in anesthesiologist performance, after controlling for patient risk and hospital performance.25 The anesthesiologist MOR is the median of the distribution of the increased risk of death or major complication if identical patients are treated by 2 randomly selected anesthesiologists.24,26 If the MOR is 1, then there is no difference in patient outcomes for patients treated by different anesthesiologists. The greater the MOR, the greater the variation in anesthesiologist performance.
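
For reference, these quantities follow from the estimated variance components using the standard latent-variable formulas for multilevel logistic models (as in references 24 and 26), where $\sigma^2_A$ and $\sigma^2_H$ are the anesthesiologist- and hospital-level random-intercept variances and $\pi^2/3 \approx 3.29$ is the patient-level variance on the latent logistic scale:

\[
\mathrm{ICC}_{\text{anesthesiologist}} = \frac{\sigma^2_A}{\sigma^2_A + \sigma^2_H + \pi^2/3},
\qquad
\mathrm{MOR}_{\text{anesthesiologist}} = \exp\!\left(\sqrt{2\sigma^2_A}\,\Phi^{-1}(0.75)\right) \approx \exp\!\left(0.954\,\sigma_A\right)
\]

The likelihood ratio test described above compares the log likelihoods of the models fit with and without the anesthesiologist random intercept (e.g., with Stata's lrtest command after storing both fits).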

The empirical Bayes estimates of the anesthesiologist random effect were exponentiated to calculate an adjusted odds ratio (AOR) for each anesthesiologist. The anesthesiologist AOR represents the likelihood that patients treated by a specific anesthesiologist die or experience a major complication, controlling for patient risk and hospital effects (i.e., hospital performance). Anesthesiologists whose AOR was significantly >1 (95% confidence interval [CI] did not include 1) were classified as low-performance outliers, whereas anesthesiologists with an AOR significantly <1 were classified as high-performance outliers. We visually represented the variability of anesthesiologist performance using a caterpillar plot.
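
A sketch of how the anesthesiologist AORs and CIs described here might be obtained after the melogit fit shown earlier (names are illustrative, and the 95% CI uses a normal approximation):

    * Empirical Bayes means of the random effects and their standard errors;
    * melogit returns them in model order (hospital first, then anesthesiologist)
    predict re_hosp re_anes, reffects reses(se_hosp se_anes)

    * Anesthesiologist adjusted odds ratio and approximate 95% CI;
    * outliers are anesthesiologists whose CI excludes 1
    generate aor    = exp(re_anes)
    generate aor_lo = exp(re_anes - 1.96*se_anes)
    generate aor_hi = exp(re_anes + 1.96*se_anes)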

We performed a secondary analysis to examine the contribution of surgeons to patient outcomes using the same approach outlined above, except that the hierarchical model included surgeon and hospital random effects instead of anesthesiologist and hospital random effects.

We performed 2 sensitivity analyses. Because hierarchical analysis uses shrinkage estimators, measures of provider performance are conservative, and the performance of lower volume providers is shrunk toward the mean more than that of higher volume providers. In the first sensitivity analysis, we limited the analysis to higher volume anesthesiologists by excluding anesthesiologists whose case volumes were below the mean anesthesiologist case volume (132). The goal was to determine whether (1) the measured variability in anesthesiologist performance and (2) the anesthesiologist contribution to patient outcomes were affected by shrinkage.

In the second sensitivity analysis, we limited our analysis of surgeon performance to 1 year of data (2010) so that the median surgeon case volume in the restricted data set (86) would be similar to the median anesthesiologist case volume in the full data set (91) used in the primary analysis. The goal was to examine whether (1) the measured variability in surgeon performance and (2) the surgeon contribution to patient outcomes were affected by shrinkage when surgeon case volumes were smaller.

Finally, we also performed an analysis to examine the assumption that surgeon and anesthesiologist performance were not correlated. We used the same approach described above to estimate the risk-adjusted performance of individual surgeons using a hierarchical model with hospital and surgeon random effects. We then used linear regression to estimate the association between anesthesiologist and surgeon performance at the patient level.
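
A sketch of this check, assuming each patient record carries the anesthesiologist and surgeon AORs (aor_anes, aor_surg) from the two hierarchical fits; the variable names are hypothetical, and any clustering adjustment the authors may have applied is not described here:

    * Patient-level correlation and linear association between the
    * anesthesiologist and surgeon adjusted odds ratios
    correlate aor_anes aor_surg
    regress aor_anes aor_surg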

Data management and statistical analyses were performed using STATA SE/MP Version 14.0 (StataCorp., College Station, TX). All statistical tests were 2 tailed, and P values <0.05 were considered significant.

RESULTS

Tables 2 and 3 display patient demographics and procedure groups. The most common procedures were CABG surgery, aortic valve replacements, and mitral valve surgery. Thirty-four percent of the patients were women, the median age of the population was 68 years (interquartile range, 60–77 years), and 86% of the patients were Caucasian. Thirty-seven percent of the patients were obese, including 4.9% with morbid obesity. The median left ventricular ejection fraction was 55% (interquartile range, 45%–60%), and 10% of the study sample had a left ventricular ejection fraction of ≤30%. Twenty-one percent of the patients had a history of congestive heart failure within 2 weeks before the procedure, and 14% had an MI within 7 days before the procedure. The overall rate of death or major complications was 7.4%, and the in-hospital mortality rate was 1.9%.

Table 2

Table 3

In our baseline analysis, in which we controlled for patient characteristics and hospital effects, we found that the variability across anesthesiologists was significant (P = 0.025). The contribution of anesthesiologists to overall outcomes represented 0.51% of the overall variability in patient outcomes (ICC = 0.0051; 95% CI, 0.002–0.014), whereas the contribution of hospitals to patient outcomes was 2.90% (ICC = 0.029; 95% CI, 0.017–0.050) (Table 1). Together, the anesthesiologists and hospitals accounted for 3.40% of the variation in patient outcomes (ICC = 0.034; 95% CI, 0.021–0.055). The anesthesiologist MOR was 1.13 (95% CI, 1.08–1.24), suggesting that the variation between anesthesiologists is quite small, whereas the hospital MOR was 1.35 (95% CI, 1.25–1.48). An MOR equal to 1.0 signifies no difference among anesthesiologists. Figure 1A depicts the variation in the performance of anesthesiologists, controlling for patient risk and hospital effects. In this caterpillar graph, the point estimate of the AOR for each anesthesiologist, along with its 95% CI, is shown, starting on the left with the anesthesiologists with the lowest AORs. None of the anesthesiologists in our sample was identified as a performance outlier (Table 4).

Table 4

Figure 1

In our secondary analysis, we found that the variability across surgeons was significant (P < 0.0001). The contribution of surgeons to overall outcomes represented 1.76% of the overall variability in patient outcomes (ICC = 0.018; 95% CI, 0.010–0.031), whereas the contribution of hospitals to patient outcomes was 2.11% (ICC = 0.021; 95% CI, 0.010–0.042) (Table 1). Thus, the impact of anesthesiologists on the total variability in cardiac surgical outcomes (ICC = 0.0051) was probably about one-fourth as large as the contribution of surgeons (ICC = 0.018). Together, the surgeons and hospitals accounted for 3.86% of the variation in patient outcomes (ICC = 0.039; 95% CI, 0.026–0.057). The surgeon MOR was 1.26 (95% CI, 1.19–1.37). Figure 1B depicts the variation in the performance of surgeons, controlling for patient risk and hospital effects. There were 8 low-performance outliers, 4 high-performance outliers, and 229 nonoutliers.

In our additional analysis to examine the assumption that the quality of surgeons and anesthesiologists is uncorrelated, we found that the correlation between the risk-adjusted performance of anesthesiologists and surgeons was low (correlation coefficient = 0.096; 95% CI, 0.087–0.104). A 1-unit increase in the surgeon AOR (e.g., from 1.5 to 2.5) was associated with a 0.031 increase in the anesthesiologist AOR (e.g., from 1.50 to 1.53). The association between the performance of anesthesiologists and surgeons is displayed in Figure 2.

Figure 2

Table 5

In our first sensitivity analysis, the contribution of anesthesiologists to patient outcomes was essentially unchanged when we limited the analysis to high-volume anesthesiologists (Table 5; Supplemental Digital Content 1, http://links.lww.com/AA/B394). In our second sensitivity analysis, in which we limited our analysis of surgeon performance to 1 year of data, the contribution of surgeons to patient outcomes, as quantified with the ICC, was also unchanged. However, no surgeon performance outliers were identified using 1 year of data, whereas there were 12 performance outliers using the full data set (Table 5; Supplemental Digital Content 2, http://links.lww.com/AA/B395).

DISCUSSION

We examined the variation in the rate of death or major complications across anesthesiologists in this population-based retrospective study of >55,000 patients undergoing cardiac surgery in NYS. We found that the impact of anesthesiologists on the total variability in cardiac surgical outcomes was about one-fourth as large as that of cardiac surgeons. After adjusting for patient risk and hospital performance, 0.51% of the variation in death or major complications was attributable to anesthesiologists. In comparison, surgeons were responsible for 1.76% of the variation in outcomes. We did not identify any high- or low-performance anesthesiologists, even when we limited our analysis to higher volume anesthesiologists. However, we also found that no high- or low-performance surgeons were identified when we used only 1 year of data so that surgeon case volumes approximated anesthesiologist case volumes. The absence of high- or low-performance anesthesiologists in this large cohort of moderate-to-high-risk surgical patients suggests that basing anesthesiologist P4P on risk-adjusted outcomes may not be feasible, even for relatively high-volume procedures with a relatively high incidence of major complications. Our findings also suggest that providing cardiac anesthesiologists with benchmarking information on risk-adjusted outcomes may have limited value for quality improvement.

We do not believe that our findings should be interpreted to mean that there is no clinically important variation in the performance of anesthesiologists. By design, hierarchical modeling is more conservative than nonhierarchical modeling and is less likely to identify true performance outliers (lower sensitivity), while also being less likely to falsely classify average providers as performance outliers (higher specificity). We used empirical Bayes analysis to estimate the true performance of anesthesiologists. The empirical Bayes estimator of a physician’s true performance is the weighted average of the physician’s observed rate and the mean mortality rate for all patients.27 Because small sample sizes make the observed death rate of many cardiac anesthesiologists unlikely to represent their true death rate, hierarchical modeling shrinks the performance of low-volume physicians toward the mean. Shrinkage reduces the amount of variation among physicians and reduces the estimated contribution of physicians to patient outcomes (as quantified using the ICC and the MOR). At higher patient volumes, we would expect to see greater variation among anesthesiologists and a larger estimate of the anesthesiologists’ contribution to outcome.
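
Schematically, and ignoring covariates, the shrinkage described here can be written as a precision-weighted average (a simplified sketch of the empirical Bayes logic, not the exact estimator used in the logistic hierarchical model):

\[
\hat{\theta}_j = \lambda_j \bar{y}_j + (1 - \lambda_j)\,\bar{y},
\qquad
\lambda_j = \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{between}} + \sigma^2_{\text{within}}/n_j}
\]

where $\bar{y}_j$ is physician $j$'s observed rate, $\bar{y}$ is the overall mean, and $n_j$ is the physician's case volume. As $n_j$ decreases, $\lambda_j$ approaches 0 and the estimate is pulled toward the overall mean; high-volume physicians retain more of their observed rate.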

However, the larger anesthesiologist case volumes needed to more reliably estimate the performance of individual anesthesiologists may only be achieved by including a more heterogeneous group of cardiac and noncardiac surgeries. Such an approach would have several important limitations. First, using a single risk adjustment model for a wide range of surgical procedures has limited face validity because it would assign the same weights to patient risk factors for all procedures. Second, the added statistical power of this approach is questionable because many of the noncardiac surgeries have relatively low rates of death or major complications. Third, physician report cards based on this approach would be meaningless to patients seeking to select an anesthesiologist for a specific surgical procedure. Fourth, these report cards would not contain actionable feedback to inform the quality improvement efforts of individual anesthesiologists.

The findings of this analysis replace our earlier report of a large and significant variation in outcomes across anesthesiologists for isolated CABG surgery in NYS.19 We have withdrawn this article because of a methodological error. In our previous analysis using nonhierarchical modeling, we used a cluster robust variance estimator to estimate unbiased SEs in the presence of patient clustering by anesthesiologists. This resulted in very large underestimation of the variance of the anesthesiologists’ fixed effects and led to incorrect conclusions. In the current analysis, we used hierarchical modeling, which provides appropriate variance estimates in the presence of clustering. We also expanded the study sample to include 2 additional years of data, as well as other higher risk cardiac surgeries, to increase the power of our analysis.

Two earlier studies investigated the impact of anesthesiologists on cardiac surgical outcomes using hierarchical modeling. In a large multicenter study based on >110,000 cardiac surgical patients in the United Kingdom, Papachristofi et al.28 reported that surgeons account for 4% of the variation in patient outcomes, whereas anesthesiologists account for 0.25%. Their study also did not identify any high- or low-performance anesthesiologists. This report confirmed the findings of an earlier, smaller single-center study by the same group.29

Limitations

This retrospective analysis was performed using observational data to adjust for differences in patient case mix among anesthesiologists. For obvious reasons, it is not possible to randomly assign patients to different anesthesiologists to examine the variation in outcomes among anesthesiologists. And, even if such a design were possible, it would not control for hospital effects because anesthesiologists cannot be randomly distributed across hospitals. One potential limitation of any observational study is data quality. The data elements in the NYS registry were defined by cardiologists and cardiac surgeons on the State’s Cardiac Advisory Committee. Participating hospitals have clinical data coordinators who are trained by the NYS Department of Health. Data accuracy is ensured by record review and audit procedures.21 Our study was limited to the intraoperative role of anesthesiologists and did not explore the impact of anesthesiologists on the postoperative management of cardiac surgical patients.

We used a composite outcome measure to increase the statistical power of our analysis. By construction, this composite measure assigns equal weight to death and each of the major complications included in the composite outcome. In theory, variation in the composite outcome because of differences in mortality could have been masked by the lack of variation for other more common complications. In practice, the lower incidence of mortality (compared with the composite outcome) would have made it even more difficult to establish significant variation across anesthesiologists if mortality was the only outcome of interest.

Our conclusions depend on the validity of our statistical model. When adverse outcomes are uncommon and case volumes are low, it can be difficult to know whether providers with high mortality rates truly deliver low-quality care or simply had worse outcomes attributable to chance alone.30 Hierarchical modeling performs reliability adjustment, or shrinkage, on estimates of provider performance to minimize the role of chance in the classification of provider outliers. Shrinkage is likely to downwardly bias the amount of variability in anesthesiologists’ performance and the impact of anesthesiologists on patient outcomes. Nonhierarchical modeling, however, is more likely to classify providers with extremes of performance attributable to chance alone as low-performance outliers and may thus overestimate the variation among anesthesiologists. Many national benchmarking efforts use hierarchical modeling for risk adjustment: CMS Hospital Compare,31 the American College of Surgeons NSQIP,32 and the Society of Thoracic Surgeons.33

It is possible that our risk adjustment model omitted clinically important risk factors. To avoid the possibility of omitted variable bias, we constructed a nonparsimonious model. However, even if we had left out key risk factors, we would expect to see increased variation among anesthesiologists because poor outcomes attributable to patient disease would be attributed to individual anesthesiologists. The comprehensiveness of the NYS database and the good statistical performance of our risk adjustment model argue against this limitation. Finally, the NYS Department of Health cannot ensure the reliability of the complication data. However, it is very unlikely that there would be systematic variation in coding quality among anesthesiologists.

The fact that we could not identify anesthesiologist performance outliers in this large cohort of moderate-to-high-risk surgical patients suggests that it may be challenging for CMS to base anesthesiologist P4P on risk-adjusted measures of major complications and mortality. There are several potential alternative approaches that could be used for anesthesiologist P4P. First, CMS could rank anesthesiologists by their risk-adjusted performance into performance quartiles and reward the best-performing quartile with a bonus and penalize the worst-performing quartile. If none of the anesthesiologists is a performance outlier, such an approach will in effect penalize average-performing anesthesiologists. CMS currently uses this approach in the Hospital-Acquired Condition Reduction Program.34 Second, we could select adverse clinical outcomes that are more common to increase the likelihood of detecting anesthesiologist performance outliers. The challenge is to find clinically important patient-centered outcomes that are common. The American College of Surgeons NSQIP uses urinary tract infections as one of its outcomes.32 Although this increases statistical power, it is questionable whether urinary tract infections are truly a significant complication. In a similar manner, we could use nausea and vomiting as an anesthesiologist-specific metric. But using this as the basis for measuring and publicly reporting anesthesiologists’ performance is likely to trivialize the role of anesthesiologists in perioperative care. In our opinion, the most rational approach to quality measurement for anesthesiologists, as well as for surgeons, is to promote the use of team-based shared-accountability measures. Instead of measuring the performance of individual physicians, this approach attributes patient outcomes to the team of anesthesiologists, surgeons, and other clinical providers caring for individual patients. By measuring outcomes at the hospital or facility level, this approach has much greater statistical power for identifying performance outliers. It also incentivizes teams of clinicians to work together to improve patient outcomes and avoids the criticism that physician-specific measures unfairly assign blame for patient outcomes to anesthesiologists or surgeons.

CONCLUSIONS

Using clinical data on >55,000 cardiac surgical patients in NYS over a 3.5-year period, we found that anesthesiologists had a clinically significant impact on outcomes. Although we were not able to identify any high- or low-performance anesthesiologists, we found that the impact of anesthesiologists on the total variability in cardiac surgical outcomes was probably about one-fourth as large as the surgeon contribution. Our inability to identify any anesthesiologist outliers may, in part, have been attributable to insufficient anesthesiologist case volumes. When we examined surgeon performance using only 1 year of data, we also did not find significant variation in surgeon performance. These findings suggest that using risk-adjusted outcomes for death and major complications in cardiac surgery as the basis for anesthesiologist P4P may not be feasible. Instead of using physician performance metrics based on small sample sizes, CMS should consider using team-based shared-accountability measures that will have much greater power to accurately estimate quality of care.

DISCLOSURES

Name: Laurent G. Glance, MD.

Contribution: This author helped design the study, conduct the study, analyze the data, and write the manuscript.

Attestation: Laurent G. Glance has seen the original study data, reviewed the analysis of the data, approved the final manuscript, and is the author responsible for archiving the study files.

Name: Edward L. Hannan, PhD.

Contribution: This author helped conduct the study and write the manuscript.

Attestation: Edward L. Hannan reviewed the analysis of the data and approved the final manuscript.

Name: Lee A. Fleisher, MD.

Contribution: This author helped conduct the study and write the manuscript.

Attestation: Lee A. Fleisher reviewed the analysis of the data and approved the final manuscript.

Name: Michael P. Eaton, MD.

Contribution: This author helped conduct the study and write the manuscript.

Attestation: Michael P. Eaton reviewed the analysis of the data and approved the final manuscript.

Name: Richard P. Dutton, MD.

Contribution: This author helped conduct the study and write the manuscript.

Attestation: Richard P. Dutton reviewed the analysis of the data and approved the final manuscript.

Name: Stewart J. Lustik, MD.

Contribution: This author helped conduct the study and write the manuscript.

Attestation: Stewart J. Lustik reviewed the analysis of the data and approved the final manuscript.

Name: Yue Li, PhD.

Contribution: This author helped conduct the study and write the manuscript.

Attestation: Yue Li reviewed the analysis of the data and approved the final manuscript.

Name: Andrew W. Dick, PhD.

Contribution: This author helped design the study, conduct the study, analyze the data, and write the manuscript.

Attestation: Andrew W. Dick has reviewed the analysis of the data and approved the final manuscript.

This manuscript was handled by: Steven L. Shafer, MD.

REFERENCES

1. Ross JS, Bernheim SM, Drye ED. Expanding the frontier of outcomes measurement for public reporting. Circ Cardiovasc Qual Outcomes. 2011;4:11–3
2. Khuri SF, Daley J, Henderson W, Hur K, Demakis J, Aust JB, Chong V, Fabri PJ, Gibbs JO, Grover F, Hammermeister K, Irvin G III, McDonald G, Passaro E Jr, Phillips L, Scamman F, Spencer J, Stremple JF. The Department of Veterans Affairs’ NSQIP: the first national, validated, outcome-based, risk-adjusted, and peer-controlled program for the measurement and enhancement of the quality of surgical care. National VA Surgical Quality Improvement Program. Ann Surg. 1998;228:491–507
3. Hannan EL, Kilburn H Jr, Racz M, Shields E, Chassin MR. Improving the outcomes of coronary artery bypass surgery in New York State. JAMA. 1994;271:761–6
4. O’Connor GT, Plume SK, Olmstead EM, Morton JR, Maloney CT, Nugent WC, Hernandez F Jr, Clough R, Leavitt BJ, Coffin LH, Marrin CA, Wennberg D, Birkmeyer JD, Charlesworth DC, Malenka DJ, Quinton HB, Kasper JF. A regional intervention to improve the hospital mortality associated with coronary artery bypass graft surgery. The Northern New England Cardiovascular Disease Study Group. JAMA. 1996;275:841–6
5. Hall BL, Hamilton BH, Richards K, Bilimoria KY, Cohen ME, Ko CY. Does surgical quality improve in the American College of Surgeons National Surgical Quality Improvement Program: an evaluation of all participating hospitals. Ann Surg. 2009;250:363–76
6. Osborne NH, Nicholas LH, Ryan AM, Thumma JR, Dimick JB. Association of hospital participation in a quality reporting program with surgical outcomes and expenditures for Medicare beneficiaries. JAMA. 2015;313:496–504
7. Etzioni DA, Wasif N, Dueck AC, Cima RR, Hohmann SF, Naessens JM, Mathur AK, Habermann EB. Association of hospital participation in a surgical outcomes monitoring program with inpatient complications and mortality. JAMA. 2015;313:505–11
8. Kocher R, Emanuel EJ, DeParle NA. The Affordable Care Act and the future of clinical medicine: the opportunities and challenges. Ann Intern Med. 2010;153:536–9
9. Britt LD, Hoyt DB, Jasak R, Jones RS, Drapkin J. Health care reform: impact on American surgery and related implications. Ann Surg. 2013;258:517–26
10. Burwell SM. Setting value-based payment goals—HHS efforts to improve U.S. health care. N Engl J Med. 2015;372:897–9
11. Werner RM, Kolstad JT, Stuart EA, Polsky D. The effect of pay-for-performance in hospitals: lessons for quality improvement. Health Aff (Millwood). 2011;30:690–8
12. Jha AK, Joynt KE, Orav EJ, Epstein AM. The long-term effect of premier pay for performance on patient outcomes. N Engl J Med. 2012;366:1606–15
13. Shih T, Nicholas LH, Thumma JR, Birkmeyer JD, Dimick JB. Does pay-for-performance improve surgical outcomes? An evaluation of phase 2 of the Premier Hospital Quality Incentive Demonstration. Ann Surg. 2014;259:677–81
14. Findlay S. Physician Compare. Health Aff (Millwood). 2014. Available at: http://www.healthaffairs.org/healthpolicybriefs/brief.php?brief_id=131. Accessed March 2015
15. Koltov MK, Damle NS. Health policy basics: physician quality reporting system. Ann Intern Med. 2014;161:365–7
16. Doherty RB. Goodbye, Sustainable Growth Rate-Hello, Merit-Based Incentive Payment System. Ann Intern Med. 2015;163:138–9
17. Qualified Clinical Data Registry (QCDR) Participation Made Simple. CMS; 2015. Available at: https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/PQRS/Downloads/2015PQRS_QCDR_MadeSimple.pdf. Accessed March 2015
18. Review and update of guidance for evaluating evidence and measure testing. National Quality Forum. 2013 Available at: http://www.qualityforum.org/Publications/2013/10/Review_and_Update_of_Guidance_for_Evaluating_Evidence_and_Measure_Testing_-_Technical_Report.aspx. Accessed March 2015
19. Glance LG, Kellermann AL, Hannan EL, Fleisher LA, Eaton MP, Dutton RP, Lustik SJ, Li Y, Dick AW. The impact of anesthesiologists on coronary artery bypass graft surgery outcomes. Anesth Analg. 2015;120:526–33
20. Cardiac Surgery Report, Adult—Instructions and Data Element Definitions. Rensselaer, NY: New York State Department of Health; 2009
21. Hannan EL, Cozzens K, King SB III, Walford G, Shah NR. The New York State cardiac registries: history, contributions, limitations, and lessons for future efforts to assess and publicly report healthcare outcomes. J Am Coll Cardiol. 2012;59:2309–16
22. Hannan EL, Siu AL, Kumar D, Kilburn H Jr, Chassin MR. The decline in coronary artery bypass graft surgery mortality in New York State. The role of surgeon volume. JAMA. 1995;273:209–13
23. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modeling. Appl Stat. 1994;43:429–67
24. Merlo J, Chaix B, Ohlsson H, Beckman A, Johnell K, Hjerpe P, Råstam L, Larsen K. A brief conceptual tutorial of multilevel analysis in social epidemiology: using measures of clustering in multilevel logistic regression to investigate contextual phenomena. J Epidemiol Community Health. 2006;60:290–7
25. Martin BI, Mirza SK, Franklin GM, Lurie JD, MacKenzie TA, Deyo RA. Hospital and surgeon variation in complications and repeat surgery following incident lumbar fusion for common degenerative diagnoses. Health Serv Res. 2013;48:1–25
26. Sanagou M, Wolfe R, Forbes A, Reid CM. Hospital-level associations with 30-day patient mortality after cardiac surgery: a tutorial on the application and interpretation of marginal and multilevel logistic regression. BMC Med Res Methodol. 2012;12:28
27. Ash AS, Shwartz M, Pekoz EA, Hanchate AD. Comparing outcomes across providers. In: Iezzoni LI, ed. Risk Adjustment for Measuring Health Care Outcomes. 4th ed. Chicago, IL: Health Administration Press; 2013:335–78
28. Papachristofi O, Sharples LD, Mackay JH, Nashef SA, Fletcher SN, Klein AA; Association of Cardiothoracic Anaesthetists (ACTA). The contribution of the anaesthetist to risk-adjusted mortality after cardiac surgery. Anaesthesia. 2016;71:138–46
29. Papachristofi O, Mackay JH, Powell SJ, Nashef SA, Sharples L. Impact of the anesthesiologist and surgeon on cardiac surgical outcomes. J Cardiothorac Vasc Anesth. 2014;28:103–9
30. Dimick JB, Staiger DO, Birkmeyer JD. Ranking hospitals on surgical mortality: the importance of reliability adjustment. Health Serv Res. 2010;45:1614–29
32. Huffman KM, Cohen ME, Ko CY, Hall BL. A comprehensive evaluation of statistical reliability in ACS NSQIP profiling models. Ann Surg. 2015;261:1108–13
33. Shahian DM, He X, Jacobs JP, Kurlansky PA, Badhwar V, Cleveland JC Jr, Fazzalari FL, Filardo G, Normand SL, Furnary AP, Magee MJ, Rankin JS, Welke KF, Han J, O’Brien SM. The Society of Thoracic Surgeons composite measure of individual surgeon performance for adult cardiac surgery: a report of the Society of Thoracic Surgeons Quality Measurement Task Force. Ann Thorac Surg. 2015;100:1315–24
34. Kahn CN III, Ault T, Potetz L, Walke T, Chambers JH, Burch S. Assessing Medicare’s hospital pay-for-performance programs and whether they are achieving their goals. Health Aff (Millwood). 2015;34:1281–8

Supplemental Digital Content

© 2016 International Anesthesia Research Society