Editor’s Spotlight/Take 5: How Did Orthopaedic Surgeons Perform in the 2018 Centers for Medicaid & Medicare Services Merit-based Incentive Payment System?

Manner, Paul A. MD

Clinical Orthopaedics and Related Research: January 2022 - Volume 480 - Issue 1 - p 4-7
doi: 10.1097/CORR.0000000000002073

As is often the case in the funding of healthcare, new efforts at cost containment and quality improvement introduce new programs with new acronyms, new documentation, and new requirements. This is particularly true for Pay-for-Performance (P4P) programs. The most recent iteration of P4P in the United States is the Merit-based Incentive Payment System (MIPS). The move to pay for performance, though, is not confined to one country, and it’s part of a broader effort to emphasize value-based healthcare—although how you define the latter depends on your point of view.

When I see new programs like MIPS, my first instinct is to ask, how is this going to affect me? I think I provide quality care, but will the people paying the bills agree? Am I being judged on what I do, and is that judging fair?

That’s why this month’s Spotlight article, “How Did Orthopaedic Surgeons Perform in the 2018 Centers for Medicaid & Medicare Services Merit-based Incentive Payment System?” [3], which looked at how orthopaedic surgeons are doing with MIPS, is so valuable. Atul F. Kamath MD and his team from the Cleveland Clinic Foundation asked how orthopaedic surgeons did compared to other surgical specialties, what practice-level and surgeon features were associated with good performance, and which features were associated with penalties for poor performance.

As a specialty, orthopaedic surgeons performed well in the second year of the MIPS, with 87% earning bonus payments. Still, Dr. Kamath and his group [3] also note that orthopaedic surgeons earned lower MIPS performance scores than did surgical colleagues from other specialties, were less likely to receive bonuses, and were more likely to be penalized under the system. After controlling for a number of confounding variables, they also found that orthopaedic surgeons in smaller practices and those who treated patients with higher levels of medical or social complexity had higher odds of receiving penalties and lower odds of earning a perfect MIPS score.

Some history will help put these results in context. Where did MIPS come from? In 1997, the Balanced Budget Act established the Sustainable Growth Rate as the statutory method for determining the annual updates to the Medicare physician fee schedule in the United States. Under the Sustainable Growth Rate formula, if a weighted combination of annual and cumulative expenditures was less than the spending target for the period, the annual update was increased according to an established calculation. However, if spending exceeded the weighted annual cumulative spending target over a certain period, future payments would be reduced.

For the first few years, all was well. But starting in 2002, the actual expenditure exceeded allowed targets, and the discrepancy grew steadily. Over the next 12 years, Congress passed 17 stopgap measures to avoid cuts to physician reimbursement. Eventually, in 2015, in a rare display of bipartisanship (a 392 to 37 vote in the House of Representatives and a 92 to 8 vote in the Senate), Congress passed the Medicare Access and CHIP Reauthorization Act, which President Obama signed into law.

That law required the Centers for Medicare & Medicaid Services (CMS) to establish business models that emphasized value over volume of services. CMS created two incentive-based business models, collectively referred to as the Quality Payment Program: MIPS and advanced alternative payment models—both of which involve levels of financial rewards and risks. Physicians could choose one or the other, based on practice size, specialty, location, or patient population, and most chose MIPS.

If I’m getting evaluated (and paid accordingly), I certainly want to know how. Here’s a brief snapshot. MIPS tracks data in four performance categories: cost, quality, improvement activities, and promoting interoperability. Each category is weighted and contributes to a MIPS-eligible clinician’s or group’s final score [9]. In 2020 (our ES/T5 study focused on 2018, but the weights were similar, as we’ll see below), those weights were: cost 15%, quality 45%, improvement activities 15%, and promoting interoperability 25%. Again, MIPS is P4P, writ large.
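To make the weighting concrete, here is a minimal sketch in Python. It assumes each category score is expressed on a 0 to 100 scale and that the final score is a plain weighted sum. The weights are the 2020 figures quoted above; the function name and the example clinician's scores are hypothetical, and the details of CMS's actual scoring rules (such as reweighting or bonus points) are not modeled.

```python
# Category weights quoted in the text for 2020; they sum to 100%.
WEIGHTS_2020 = {
    "cost": 0.15,
    "quality": 0.45,
    "improvement_activities": 0.15,
    "promoting_interoperability": 0.25,
}


def final_score(category_scores, weights=WEIGHTS_2020):
    """Return the weighted sum of category scores.

    Assumption: every category score is on a 0-100 scale, so the
    result is also on a 0-100 scale.
    """
    if set(category_scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted categories")
    return sum(weights[name] * category_scores[name] for name in weights)


# Hypothetical clinician; these values are made up for illustration only.
example_scores = {
    "cost": 60.0,
    "quality": 80.0,
    "improvement_activities": 100.0,
    "promoting_interoperability": 90.0,
}
example_final = final_score(example_scores)
print(f"Illustrative final score: {example_final:.1f}")
```

Because quality carries 45% of the weight in this sketch, a 10-point change in the quality score moves the final score three times as much as the same change in cost or improvement activities.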

If one rewards physicians for good outcomes and penalizes them for bad outcomes, there’s an added incentive to do the right thing. Pay for Performance should fit the bill perfectly. But the evidence that P4P improves outcomes is sparse. A recent systematic review looked at over 10 years of research, in several countries, encompassing almost 3500 references at the outset, and found 69 studies with sufficient evidence and rigor [8]. The authors concluded that “Pay-for-Performance programs may be associated with improved processes of care in ambulatory settings, but consistently positive associations with improved health outcomes have not been demonstrated in any setting” [8].

There’s also a degree of skepticism that P4P focuses on appropriate measurement. In one report [6], investigators studied a national sample of internal medicine physicians after MIPS began in 2017. Although MIPS was new, and the authors were hopeful that it would work as advertised, the experience from previous iterations of P4P was not encouraging, with concern that the new program would result in unintended consequences. Most felt that physicians would focus on what was easily measured rather than on patients as a whole, that physicians or healthcare systems would avoid sicker or more medically complex patients to improve performance on quality or utilization measures, and that documentation would be geared to “checking the right boxes.” Many felt that providers would be prompted to “discourage patients from utilizing care in situations when it might be appropriate” [6].

P4P also hasn’t corrected for factors that are out of the physician’s control. In a recent cross-sectional study of almost 300,000 physicians in the first year of MIPS participation, physicians with the highest proportion of patients with lower incomes had lower MIPS scores [4]. And this was true for general surgeons as well: Byrd and Chung [2] found that “surgeons caring for patients at highest social risk received lower MIPS scores and had an increased risk of negative payment adjustment, despite ongoing efforts to target surgical disparities.” In short, MIPS may need substantial tweaking to ensure that physicians are treated fairly when they provide care in challenging settings.

As orthopaedic surgeons, we want to do what’s right for patients, but we should have some assurance that we’re assessed and rewarded accordingly. So, how are we doing? Are orthopaedic surgeons rising to the challenge of this new effort at providing value? And what is our incentive to do better?

For those answers and others, please join me in the Take 5 interview that follows for a conversation with Atul F. Kamath MD, senior author of “How Did Orthopaedic Surgeons Perform in the 2018 Centers for Medicaid & Medicare Services Merit-based Incentive Payment System?”

Take 5 Interview with Atul F. Kamath MD, senior author of “How Did Orthopaedic Surgeons Perform in the 2018 Centers for Medicaid & Medicare Services Merit-based Incentive Payment System?”

Paul A. Manner MD: You note that “… MIPS participation costs practices, on average, USD 12,811 and more than 53 hours per physician” [3]. This is the equivalent of an entire week of work in terms of hours and lost income. From a business viewpoint, it doesn’t seem to make sense to spend that much time and effort unless you can hit the maximum bonus (which is 4.69% this year) and your Medicare earnings warrant that effort, which would mean close to USD 300,000. If I’m a sports surgeon earning most of my income from non-Medicare sources, what’s my incentive? What might CMS do differently?

Atul F. Kamath MD: Although the MIPS is one of the largest value-based payment programs to date, it has been criticized [1] as adding a layer of administrative complexity to what is still fundamentally a fee-for-service system. Until either the burden of quality reporting decreases or the size of financial incentives increases, many practices may not see a benefit to participation. CMS has indicated that maximum payment adjustments will grow to 9% by the year 2022 [7], but it remains to be seen whether this will offset the cost of entry.

Dr. Manner: The set of quality measures for orthopaedics includes things like “Functional Status Change for Patients with Knee Impairments,” which is a patient-reported outcome, and seems reasonable. But it also includes process measures for “Rheumatoid Arthritis (RA): Glucocorticoid Management.” I provide care for patients with rheumatoid arthritis, but this doesn’t seem like something I should be doing. Are the questions and metrics requested appropriate? What specific quality measures would you recommend?

Dr. Kamath: Although CMS has pursued some degree of stakeholder engagement with the orthopaedic community, there is room for improvement. Most experts agree that meaningful quality measures should relate directly to a physician’s scope of practice or to the key patient outcomes for which they already feel responsible. Today, many surgeons participate in MIPS under the umbrella of large health systems, which default to standardized reporting of general measures for all physicians, regardless of specialty. Although specialty-focused measures have been developed, their utilization has been low. Understandably, many surgeons feel disconnected from the metrics by which their performance is being evaluated.

Dr. Manner: The stated goals of MIPS are to “drive improvement in care processes and health outcomes, increase the use of healthcare information, and reduce the cost of care.” But earlier P4P efforts have generally not done much. A recent systematic review found that “consistently positive associations with improved health outcomes have not been demonstrated in any setting” [8]. What’s better or different about MIPS? Does MIPS accomplish what it’s meant to do?

Dr. Kamath: Historically, P4P programs have focused on hospitals as the unit of measurement and comparison. However, the largest drivers of patient health outcomes have been recognized to exist outside the hospital. MIPS represents an expansion of P4P beyond the inpatient setting, to incentivize improvements in care at the level of individual physicians’ offices. Although there has been some evidence to suggest higher adherence to evidence-based processes, there has not been corresponding evidence of improved patient outcomes. Given the option, many practices elected to report only process measures in the initial years of MIPS. It will be important to see whether a transition to outcome measures in future years will produce a net benefit for patients and physicians alike.

Dr. Manner: In 2016, the graduating class of Harvard had a median GPA of 3.70, and 90% graduated with honors [5]. In your study, 87% of physicians received bonus payments. If that proportion are receiving honors, how meaningful are these measures as indicators of quality? I don’t want to denigrate either Harvard alumni or physicians, but is this the medical version of grade inflation? And if you’re already scoring at the top (or graduating with honors), what’s the incentive to improve?

Dr. Kamath: The threshold for bonus payments has been low in the initial years of the program, which CMS suggests was intended to incentivize broader participation. It is true, however, that receiving an “honors” grade just for participation dilutes the meaning behind grading in the first place and does not justify the administrative costs of implementation. As the threshold for receiving bonuses continues to rise (75/100 points in 2021 to 85/100 points in 2022), the goal will be to increasingly base penalties and rewards on performance and create stronger incentives for improvement.

Dr. Manner: You note that “MIPS has also disproportionately penalized physicians treating socioeconomically disadvantaged and more medically complex patients” [3]. Is that the case in orthopaedics? And unless CMS resorts to a subjective or ad hoc adjustment (which could have unexpected consequences), how do we address that? What adjustments would make sense here? Since MIPS is budget neutral, are the 90% profiting at the expense of the 10%?

Dr. Kamath: This problem encompasses our field as well. Our study found that orthopaedic surgeons caring for medically and socially complex patients had twice the odds of receiving a penalty [3]. It may be the case that surgeons providing care for healthier and wealthier patients are benefitting at the expense of their counterparts at tertiary referral centers or in the safety-net setting. Despite the urgency of appropriate risk adjustment, the administrative datasets available to CMS are limited with respect to validity and granularity. A low-cost starting point would be to use existing comorbidity indices and dual Medicare-Medicaid eligibility status as proxies. The eventual goal would be a transition to higher validity social determinant data and models specifically designed for risk adjustment in the setting of MIPS performance.


© 2021 by the Association of Bone and Joint Surgeons