Discrepancies Created by Surgeon Self-Reported Operative Time and the Effects on Procedural Relative Value Units and Reimbursement : Obstetrics & Gynecology

Contents: Original Research

Discrepancies Created by Surgeon Self-Reported Operative Time and the Effects on Procedural Relative Value Units and Reimbursement

Uppal, Shitanshu MBBS; Rice, Laurel W. MD; Spencer, Ryan J. MD, MS

Obstetrics & Gynecology 138(2):p 182-188, August 2021. | DOI: 10.1097/AOG.0000000000004467

OBJECTIVE: 

To demonstrate discrepancies between operative times in the ACS NSQIP (American College of Surgeons National Surgical Quality Improvement Program) and self-reported operative times from the American Medical Association's Relative Value Scale Update Committee (RUC) and their effect on relative value unit (RVU) determination.

METHODS: 

This is a cross-sectional review of registry data using the ACS NSQIP 2016 Participant User File and the Centers for Medicare & Medicaid Services physician procedure time file for 2018. We analyzed total RVUs for surgeries by operative time to calculate RVU per hour and stratified by specialty. Multivariate regression analysis adjusted for patient comorbidities, age, length of stay, and ACS NSQIP mortality and morbidity probabilities. The surgeon self-reported operative times from the Centers for Medicare & Medicaid Services physician procedure time file were compared with operative times recorded in the ACS NSQIP, with the excess of the RUC estimate over the recorded time termed “overreported time.”

RESULTS: 

Analysis of 901,917 surgeries revealed a wide variation in median RVU per hour between specialties. Orthopedics (14.3), neurosurgery (12.9), and general surgery (12.1) had the highest RVU per hour, whereas gynecology (10.2), plastic surgery (9.5), and otolaryngology (9) had the lowest (P<.001 for all comparisons). These results remained unchanged on multivariate regression analysis. General surgery had the highest median overreported operative time (+26 minutes) followed by neurosurgery (+23.5 minutes) and urology (+20 minutes). Overreporting of the operative time strongly correlated to higher RVU per hour (r=0.87, P=.002).

CONCLUSION: 

Despite reliable electronic records, the AMA-RUC continues to use inaccurate self-reported RUC surveys for operative times. This results in discrepancies in RVU per hour (and subsequent reimbursement) across specialties and a persistent disparity for women-specific procedures in gynecology. Relative value unit levels should be based on the available objective data to eliminate these disparities.

Congress initiated the relative value unit (RVU) system for Medicare in 1992 as an attempt to standardize reimbursement across varied medical procedures and as a potential cost-containment measure due to rapidly increasing health care costs and expenditures. Of the three components of the RVU system, work RVU is the critical piece in determining procedure reimbursement. After an adjustment for practice geography based on the cost of living and operating costs, the amount reimbursed for a procedure is the product of the work RVU and a conversion factor. The methodology of how work RVUs are calculated and their effect on reimbursement previously has been discussed.1

One of the most important components used in determining work RVU is the operative time. These time estimates are obtained by the Relative Value Scale Update Committee (RUC) from relatively small surveys sent to physicians among the various specialties. These surveys ask a battery of questions in an attempt to obtain information regarding the length of a procedure and its relative work effort required. Importantly, the vast majority of variation in the RVU assignment is due to procedure time.2 Even the best-intentioned respondents may be subconsciously biased in their responses and influence the survey and subsequent work RVU assessments. Studies confirm that physician time estimates are, on average, longer than times taken from data sets that record operative time.3,4 Because of this subjectivity, these work RVU assessments can lead to errors and disparities in reimbursement across surgical subspecialties.

Given this background, the first aim of this study was to use the ACS NSQIP (American College of Surgeons National Surgical Quality Improvement Program) data files to explore potential discrepancies between ACS NSQIP operative times and physician-reported operative times from the RUC surveys. The second aim of this study was to analyze discrepant procedural times for correlation to potential disparities in the assignment of relative work RVUs by surgical subspecialty. We hypothesized that there would be significant differences between objective and physician-reported operative times and that these differences would correlate to the underpayment of certain surgical specialties.

METHODS

Data from the ACS NSQIP participant user file for 2016 were used for this study. The ACS NSQIP collects data on surgical patients from participating hospitals from across the United States. Details of the sampling strategy, data abstraction procedures, and outcomes in the ACS NSQIP have been documented extensively.5 Data reported by the ACS NSQIP are deidentified; therefore, this study is exempt, per the University of Michigan institutional review board policies.

The Centers for Medicare & Medicaid Services physician procedure fee schedule final rule for the year 2018 was used for this cross-sectional study.6 These rules were placed in the Federal Register in November 2017. Given the time lag between the RUC recommendations and the Centers for Medicare & Medicaid Services finalizing the procedural RVUs, the survey period of these rules likely represents the surveys performed in 2016, which corresponds to the ACS NSQIP file used in this study.

Operative times in the ACS NSQIP file represent incision to skin closure time in minutes. This variable is collected by the data abstractors from the actual operative notes. To eliminate cases in which errors might have occurred during data abstraction or unusual cases with high complexity resulting in long procedure time, we excluded cases in which the operative time was less than the 5th percentile or more than the 95th percentile for each procedure, as identified by its Current Procedural Terminology (CPT) code.
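
The percentile-based exclusion described above can be sketched as follows. This is a minimal illustration in Python and not the authors' code (the text mentions Stata for the regression step). The function name `trim_by_cpt`, the (CPT code, minutes) input layout, the inclusive percentile method, and the handling of CPT codes with fewer than two cases are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import quantiles


def trim_by_cpt(cases, lower_pct=5, upper_pct=95):
    """Keep cases whose operative time lies between the lower and upper
    percentile of all cases sharing the same CPT code.

    cases: list of (cpt_code, operative_minutes) tuples (hypothetical layout).
    """
    # Group operative times by CPT code.
    times_by_cpt = defaultdict(list)
    for cpt, minutes in cases:
        times_by_cpt[cpt].append(minutes)

    # Compute the percentile bounds once per CPT code.
    bounds = {}
    for cpt, times in times_by_cpt.items():
        if len(times) < 2:
            # Assumption: percentiles are undefined for a single case,
            # so such codes are left untrimmed here.
            continue
        cuts = quantiles(times, n=100, method="inclusive")  # 99 cut points
        bounds[cpt] = (cuts[lower_pct - 1], cuts[upper_pct - 1])

    # Exclude cases strictly below the lower or strictly above the upper bound.
    return [
        (cpt, minutes)
        for cpt, minutes in cases
        if cpt not in bounds or bounds[cpt][0] <= minutes <= bounds[cpt][1]
    ]
```

Computing the bounds within each CPT code, rather than across the whole file, keeps a short procedure from being flagged as an outlier simply because other procedures take longer.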

Each case in the ACS NSQIP file has a primary procedure denoted by its CPT code and associated work RVUs. In addition to the primary procedure, additional procedures performed by the same surgeon are listed under “Other CPT codes” along with their work RVUs. For example, a patient undergoing “partial removal of colon, low pelvic anastomosis” as the primary procedure would have 44145 as the primary CPT code. The ACS NSQIP file also provides the associated work RVUs (28.58) for this procedure. If the patient also underwent oophorectomy by the same surgeon, that procedure would be listed under “Other CPT 1” as 57720 with 12.16 work RVUs. The Centers for Medicare & Medicaid Services reimburses the primary procedure at its full work RVU value, but each subsequent code is reimbursed at 50% of its work RVU. In this instance, the total work RVUs would be 28.58+(12.16/2)=34.66.
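
The multiple-procedure rule in this paragraph reduces to a one-line calculation. The sketch below is illustrative only; the function name `total_work_rvus` and its parameters are hypothetical, and the 50% rate is taken from the description above rather than from any fee schedule file.

```python
def total_work_rvus(primary_rvu, other_rvus=(), secondary_rate=0.5):
    """Total work RVUs for one surgeon's case.

    The primary procedure counts at full value; each additional procedure
    by the same surgeon counts at `secondary_rate` of its work RVUs.
    """
    return primary_rvu + secondary_rate * sum(other_rvus)


# Worked example from the text: 28.58 + (12.16 / 2) = 34.66
example_total = total_work_rvus(28.58, [12.16])
```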

However, if the oophorectomy was not performed by the same surgeon, the file contains “concurrent CPT” codes. These codes reflect procedures performed by a different surgeon. The ACS NSQIP file only provides the total operative time for the procedure and not for each surgeon. We therefore excluded cases in which any concurrent CPT codes were listed, because it is impossible to determine the time taken by each individual surgeon from the total time provided in the file.

We divided the calculated total RVUs by the operative time (converted to hours) to generate our primary outcome metric: RVUs reimbursed per hour (RVU/h). Aggregate measures of RVU per hour were calculated for each surgical specialty.
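
As a hedged sketch of this step, the primary outcome and a per-specialty aggregate could be computed as below. The function names and the (specialty, total RVUs, minutes) record layout are assumptions; the median is used as the aggregate because the Results report median RVU per hour.

```python
from collections import defaultdict
from statistics import median


def rvu_per_hour(total_rvus, operative_minutes):
    """RVUs per hour: total work RVUs divided by operative time in hours."""
    if operative_minutes <= 0:
        raise ValueError("operative time must be positive")
    return total_rvus / (operative_minutes / 60)


def median_rvu_per_hour_by_specialty(cases):
    """Median RVU per hour for each specialty.

    cases: iterable of (specialty, total_rvus, operative_minutes) tuples
    (hypothetical layout).
    """
    grouped = defaultdict(list)
    for specialty, total_rvus, minutes in cases:
        grouped[specialty].append(rvu_per_hour(total_rvus, minutes))
    return {specialty: median(values) for specialty, values in grouped.items()}
```

For instance, a case with 30 total work RVUs and a 120-minute operative time yields 15 RVUs per hour.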

Multivariate regression analysis was performed to ensure that the discrepancy in RVU per hour was not a reflection of patient complexity. Two variables were used for risk adjustment. First, we used the ACS NSQIP-provided morbidity probability in the 2016 participant user file. This probability is already available in the ACS NSQIP data set and gives each patient's probability of morbidity, calculated from complex risk models created from the entire ACS NSQIP data set. In addition, we used the length of hospital stay (as a continuous variable) in the regression model. We used these two variables because they reflect the complexity of a case postoperatively, whereas surgical time reflects the complexity during the operation. We then used Stata's margins command to generate predicted probabilities.

The Centers for Medicare & Medicaid Services Physician Fee Schedule Relative Value Files provide several time estimates based on the RUC recommendations for each CPT code. Reported times include preoperative evaluation time, scrub time, patient positioning time, and the intraoperative time. Overreported time was calculated as the difference between the Centers for Medicare & Medicaid Services file estimated intraoperative time and the ACS NSQIP actual operative time, for each procedure. For example, if a procedure is estimated to last 45 minutes in the Centers for Medicare & Medicaid Services file but the actual operative time is 40 minutes, the overreported time for this case would be 5 minutes. A negative value would represent a case that took longer than the time estimated by the Centers for Medicare & Medicaid Services file.
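
The definition of overreported time can be written out directly. The sketch below is a minimal illustration; the function names, the dictionary of estimated times keyed by CPT code, and the use of a median over cases are assumptions chosen to mirror the median values reported in the Results.

```python
from statistics import median


def overreported_minutes(cms_intraop_minutes, nsqip_minutes):
    """Estimated intraoperative time minus recorded operative time.

    Positive: the estimate exceeds the recorded time (overreported).
    Negative: the case took longer than the estimate.
    """
    return cms_intraop_minutes - nsqip_minutes


def median_overreported(cases, cms_times):
    """Median overreported time across cases.

    cases: iterable of (cpt_code, nsqip_minutes) tuples (hypothetical layout).
    cms_times: dict mapping cpt_code to estimated intraoperative minutes.
    """
    return median(
        overreported_minutes(cms_times[cpt], minutes) for cpt, minutes in cases
    )


# Worked example from the text: estimate 45 minutes, actual 40 minutes.
example = overreported_minutes(45, 40)  # 5 minutes overreported
```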

To calculate the change in procedural reimbursement, we calculated the RVU difference for each procedure by comparing reimbursement rates between the 2014 and 2018 Centers for Medicare & Medicaid Services Physician Fee Schedule Relative Values. We then calculated an increase (or decrease) in the overall reimbursement rate by the surgical specialty.

RESULTS

A total of 901,917 cases were included in the analysis after exclusions. Surgeries from the following disciplines were available for analysis: general surgery, orthopedics, gynecology, urology, vascular surgery, neurosurgery, otolaryngology, plastic surgery, and thoracic surgery. Figure 1 provides the flowchart diagram of patients excluded at each stage of the query. The majority of exclusions were for procedures with more than one type of surgeon (n=35,799) and for procedures that are performed too infrequently to inform the study question and hypothesis (n=58,097). Table 1 provides the number of cases per surgical specialty included in this study as well as their relative percentage and cumulative percentage. The majority of the cases were performed by general surgeons (45.5%) and orthopedic surgeons (24.3%). Gynecologic surgeons accounted for only 7.8% of procedures, and in this analysis, thoracic surgeons performed the fewest procedures (0.9%).

Fig. 1. Flowchart of surgical cases included for analysis. Uppal. Operative Times and Procedural Relative Value Units. Obstet Gynecol 2021.

Table 1. Total Number of Cases Included per Surgical Specialty

There was a wide and statistically significant variation in median RVU per hour among the nine surgical specialties. Orthopedics (14.3), neurosurgery (12.9), and general surgery (12.1) had the highest RVU per hour, whereas gynecology (10.2), plastic surgery (9.5), and otolaryngology (9) had the lowest (P<.001 for all comparisons, Fig. 2).

Fig. 2. Median relative value units based on surgical specialty. ENT, otolaryngology.

Surgeon specialty also was significantly associated with RVU reimbursement. Risk-adjusted RVU reimbursement per hour spent in the operating room is shown in Figure 3. For their average cases, neurosurgery was reimbursed at the highest rate of 26.65 (95% CI 26.51–26.65), with otolaryngology reimbursed at the lowest rate of 14.84 (95% CI 14.74–14.94). Gynecology fell in the middle of the specialties analyzed with 17.95 (95% CI 17.90–18.00).

Fig. 3. Risk-adjusted relative value reimbursement per hour based on surgical specialty. ENT, otolaryngology.

The analysis of how much time each surgical specialty overreported their surgical time on the RUC, compared with the ACS NSQIP surgical time, is reported in Figure 4. General surgery had the highest median overreported operative time (+26 minutes), followed by neurosurgery (+23.5 minutes), and urology (+20 minutes). Gynecology had the lowest median overreported operative time among the nine surgical specialties analyzed at +5 minutes. Overreporting of the operative time by specialty strongly correlated to higher RVU per hour (r=0.87, P=.002). Details of RUC time, compared with the median ACS NSQIP times, for the top three surgical procedures for each specialty are provided in Table 2.

Fig. 4. Median overreported time vs Centers for Medicare & Medicaid Services 2018 operative time values. ENT, otolaryngology.

Table 2. Comparison of Top Three Procedures by Each Specialty

DISCUSSION

The use of small surveys of physicians to estimate service time for procedures raises legitimate concerns about the surveys’ accuracy. For example, one study of RUC time estimates noted that there were only 58 respondents per procedure, with an overall response rate of 21%. This resulted in time estimates that were nearly 20% different than objective benchmark times.7 The response rate for the 2015 survey was only 2.2%.8 Because the Centers for Medicare & Medicaid Services historically accepts approximately 90% of the RUC recommendations,9 these surveys, and their accuracy, are a critical component in deciding alterations for RVUs.

Physician time estimates can be biased by numerous factors: academic practice with learners compared with private practice, time in practice, surgical volume, and the complexity of patient populations, among others. Although there may be a subconscious bias toward inflating surgical times, there is no evidence that operative time inflation is purposeful. Additionally, estimating time is simply a difficult skill. In 2020, the medical community and policy makers have much better data sources available for surgical time than physician estimates, and these sources can eliminate subjectivity altogether. The ACS NSQIP data set offers a blended benchmark of incision-to-closure procedural time that, although it may not look like every surgeon's individual practice, at least represents a reasonable average estimate across a wide swath of practices and patient populations. Certainly, this is more representative than estimates from the few who respond to the surveys.

From the very beginning of the current RVU system, procedures for women have been undervalued. In 1996, just four years after the current system was in place, researchers demonstrated that obstetric and gynecologic services received lower valuations than services for urology and general surgery.10 Goff et al examined urologic oncology procedures in comparison with gynecologic oncology procedures and created similar matched procedure pairs. For the 24 matched procedures, 19 of the urologic procedures received a higher reimbursement (79%, P=.004). Only two of the 24 received equivalent reimbursement. Overall, male gender-related surgical procedures generated RVUs that were 50% greater than female gender-related procedures.11

In 2017, Benoit and colleagues12 explored whether value and reimbursement disparities for female oncology-related procedures persisted in the 20 years since Goff et al's report. They compared 17 pairs of “minor” matched procedures from urologic oncology and gynecologic oncology, and 33 pairs of “major” matched procedures. Of the 50 procedure pairs, 36 (72%) of the male-specific procedures received higher RVU levels (P=.026). This led to 42 (84%) of the male-specific procedures being reimbursed at a higher level (P=.003). Our current report finds that this discrepancy persists, with urology surgeons reimbursed at an average of 0.5 RVU more per hour than their gynecology colleagues. Finally, Chan et al7 analyzed differences in surgical time estimates from the RUC surveys and the ACS NSQIP data files. They reported that between 2011 and 2015, using RUC surveys resulted in payments $20 million lower for obstetric–gynecologic procedures than if procedure times from ACS NSQIP had been used. In that study, it was estimated that orthopedic surgery (+$160 million) and urologic surgery (+$40 million) received significant overpayments due to the use of the RUC surveys, compared with ACS NSQIP data.7

Childers et al13 also assessed whether objective work measures from the ACS NSQIP data files were associated with assigned work RVUs and evaluated for discrepancies. Their model incorporated ACS NSQIP operative time, postoperative hospital length of stay, 30-day readmission rates, and 30-day reoperation rates. They found that 49% of gynecology surgeries received lower than expected work RVUs but that assigned work RVUs for gynecology surgery were on par with those for general surgery procedures. In that study, orthopedics, urology, plastic surgery, and otolaryngology surgeries were assigned work RVUs that were lower than those of the general surgery procedures. There are several reasons our results differ from that paper. First, Childers et al used only procedures with a single CPT code for their entire analysis and risk adjustment model, whereas we used all the secondary CPT codes used in an operation by a single surgeon and assigned total RVUs using Centers for Medicare & Medicaid Services methodology. By excluding procedures with secondary CPT codes, we estimate that Childers et al excluded nearly 40% of potentially evaluable cases. Second, we did not use readmission or reoperation as a measure of surgical complexity in our model because neither of these measures is an accurate reflection of work performed by the surgeon at the time of operation.14

Strengths of this study include the objective data and operative times from the ACS NSQIP files and the Centers for Medicare & Medicaid Services. Also, the data represent a large representative sample of multiple different types of surgical practices and, therefore, are appropriately generalizable to the average experience across the country. Limitations include that the data cannot identify contributing factors to work RVUs other than physician-reported time. Additionally, other RVU components (practice expense and malpractice) were not included in the analysis.

This report serves as more evidence that the RUC procedure times should be based on available objective registry data rather than subjective time estimates from surveys completed by a small number of participants with poor response rates. This assertion is supported by a pilot study performed and published by the Centers for Medicare & Medicaid Services that identified “differentially inflated time and work values throughout the [Physician Fee Schedule], causing inconsistently inaccurate payment rates.”15 This report also lends additional data to prior observations that, since the implementation of the current physician procedural reimbursement system, procedures for women have been undervalued and reimbursed inequitably. Numerous studies spanning decades demonstrate not only that these discrepancies have persisted but that in some instances they may be worsening. We acknowledge that incorporating new systems into the existing structure may be challenging and that considering new structural frameworks may require considerable time and effort; however, allowing persistent inaccuracies and disparities to continue does not produce the high-quality, reliable, and reproducible system of care that health care professionals and patients deserve.

REFERENCES

1. Uppal S, Shahin MS, Rathbun JA, Goff BA. Since surgery isn't getting any easier, why is reimbursement going down? An update from the SGO taskforce on coding and reimbursement. Gynecol Oncol 2017;144:235–7. doi: 10.1016/j.ygyno.2016.06.008
2. Wynn BO, Burgette LF, Mulcahy AW, Okeke EN, Brantley I, Iyer N, et al. Development of a model for the validation of work relative value units for the Medicare physician fee schedule. Rand Health Q 2015;5:5.
3. Burgette LF, Mulcahy AW, Mehrotra A, Ruder T, Wynn BO. Estimating surgical procedure times using anesthesia billing data and operating room records. Health Serv Res 2017;52:74–92. doi: 10.1111/1475-6773.12474
4. McCall N, Cromwell J, Braun P. Validation of physician survey estimates of surgical time using operating room logs. Med Care Res Rev 2006;63:764–77. doi: 10.1177/1077558706293635
5. American College of Surgeons. About ACS NSQIP. Accessed October 25, 2019. Available at: https://www.facs.org/quality-programs/acs-nsqip/about
6. Centers for Medicare & Medicaid Services. CY 2018 revisions to payment policies under the physician fee schedule and other revisions to Part B. Accessed October 29, 2018. Available at: https://www.cms.gov/Medicare/Medicare-Fee-for-Service-Payment/PhysicianFeeSched/PFS-Federal-Regulation-Notices-Items/CMS-1676-F.html
7. Chan DC, Huynh J, Studdert DM. Accuracy of valuations of surgical procedures in the Medicare fee schedule. N Engl J Med 2019;380:1546–54. doi: 10.1056/NEJMsa1807379
8. Government Accountability Office. GAO-15-434: Medicare physician payment rates: better data and greater transparency could improve accuracy. Accessed March 2, 2020. Available at: https://www.gao.gov/assets/680/670366.pdf
9. Laugesen MJ, Wada R, Chen EM. In setting doctors' Medicare fees, CMS almost always accepts the relative value update panel's advice on work values. Health Aff 2012;31:965–72. doi: 10.1377/hlthaff.2011.0557
10. Cherouny P. Underreimbursement of obstetric and gynecologic invasive services by the resource-based relative value Scale. Obstet Gynecol 1996;87:328–31. doi: 10.1016/0029-7844(95)00442-4
11. Goff BA, Muntz HG, Cain JM. Comparison of 1997 Medicare relative value units for gender-specific procedures: is Adam still worth more than Eve? Gynecol Oncol 1997;66:313–9. doi: 10.1006/gyno.1997.4775
12. Benoit MF, Ma JF, Upperman BA. Comparison of 2015 Medicare relative value units for gender-specific procedures: gynecologic and gynecologic-oncologic versus urologic CPT coding. Has time healed gender-worth? Gynecol Oncol 2017;144:336–42. doi: 10.1016/j.ygyno.2016.12.006
13. Childers CP, Dworsky JQ, Russell MM, Maggard-Gibbons M. Association of work measures and specialty with assigned work relative value units among surgeons. JAMA Surg 2019;154:915–21. doi: 10.1001/jamasurg.2019.2295
14. Uppal S, Spencer RJ, Rice LW, Del Carmen MG, Reynolds RK, Griggs JJ. Hospital readmission as a poor measure of quality in ovarian cancer surgery. Obstet Gynecol 2018;132:126–36. doi: 10.1097/AOG.0000000000002693
15. Zuckerman S, Merrell K, Berenson R, Mitchell S, Upadhyay D, Lewis R. Collecting empirical physician time data: piloting an approach for validating work relative value units. Accessed February 15, 2021. Available at: https://www.cms.gov/Medicare/Medicare-Fee-for-Service-Payment/PhysicianFeeSched/Downloads/Collecting-Empirical-Physician-Time-Data-Urban-Report.pdf

© 2021 by the American College of Obstetricians and Gynecologists. Published by Wolters Kluwer Health, Inc. All rights reserved.