Modifications in physician billing practices occur in response to payment incentives, sometimes independently from measurable changes in the actual care being delivered.1,2 Ideally, such changes reflect improvements in physician billing compliance that lead to a decrease in the amount of unrealized compensation for appropriately delivered care. However, previous investigators have suggested that associations between billing practices and payment incentives are not always explainable by increased billing fidelity. For example, the occurrence of billing patterns consistent with deliberate upcoding—possibly in pursuit of higher reimbursements—has been rigorously demonstrated in a forensic analysis of general outpatient office visits covered under Medicare Part B.3 In addition, the phenomenon of “code creep” has been observed, in which encounters that are reimbursed according to care intensity sometimes manifest an otherwise unexplained trend increase in reported care intensity over time.4,5
Within the field of anesthesiology, private US insurers typically reimburse for intraoperative anesthesia services as a function of the sum of base units, modifiers for intensity of care delivered, and time units, with the per-unit dollar conversion varying depending on negotiated contracts between payers and provider groups. Care intensity is determined, in part, by a modifier that accounts for increasing ASA physical status (ASAPS) scores recorded by the primary anesthesia provider.6 Under the typical arrangement, conventional intraoperative care for patients considered to have a more severe comorbid burden (i.e., ASAPS scores of III–V) commands additional reimbursement compared with care for relatively healthy patients scored as ASAPS I or II. This payment model with a step increase for patients above ASAPS II, although common among private payers, contrasts with reimbursements under Medicare, in which no increases in anesthesiology payments are offered for higher ASAPS scores.
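As a rough illustration of the payment arithmetic described above, the following sketch computes a hypothetical anesthesia payment. Only the structure follows the text (base units plus physical status modifier units plus time units, multiplied by a negotiated conversion factor, with no credit for ASAPS under Medicare). The modifier table, the 15-minute time unit, and all names are assumptions made for illustration and are not taken from any payer contract or from the study data.

```python
# Hypothetical modifier units per ASAPS class (assumed for illustration only):
# no extra units for ASAPS I-II, a step increase from ASAPS III upward.
ASAPS_MODIFIER_UNITS = {1: 0, 2: 0, 3: 1, 4: 2, 5: 3}


def anesthesia_payment(base_units, minutes, asaps, conversion_factor,
                       pays_asaps_modifier=True, minutes_per_time_unit=15):
    """Return (base + modifier + time units) x conversion factor.

    pays_asaps_modifier=False mimics a payer, such as Medicare in the text,
    that offers no additional payment for higher ASAPS scores.
    """
    time_units = minutes / minutes_per_time_unit
    modifier_units = ASAPS_MODIFIER_UNITS[asaps] if pays_asaps_modifier else 0
    return (base_units + modifier_units + time_units) * conversion_factor


# Same hypothetical case scored ASAPS II versus ASAPS III.
private_asaps_2 = anesthesia_payment(6, 90, 2, conversion_factor=60.0)
private_asaps_3 = anesthesia_payment(6, 90, 3, conversion_factor=60.0)
no_modifier_asaps_3 = anesthesia_payment(6, 90, 3, conversion_factor=60.0,
                                         pays_asaps_modifier=False)
```

Under these assumed values, recording ASAPS III rather than II changes the payment only when the payer credits the modifier, which is the incentive difference the study design exploits.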
The percentage of the US population covered by Medicare is known to increase suddenly and substantially at age 65 years,7 and the financial incentive for coding ASAPS as greater than II is lost at the same age. Taking advantage of this shift, we designed a quasi-experimental study using a regression discontinuity design8–10 to determine whether there was evidence for systematic upcoding of ASAPS in response to payer incentives that ceases at age 65 years. A quasi-experimental study typically refers to any study method seeking to infer causal relationships in the absence of random assignment.11,12 As further described later, the regression discontinuity design has been commonly used in the social sciences to observe changes in an outcome (in this case, ASAPS rankings) that occur on either side of a relatively sharp cut point (in this case, age 65 years, at which the proportion of patients insured by Medicare rises abruptly). We hypothesized that, coincident with the onset of widespread Medicare eligibility, a discontinuity in ASAPS scores would be observed in the case of the nondeferrable conditions of hip, femur, or lower leg fracture repair. Such discontinuity would be observed after controlling for the underlying trend of increasing ASAPS scores with increasing age, as well as any effect of sex.
This study analyzed a broadly available, de-identified database that does not contain identifiable health information and was therefore exempt from IRB approval. With assistance from the Anesthesia Quality Institute (AQI), data were abstracted from the National Anesthesia Clinical Outcomes Registry (NACOR) Participant User File 2014, Quarter 1. The NACOR data set has been used previously in several reports describing national anesthesia usage patterns and outcomes.13–15 We limited our analysis to patients 50 to 79 years of age undergoing surgery for the treatment of a fractured hip, femur, or lower leg because these are overwhelmingly nondeferrable conditions.7 This group was identified based on the presence of Clinical Classification Software (CCS) codes 146 or 147. CCS codes, formerly known as Clinical Classifications for Health Policy Research,16 are used to identify clinically relevant diagnostic groups in a variety of health services research contexts and have been shown to compare favorably with other widely used diagnostic and comorbidity classification systems.17,18 The full extraction schema for the present data analysis, including an accounting of missingness, is illustrated in Figure 1.
We restricted the analysis to the nondeferrable repair of hip, femur, and lower leg fractures because previous research has shown that, among deferrable conditions, a large increase in the number of care visits occurs in the United States at age 65 years in response to Medicare eligibility and partially in response to the decline in the percentage of the population that is uninsured.7 Figures 2 and 3 demonstrate the importance of using a nondeferrable condition within the NACOR data set in the present analysis: Figure 2 shows the number of patients by age and ASAPS score within our data set undergoing the nondeferrable repair of a hip, femur, or lower leg fracture (CCS codes 146 or 147), whereas Figure 3 shows the number of patients by age and ASAPS score within NACOR undergoing the deferrable procedure of cataract repair (CCS 15). For this deferrable condition, the number of patients undergoing cataract surgery shows a large step increase after the onset of Medicare eligibility at age 65 years (Fig. 3). The step increase in the number of people seeking treatment for this deferrable condition raises the likelihood that the characteristics of people seeking cataract repair at age 65 years and older will be systematically different from those of people younger than 65 years undergoing the same procedure. Indeed, the literature suggests that patients who defer their health care usage until the age of 65 are typically those who previously lacked insurance and for whom out-of-pocket expenses correspondingly decrease substantially after Medicare enrollment.19,20 As out-of-pocket costs increase for many employer-sponsored plans, other cost-conscious patients also may defer surgeries when possible until the onset of Medicare coverage.
If we included a surgical procedure that could be deferred until age 65 years, this situation would violate an important assumption of our regression discontinuity design, which requires that, other than the dramatic shift in payer mix at age 65 years that has been previously described7 and the known gradual increase in ASAPS scores that occurs with age and sex, the distribution of the population of patients slightly older than 65 years who undergo the surgical procedure should not systematically differ from that of patients slightly younger in respect to unmeasured factors that determine their ASAPS scores. To the extent that the earlier assumption holds, the degree of regression discontinuity in ASAPS scores associated with Medicare eligibility may reflect the degree of systematic upcoding that occurs in response to payer incentives and that stops in the absence of such incentives. If patients on either side of the age 65 threshold are reasonably similar in respect to unobserved characteristics that would affect the outcome of ASAPS, the resulting quasi-experimental population sample would share important characteristics with what is desired in samples produced from a randomized clinical trial.21
An additional assumption of our study design is that age data and the ASAPS groupings of I or II versus III, IV, and V in the study database accurately reflect the actual clinical data submitted by providers for reimbursement and that the data set was minimally contaminated by unanticipated subsets of patients with leg fractures that could theoretically be deliberately deferred until the attainment of Medicare eligibility. Accordingly, the AQI advised that case inclusion should be limited to the following target facilities: university hospitals, large community hospitals (>500 beds), medium community hospitals (100–500 beds), small community hospitals (<100 beds), and attached surgery centers. This excluded specialty hospitals, free-standing surgery centers, pain clinics, and surgeon offices, as these facilities would not be likely to treat significant numbers of nondeferrable leg fractures. To the extent that such facilities may treat leg fractures, clinical experience would suggest that the excluded population is likely to represent a markedly different group of patients compared with those treated in the acute care hospitals and attached facilities that were included.
Data for age and ASAPS scores from the above target facilities were extracted for analysis. To assess for regression discontinuity in ASAPS scores, we created a dichotomous variable of age ≥65 years as the indicator variable of discontinuity. We then estimated the effect of this variable on the level of ASAPS scores by ordinal logistic regression, including the following independent variables in the primary model: age centered at 65 years, age ≥65 years, sex, and the first-order interaction between centered age and age ≥65 years. Because of a paucity of ASAPS V patients and their likely heterogeneity with the remainder of the sample, we discarded this group of 25 patients before analysis. We confirmed graphically that the proportionality assumption was not violated. The logit of the ASAPS score was shown to be linearly associated with age, and polynomials in age were not considered in the final ordinal logistic regression analyses.
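The covariates of the primary model can be sketched as follows. This is a minimal illustration and not the SAS code used in the study: the function names, the coefficient values in the usage lines, and the sign convention (logit P(ASAPS ≤ k) = threshold_k − η, so that a larger η shifts probability toward higher ASAPS classes) are assumptions chosen for the example.

```python
import math

CUTOFF_AGE = 65  # age of widespread Medicare eligibility


def rd_covariates(age, is_female):
    """Build the four primary-model covariates for one patient."""
    centered_age = age - CUTOFF_AGE
    age_ge_65 = 1 if age >= CUTOFF_AGE else 0  # discontinuity indicator
    return {
        "centered_age": centered_age,
        "age_ge_65": age_ge_65,
        "female": 1 if is_female else 0,
        # Lets the age slope differ on the two sides of the cutoff.
        "centered_age_x_age_ge_65": centered_age * age_ge_65,
    }


def linear_predictor(covariates, coefficients):
    """Sum of coefficient x covariate over the model terms."""
    return sum(coefficients[name] * value for name, value in covariates.items())


def category_probabilities(eta, thresholds):
    """P(ASAPS = k) for k = I..IV under a proportional-odds model.

    thresholds must be ascending; three thresholds give four categories.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
    cumulative.append(1.0)
    probabilities, previous = [], 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical coefficients, for illustration only (not study estimates).
example_coefficients = {"centered_age": 0.03, "age_ge_65": 0.0,
                        "female": 0.1, "centered_age_x_age_ge_65": 0.0}
eta = linear_predictor(rd_covariates(70, True), example_coefficients)
probs = category_probabilities(eta, thresholds=[-2.0, 0.0, 2.0])
```

In this parameterization, a coefficient on age_ge_65 that differs from zero would appear as a jump in the ASAPS distribution exactly at age 65 years, over and above the smooth age trend.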
The significance of the regression coefficient for the age ≥65 variable was used to decide whether or not there was a discontinuity effect of Medicare eligibility on ASAPS scores. Next, to address the possibility that “gaming the system” is present primarily in private practices where physician salaries may be more directly affected by billing productivity compared with university systems, we conducted the same primary analysis in the 2 subgroups of university hospitals and in non-university hospitals. All data analyses were performed using SAS v9.3 (Cary, NC), and a 2-sided P value of <0.05 was considered statistically significant.
To reduce the potential impact of age at the 2 tails of its distribution on the ASAPS scores, the primary analysis was limited to patients whose ages ranged from 50 to 79 years. As a first sensitivity analysis, we refit the ordinal logistic regression analyses to observations within a smaller bandwidth of 5 and then 10 years to either side of age 65 years (i.e., age 60–69 years and age 55–74 years). Second, we also fit a simpler logistic regression model in which we recoded the multilevel ASAPS score outcome into a binary variable (i.e., ASAPS I–II versus III–IV). Third, we sought to measure whether the extent of discontinuity varied among different facilities and regions. We therefore modified the primary ordinal regression analysis to include the region of the country and the specific facility identity as random effects. For this third sensitivity analysis, we included only the subset of facilities that performed at least 100 cases each. For this model, we used the SAS PROC GLIMMIX procedure, in which a random intercept effect was modeled to take into consideration the hierarchical structure of our data (i.e., patients were treated at different anesthesia facilities nested in different US regions). Last, to test the validity of our study design for detecting regression discontinuity at the age 65 Medicare threshold, we conducted simulations of various rates of occurrence of deliberate upcoding of ASAPS scores for eligible cases younger than 65 years in the data set. The simulations used the same logistic regression model as in the second sensitivity analysis above but enriched the data set with increasing proportions of randomized upcoding ranging from 0.5% to 4% of eligible cases. At each level of upcoding prevalence, 1000 independent iterations of the simulation were performed, and the P value (±95% confidence interval) of the age ≥65 variable was calculated.
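The bandwidth restriction and the upcoding enrichment step can be sketched as below. This is an illustrative sketch only. It assumes that "eligible" cases are those younger than 65 years scored ASAPS I or II and that an upcoded case is recoded to ASAPS III; neither detail is specified in the text. The record layout and function names are likewise hypothetical, and the model refit and P value calculation performed at each iteration are omitted.

```python
import random


def within_bandwidth(cases, bandwidth, cutoff=65):
    """Keep cases within `bandwidth` years of the cutoff (e.g., 60-69 for 5)."""
    return [c for c in cases if cutoff - bandwidth <= c["age"] < cutoff + bandwidth]


def simulate_upcoding(cases, rate, rng, cutoff=65):
    """Return a copy of `cases` with a random share of eligible cases upcoded.

    Assumption: eligible = age below the cutoff and ASAPS I or II;
    upcoding recodes the case to ASAPS III.
    """
    simulated = [dict(c) for c in cases]  # leave the input untouched
    eligible = [i for i, c in enumerate(simulated)
                if c["age"] < cutoff and c["asaps"] <= 2]
    n_upcoded = round(rate * len(eligible))
    for i in rng.sample(eligible, n_upcoded):
        simulated[i]["asaps"] = 3
    return simulated


def binary_outcome(case):
    """Recode ASAPS into the binary outcome: 1 for III-IV, 0 for I-II."""
    return 1 if case["asaps"] >= 3 else 0


# Toy data: 10 patients at each age from 50 to 79 years, all scored ASAPS II.
toy_cases = [{"age": age, "asaps": 2} for age in range(50, 80) for _ in range(10)]
enriched = simulate_upcoding(toy_cases, rate=0.04, rng=random.Random(0))
```

Repeating the enrichment with independent random draws at each upcoding rate, and refitting the binary logistic model each time, would give the distribution of P values for the age ≥65 variable that the simulations summarize.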
Description and Representativeness of the Data
A total of 59,559 cases were extracted for the selected surgical procedure and age range. As shown in Figure 1, ASAPS score and age were available for 50,437 eligible cases. The distribution of ASAPS by age is illustrated in Figure 2. Of the 50,437 eligible cases, 587 (1.2%) were removed from the analyses because of missing data on sex, resulting in a cohort of 49,850 patients with complete data on sex and ASAPS score available for inclusion in the final models.
For the 49,850 included patients, the mean age was 64.2 years (SD 8.6), with a median age of 64 years (interquartile range, 57–72); 62.3% were female, and the modal ASAPS (n = 20,940) was III. Because a very small number of subjects were available who scored as ASAPS V (n = 25), we removed these 25 patients from the analysis.
Regarding representativeness of the data, the data set comprises cases from the following types of institutions: university hospitals (n = 3570; 7.2%), large community hospitals >500 beds (n = 14,212; 28.5%), medium community hospitals between 100 and 500 beds (n = 28,461; 57.1%), small community hospitals (n = 2907; 5.8%), and attached surgery centers (n = 700; 1.4%). Geographically, the data set comprises patients from throughout the United States, including the Northeast (n = 7250; 14.5%), the Midwest (n = 16,128; 32.4%), the South (n = 19,194; 38.5%), and the West (n = 7278; 14.6%).
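The cohort counts reported above can be cross-checked with a few lines of arithmetic. The counts are copied from the text; the dictionary keys and the helper name are illustrative.

```python
# Counts as reported in the text.
eligible_cases = 50_437
missing_sex = 587
cohort = eligible_cases - missing_sex

facility_counts = {
    "university": 3570,
    "large_community": 14_212,
    "medium_community": 28_461,
    "small_community": 2907,
    "attached_surgery_center": 700,
}
region_counts = {
    "Northeast": 7250,
    "Midwest": 16_128,
    "South": 19_194,
    "West": 7278,
}


def percentages(counts):
    """Share of each group, in percent, rounded to 1 decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


facility_pct = percentages(facility_counts)
region_pct = percentages(region_counts)
```

Both the facility and the region counts sum to the 49,850 patients in the cohort, and the rounded shares agree with the percentages given in the text.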
Analyses of Regression Discontinuity
Three models were tested in the primary ordinal logistic regression: the first using the entire data set as described earlier, the second including only university hospitals, and the third including only non-university hospitals. In each of these models, neither the age ≥65 variable (P = 0.68, 0.39, and 0.51, respectively) nor the first-order interaction term of age ≥65 × centered age (P = 0.66, 0.99, and 0.53, respectively) was a statistically significant predictor of the odds of a change in the ASAPS score outcome when both variables were included.
Because the centered age × age ≥65 interaction term was not statistically significant, we subsequently excluded this term from our models to determine whether the main effect of the age ≥65 variable was significant in predicting ASAPS scores. In parallel with the above 3 models, we again first used the entire data set (Table 1), then included only university hospitals (Table 2), and then included only non-university hospitals (Table 3). The age ≥65 variable was again not a significant predictor of ASAPS score (P = 0.71, 0.39, and 0.55, respectively). As shown in Tables 1 to 3, both age and sex were significant predictors in all 3 models, with increasing age and female sex both independently associated with increased odds of higher ASAPS scores (P < 0.0001 for age in the 3 models and P < 0.0001, P = 0.045, and P < 0.0001 for sex in the 3 models, respectively).
Results of Sensitivity Analyses
When multilevel ASAPS scores were reclassified into a binary outcome for analysis, similar results were found; that is, the age ≥65 variable remained a statistically nonsignificant predictor (P = 0.30 and 0.86, respectively). Restricting the ordinal logistic regression analyses to observations within a narrower bandwidth of 5 or 10 years of the cutoff point (i.e., age 60–69 years and age 55–74 years) did not change the statistical inference for the age ≥65 variable (see Table 4). Finally, in a random effects model that accounted for regional and facility-specific correlations in ASAPS scoring, the conclusion again remained unchanged (see Table 5).
Collectively, these analyses provide evidence for a lack of significant discontinuity in the pattern of ASAPS scores at age 65 years within the present data. In our series of simulations of NACOR data that were deliberately enriched with increasing proportions of randomly upcoded cases, the results demonstrated our ability to detect deliberate upcoding of ASAPS scores occurring at rates exceeding 2% of eligible cases younger than 65 years, as indicated by the statistical significance of the age ≥65 variable illustrated in Figure 4.
In this study using the NACOR database, we found no evidence for a significant discontinuity in ASAPS scores in response to Medicare eligibility for the nondeferrable surgical conditions of hip, femur, or lower leg fracture repair. This finding is consistent with the conclusion that, among centers contributing to NACOR, there is no widespread upcoding of ASAPS scores that ceases at age 65 years in association with changes in payer incentives. If deliberate upcoding of ASAPS scores is present in our data, the behavior is either too rare or too insensitive to the removal of payer incentives at age 65 years to be evident in the present analysis.
While this study has the advantage of using a geographically and institutionally broad data source, our conclusions may not be generalizable to anesthesia practice groups that do not provide data to NACOR. Moreover, while the results do not support a widespread pattern of ASAPS score manipulation, it may be possible that there are dubious billing practices among a minority of anesthesia practitioners within NACOR or that other alterations of billing records could come to light using alternative methods of forensic analysis. Another limitation of our conclusion is that it is possible that a regression discontinuity is not observed simply because anesthesiologists are not aware of the Medicare rules regarding lack of reimbursement for ASAPS scores greater than II. Although our results are consistent with the conclusion that there is an absence of upcoding among anesthesiologists in response to payer incentives, they are equally consistent with the conclusion that there is a lack of knowledge among anesthesia practitioners regarding which patient bills would be affected by upcoding and which would not. Nevertheless, whether because of virtue, lack of knowledge, or a combination thereof, the lack of an observed regression discontinuity at age 65 years supports the notion that anesthesia providers overall are not engaged in duplicitous ASAPS scoring as a response to payer incentives that ceases when such incentives are no longer present. This conclusion remains valid, despite the known inter- and intraobserver variability in ASAPS rankings. The present study does not require that ASAPS be objective—merely that providers have control of rating ASAPS and that they may or may not exhibit behaviors that seem to be responsive to payment incentives in the course of these ratings.
Along with its use of a broad and diverse national database, this study has several strengths that stem from the regression discontinuity design itself. Specifically, while unmeasured, or poorly measured, confounding is possible in any study, the pseudo-random population created by the study method mitigates—although does not entirely eliminate—the possibility of such factors. Internal validity still rests on the assumption that, after accounting for the measured variables of age and sex, patients will not significantly differ in characteristics related to ASAPS scores within a reasonable proximity to the age 65 Medicare eligibility threshold. Our choice to limit the data set to patients undergoing the overwhelmingly nondeferrable procedures of hip, femur, or lower leg fracture repair helps to maintain this assumption.
An important limitation of our study, however, is that our sample population extends over a larger-than-ideal age range to either side of the age 65 threshold. Particularly at the 2 tails of the age range, the mechanisms of injury and characteristics of the sample population may differ significantly. For example, younger male patients would be more likely to incur high-velocity traumatic femur fractures, whereas older female patients would be more likely to incur the pathologic fractures common among frail elderly patients. Acknowledging this potential source of bias, we mitigated it in 2 ways. First, all models incorporate both age and sex, which should serve to reduce bias arising from systematic differences between younger males and older females. Second, limiting the data analysis to a small age window on either side of the age 65 Medicare eligibility cutoff (15, 10, and 5 years in our several analyses) limits the effect of systematic differences in the populations on either side of the cutoff. In the theoretically ideal regression discontinuity scenario, a large enough data set with precise ages in months would allow for the comparison of patients at age 64 years and 11 months with patients at age 65 years and 1 month. Although such a data set was not available to us, future studies using a larger population size would allow for improvements on the present analysis because sufficient case numbers would enable the analysis to use a narrower window around the age 65 threshold and would enable the detection of a lower prevalence of upcoding, if it indeed were present. In regard to the question of the appropriate trade-off between a narrower sample window and a smaller sample size, our sensitivity analyses should provide some assurance of the robustness of the conclusion, given that the 2 models that explored narrower windows around age 65 years reached the same conclusion as our primary analysis.
In conclusion, this study found no evidence to support the presence of systematic upcoding of ASAPS scores in response to payer incentives that disappear in conjunction with widespread Medicare eligibility.
Name: Robert B. Schonberger, MD, MHS.
Contribution: This author was responsible for study design, conduct of the study, data collection, data analysis, and manuscript preparation.
Attestation: Robert B. Schonberger approved the final manuscript, attests to the integrity of the original data and the analysis reported in this manuscript, and is the archival author.
Conflicts of Interest: None.
Name: Richard P. Dutton, MD, MBA.
Contribution: This author was responsible for data collection, data analysis, and manuscript preparation.
Attestation: Richard P. Dutton approved the final manuscript and attests to the integrity of the original data and the analysis reported in this manuscript.
Conflicts of Interest: Richard P. Dutton is the Director of the Anesthesia Quality Institute (AQI) from which the source database for this study was obtained. Dr. Dutton is also the Chief Quality Officer of the American Society of Anesthesiologists from which significant funding is provided to the AQI. The American Society of Anesthesiologists had no role in the authors’ decision to pursue the present study, the analytic plan, the manuscript produced, or the decision to submit it for publication.
Name: Feng Dai, PhD.
Contribution: This author was responsible for study design, conduct of the study, data collection, data analysis, and manuscript preparation.
Attestation: Feng Dai approved the final manuscript and attests to the integrity of the original data and the analysis reported in this manuscript.
Conflicts of Interest: None.
This manuscript was handled by: Franklin Dexter, PhD, MD.
1. Carter KA, Dawson BC, Brewer K, Lawson L. RVU ready? Preparing emergency medicine resident physicians in documentation for an incentive-based work environment. Acad Emerg Med. 2009;16:423–8
2. Kiran T, Victor JC, Kopp A, Shah BR, Glazier RH. The relationship between financial incentives and quality of diabetes care in Ontario, Canada. Diabetes Care. 2012;35:1038–46
3. Brunt CS. CPT fee differentials and visit upcoding under Medicare Part B. Health Econ. 2011;20:831–41
4. Chan B, Anderson GM, Thériault ME. Fee code creep among general practitioners and family physicians in Ontario: why does the ratio of intermediate to minor assessments keep climbing? CMAJ. 1998;158:749–54
5. Seiber EE. Physician code creep: evidence in Medicaid and State Employee Health Insurance billing. Health Care Financ Rev. 2007;28:83–93
6. Dripps RD, Lamont A, Eckenhoff JE. The role of anesthesia in surgical mortality. JAMA. 1961;178:261–6
7. Card D, Dobkin C, Maestas N. Does Medicare Save Lives? Working Paper #13668. Cambridge, MA: National Bureau of Economic Research; 2007
8. De La Mata D. The effect of Medicaid eligibility on coverage, utilization, and children’s health. Health Econ. 2012;21:1061–79
9. Dugan J, Virani SS, Ho V. Medicare eligibility and physician utilization among adults with coronary heart disease and stroke. Med Care. 2012;50:547–53
10. Buchmueller TC, Grazier K, Hirth RA, Okeke EN. The price sensitivity of Medicare beneficiaries: a regression discontinuity approach. Health Econ. 2013;22:35–51
11. Cook T, Campbell D. Quasi-Experimentation: Design and Analysis Issues for Field Settings. Chicago, IL: Rand McNally; 1979
12. Gliner JA, Morgan GA. Research Methods in Applied Settings: An Integrated Approach to Design and Analysis. New York, NY: Taylor & Francis Group; 2000
13. Dexter F, Dutton RP, Kordylewski H, Epstein RH. Anesthesia workload nationally during regular workdays and weekends. Anesth Analg. 2015
14. Dutton RP. Making a difference: the Anesthesia Quality Institute. Anesth Analg. 2015;120:507–9
15. Nunnally ME, O’Connor MF, Kordylewski H, Westlake B, Dutton RP. The incidence and risk factors for perioperative cardiac arrest observed in the National Anesthesia Clinical Outcomes Registry. Anesth Analg. 2015;120:364–70
16. Cowen ME, Dusseau DJ, Toth BG, Guisinger C, Zodet MW, Shyr Y. Casemix adjustment of managed care claims data using the Clinical Classification for Health Policy Research method. Med Care. 1998;36:1108–13
17. Ash AS, Posner MA, Speckman J, Franco S, Yacht AC, Bramwell L. Using claims data to examine mortality trends following hospitalization for heart attack in Medicare. Health Serv Res. 2003;38:1253–62
18. Radley DC, Gottlieb DJ, Fisher ES, Tosteson AN. Comorbidity risk-adjustment strategies are comparable among persons with hip fracture. J Clin Epidemiol. 2008;61:580–7
19. McWilliams JM, Meara E, Zaslavsky AM, Ayanian JZ. Use of health services by previously uninsured Medicare beneficiaries. N Engl J Med. 2007;357:143–53
20. McWilliams JM, Zaslavsky AM, Meara E, Ayanian JZ. Impact of Medicare coverage on basic clinical services for previously uninsured adults. JAMA. 2003;290:757–64
21. Bor J, Moscoe E, Mutevedzi P, Newell ML, Bärnighausen T. Regression discontinuity designs in epidemiology: causal inference without randomized trials. Epidemiology. 2014;25:729–37