Increasing rates of lumbar fusion surgery,1 combined with prolonged and costly recovery periods for some patients,2 have captured the attention of health care decision makers in pursuit of cost-effective patient care.3 Evidence has shown that the distribution of costs for individuals with back pain is highly disproportionate, with the top 10% costliest individuals accounting for nearly 100% of the dollars spent on inpatient care, 90% of emergency department expenditures, and 87% of outpatient care dollars.4 Given the effect of these high-cost users of health care and the transition of risk from payors to providers, there is growing interest in identifying patients most at risk for incurring high costs.3,5 Customized, coordinated, and innovative care pathways and/or procedures directed at these patients hold the promise of both improved patient quality of life and reduction in healthcare spending.5,6 Such an approach requires data-driven insights to augment clinical expertise for the accurate identification and remediation of the constellation of factors that might contribute to poor clinical and economic outcomes.7
A few published studies have used regression methods to predict high-cost users in spine surgery. These have largely relied upon patient demographics and clinical characteristics such as comorbidities as predictors. DeBerard et al (2003)2 used a regression model to predict compensation and medical costs in a workers’ compensation cohort who underwent lumbar fusion surgery. The study found that a presurgical diagnosis of depression, older age, number of previous low back operations, and lawyer involvement explained 24% of the variation in medical costs. A similar study by Wheeler et al (2012)3 revealed that the time interval between injury and surgery, as well as the assignment of a nurse case manager, were predictive of medical costs. Finally, Passias et al (2018)8 used logistic regression to identify predictors of adverse discharge disposition and associated costs after adult spine deformity surgeries in nationwide and surgeon-created databases.
Various methods and data sources may be useful in elucidating distinct subgroups (profiles) of posterior lumbar spinal fusion surgical candidates who are likely to have high post-surgery costs. Health services utilization data, including administrative claims data, allow for assessment of disease and resource utilization patterns among patients across large, clinically representative “real-world” populations.9 Multidimensional clustering techniques using administrative claims data were applied by Hu et al (2012)5 to segment a diabetic population into groups with similar utilization profiles to identify groups of high utilizers and understand their general characteristics. Schiltz et al (2017)10 used machine-learning techniques (classification and regression trees and random forests) with Medicare claims data to identify specific combinations of chronic conditions, functional limitations, and geriatric syndromes associated with direct medical costs and inpatient utilization. Liao et al (2016)11 used cluster techniques to identify cost patterns for patients with end-stage renal disease who initiated hemodialysis. However, we identified no published studies that have explored associations between composite patterns of preoperative healthcare resource use (HCRU) and postoperative economic outcomes in the spinal fusion population.
The application of cluster analysis techniques used in other therapeutic areas5,10,11 to the spine population could generate insights into opportunities to improve care and contain costs for patients with spinal fusion surgery. The objectives of this study were to use US commercial claims data to: quantify HCRU patterns among patients undergoing one- to two-level posterior spinal fusion; understand the association between composite patterns of HCRU in the 1 year before spinal fusion and 2-year payor costs (insurance payments) after spinal fusion; and identify distinct patient subgroups that might benefit from improved care pathways or surgical innovation.
MATERIALS AND METHODS
The study data source was US administrative claims data extracted from the IBM MarketScan® Commercial database. The database comprises enrollment information, demographic information, and inpatient medical, outpatient medical, and outpatient pharmacy claims data collected from >300 large self-insured US employers and >25 US health plans. The commercial database includes information for individuals who are under the age of 65 and are the primary insured or a spouse or dependent thereof. International Classification of Diseases, 9th and 10th Revision, Clinical Modification (ICD-9-CM and ICD-10-CM) diagnosis and procedure codes, as well as Current Procedural Terminology (CPT) codes, are available on medical claims across multiple settings of care.
The IBM MarketScan Commercial database was queried to identify patients who received posterior lumbar spinal fusion (CPT codes 22612, 22630, or 22633) between 2007 and 2016. Patients were required to have undergone one- to two-level posterior fusion, to be between 18 and 65 years of age, and to have been continuously enrolled in their health plan (medical and pharmacy) for 1 year pre-fusion and 2 years post-fusion. Patients were excluded if they had evidence of a revision fusion procedure, an anterior fusion at the time of posterior fusion surgery, or fusion of more than two spinal levels. Demographics (age and sex), diagnoses, procedure types, and the number of spinal levels fused were recorded for each patient. Both the Elixhauser Comorbidity Index and Functional Comorbidity Index were calculated to characterize comorbidity burden for each patient. The Elixhauser index has demonstrated excellent discriminative performance relative to other composite comorbidity indices,12 and the Functional Comorbidity Index provides further insight into physical function.13
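The inclusion and exclusion criteria above can be expressed as a simple filter. The following is a minimal sketch, not the authors' extraction code: the record layout and every field name are hypothetical (they are not MarketScan column names), and the 365-day and 730-day enrollment thresholds are assumed interpretations of "1 year" and "2 years".

```python
from dataclasses import dataclass
from datetime import date

# CPT codes for posterior lumbar fusion, as listed in the text.
POSTERIOR_FUSION_CPT = {"22612", "22630", "22633"}


@dataclass
class Candidate:
    # All field names are hypothetical; they are not MarketScan column names.
    cpt_code: str
    surgery_date: date
    age: int
    levels_fused: int
    days_enrolled_before: int  # continuous medical + pharmacy enrollment before surgery
    days_enrolled_after: int   # continuous medical + pharmacy enrollment after surgery
    revision_fusion: bool
    anterior_fusion_same_day: bool


def is_eligible(p: Candidate) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    return (
        p.cpt_code in POSTERIOR_FUSION_CPT
        and 2007 <= p.surgery_date.year <= 2016
        and 18 <= p.age <= 65
        and 1 <= p.levels_fused <= 2
        and p.days_enrolled_before >= 365  # assumed: 1 year = 365 days
        and p.days_enrolled_after >= 730   # assumed: 2 years = 730 days
        and not p.revision_fusion
        and not p.anterior_fusion_same_day
    )


# Made-up records for illustration only (not study data).
patients = [
    Candidate("22612", date(2012, 5, 1), 51, 1, 400, 800, False, False),
    Candidate("22633", date(2015, 3, 9), 47, 3, 500, 900, False, False),  # three levels
    Candidate("22630", date(2010, 7, 2), 60, 2, 200, 800, False, False),  # short pre-enrollment
]
cohort = [p for p in patients if is_eligible(p)]
```

Keeping each criterion as a separate condition makes it straightforward to report how many patients each rule excludes.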
One-year preoperative medical claims were categorized into HCRU groups comprising days of service (e.g., office visits or hospital days) for thoracolumbar (TL)-spine-related, other spinal care, and other medical services using a combination of CPT, ICD-9, and ICD-10 codes. Days of service, rather than costs, were used for HCRU categorization given that practitioners are more likely to have access to this information than to payment amounts at the point of care. A total of 187 HCRU categories (10 pharmacy categories and 177 medical categories) were considered. The 177 medical categories included 47 categories of medical services that the authors prespecified (per protocol) based on clinical interpretation of codes on each claim. These were augmented by inclusion of 130 standard service categories available within the native database. Pharmacy claims were further stratified to quantify days supplied for opioids (mild/moderate vs. strong), benzodiazepines, anticonvulsants (e.g., pregabalin), muscle relaxants, central nervous system agents, nonsteroidal anti-inflammatory drugs, antidepressants, and other analgesics, all of which are commonly used among patients with spinal pathology.14 Unadjusted differences in between-group HCRU (days of care or days of drugs supplied) were evaluated using the nonparametric Kruskal–Wallis test.
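As an illustration of the between-group comparison, the sketch below computes the tie-corrected Kruskal–Wallis H statistic using only the Python standard library. The function name and the sample values are invented for this example and are not study data. Because three groups are compared, H is referred to a chi-square distribution with 2 degrees of freedom, for which the upper-tail probability has the closed form exp(−H/2).

```python
import math
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    counts = Counter(pooled)
    avg_rank = {}
    next_rank = 1
    for value in sorted(counts):
        t = counts[value]
        avg_rank[value] = next_rank + (t - 1) / 2
        next_rank += t

    # H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1)
    rank_term = sum(sum(avg_rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(t ** 3 - t for t in counts.values())
    return h / (1 - ties / (n_total ** 3 - n_total))


# Made-up days supplied for three hypothetical clusters (not study data).
clust1 = [0, 10, 30, 30, 45]
clust2 = [60, 90, 120, 150, 180]
clust3 = [200, 240, 270, 300, 365]

h_stat = kruskal_wallis_h([clust1, clust2, clust3])
# With three groups, the chi-square reference distribution has 2 degrees of
# freedom, so the upper-tail probability is exp(-H / 2).
p_value = math.exp(-h_stat / 2)
```

The tie correction matters for utilization data, where many patients share the same count of days (often zero).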
Hierarchical agglomerative clustering with Ward's linkage method (using a dissimilarity matrix composed of between-subject Euclidean distances) was used as the basis for identification of preoperative HCRU clusters within the spinal fusion cohort. This is an unsupervised machine learning approach that successively merges observations into larger clusters with the intent to minimize within-cluster variation and maximize between-cluster variation.15,16 We chose an unsupervised machine learning technique in light of our main objective to uncover patterns of medical resource utilization rather than predict any one prespecified outcome. Hierarchical clustering (HC) was selected from among other potential unsupervised machine learning techniques, as it does not require a priori assumptions about the nature and number of potential clusters. Unlike k-means or other techniques for clustering, it also provides clarity with respect to relationships between clusters, including splitting and merging algorithm behaviors.17
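A minimal sketch of Ward's hierarchical clustering follows, assuming NumPy and SciPy are available. It is not the authors' implementation (the software they used is not stated here); the matrix `X` is synthetic, and the choice of three clusters is for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic stand-in for a standardized patient-by-HCRU matrix (not study data):
# three groups of "patients" with different mean utilization across 5 variables.
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(60, 5)),
    rng.normal(loc=3.0, scale=0.5, size=(25, 5)),
    rng.normal(loc=-3.0, scale=0.5, size=(15, 5)),
])

# Ward's linkage on raw observations uses Euclidean distances between subjects.
Z = linkage(X, method="ward")

# Cut the tree into a chosen number of clusters (3 here, for illustration only).
labels = fcluster(Z, t=3, criterion="maxclust")
sizes = np.bincount(labels)[1:]  # cluster labels start at 1
```

The linkage matrix `Z` records every merge and its height, so the same tree can be cut at several candidate cluster counts, or drawn with `scipy.cluster.hierarchy.dendrogram`, without refitting.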
After removing 97 HCRU categories that each represented <2.5% of patients, 90 distinct HCRU variables remained for cluster analysis. Pairwise Pearson correlations among the 90 variables were generally <0.8, so all variables were retained for clustering. Extreme values for both days of service and days supplied for medications were capped at the 99th percentile within each HCRU category, as these may reflect administrative errors. Because the magnitudes of days of service for medical resources and days supplied for chronic medications differ markedly, HCRU variables were standardized (centered and scaled) before generating the dissimilarity matrix required for cluster analysis.
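The preprocessing steps described above (dropping rare categories, capping extreme values, and standardizing) can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' code: `prepare_hcru` is a hypothetical helper, the data are simulated, and a category is assumed to "represent" a patient when that patient has at least one day of service or supply in it.

```python
import numpy as np


def prepare_hcru(days, min_share=0.025):
    """Filter, cap, and standardize a patients-by-categories matrix of days."""
    days = np.asarray(days, dtype=float)

    # Keep categories used by at least `min_share` of patients (assumed rule).
    share_with_use = (days > 0).mean(axis=0)
    kept = share_with_use >= min_share
    days = days[:, kept]

    # Cap extreme values at each category's 99th percentile.
    caps = np.percentile(days, 99, axis=0)
    days = np.minimum(days, caps)

    # Center and scale each category (z-scores); assumes no constant columns.
    z = (days - days.mean(axis=0)) / days.std(axis=0)
    return z, kept


# Simulated data for illustration only (not study data).
rng = np.random.default_rng(1)
n = 1000
common = rng.poisson(5, size=n)    # days of service, used by most patients
chronic = rng.poisson(90, size=n)  # days supplied, on a much larger scale
rare = np.zeros(n)
rare[:10] = 3                      # used by 1% of patients, so it is dropped
common[0] = 400                    # implausible outlier, so it is capped

z, kept = prepare_hcru(np.column_stack([common, chronic, rare]))
```

Without the standardization step, Euclidean distances would be dominated by the variables with the largest raw ranges, such as days supplied for chronic medications.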
The number of relevant clusters was selected through visual inspection of dissimilarity between clusters per dendrogram output, average silhouette width, and clinician impression of clinical relevance.18 Silhouette width measures how similar each patient (observation) is to other patients in the same cluster relative to patients in the nearest other cluster. Silhouette width ranges from −1 to 1: values closer to 1 indicate a strong cluster structure, values near zero indicate an absence of substantial cluster structure, and negative values suggest that observations may have been assigned to the wrong cluster. Per cut-offs specified by Struyf et al (1997),19 we limited our analysis to cluster solutions with average silhouette width values >0.25, as values below this threshold are thought to lack substantial structure.
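For readers unfamiliar with the measure, the sketch below computes the average silhouette width directly from its definition, s(i) = (b(i) − a(i)) / max(a(i), b(i)), where a(i) is the mean distance from observation i to the other members of its own cluster and b(i) is the smallest mean distance from i to the members of any other cluster. The function name and the simulated data are assumptions made for this example, and the function expects at least two clusters.

```python
import numpy as np


def average_silhouette_width(X, labels):
    """Mean of s(i) = (b(i) - a(i)) / max(a(i), b(i)) over all observations."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between observations.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    widths = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # convention: s(i) = 0 for a singleton cluster
        # a(i): mean distance to the other members of i's own cluster.
        a = dist[i, own].sum() / (n_own - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        widths[i] = (b - a) / max(a, b)
    return widths.mean()


# Simulated data for illustration only (not study data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (40, 2)), rng.normal(4, 0.5, (40, 2))])
matching = np.repeat([0, 1], 40)   # labels that follow the two simulated groups
arbitrary = np.tile([0, 1], 40)    # labels unrelated to the groups

asw_matching = average_silhouette_width(X, matching)
asw_arbitrary = average_silhouette_width(X, arbitrary)
```

In this simulation the labels that follow the true grouping clear the 0.25 threshold comfortably, whereas the arbitrary labels do not.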
The relevance of each preoperative cluster to post-fusion economic outcomes was ascertained by quantifying the influence of cluster membership on TL-spine-specific and all-cause payor costs ($US 2016) during the 2-year post-surgery period. Adjusted 2-year costs as a function of preoperative cluster membership were obtained using a generalized linear model (gamma distribution, log link) after controlling for patient age, sex, diagnosis at time of surgery, Elixhauser comorbidity index category, number of fused levels (one vs. two), and spinal fusion technique (posterior interbody-only, posterolateral-only, or combined approach). Analysis of variance was used to compare patient baseline characteristics between cohorts for continuous variables, whereas χ2 tests were used for categorical variables.
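The structure of such a model can be written out as follows. The notation is ours and is offered only as a sketch: Y_i denotes the 2-year cost for patient i, x_i collects the adjustment covariates listed above, and Clust1 is assumed to be the reference category.

```latex
\begin{aligned}
Y_i &\sim \mathrm{Gamma}(\mu_i, \phi), \qquad \operatorname{Var}(Y_i) = \phi\,\mu_i^{2},\\
\log \mu_i &= \beta_0 + \beta_2\,\mathbf{1}[\text{Clust2}_i] + \beta_3\,\mathbf{1}[\text{Clust3}_i] + \mathbf{x}_i^{\top}\boldsymbol{\gamma},\\
\frac{\mu_i(\text{Clust}k)}{\mu_i(\text{Clust1})} &= e^{\beta_k}, \qquad k \in \{2, 3\}.
\end{aligned}
```

Under the log link, each cluster coefficient acts multiplicatively on expected cost, so exp(β_k) is the adjusted cost ratio relative to the reference cluster. The gamma variance function, in which the variance grows with the square of the mean, suits right-skewed, strictly positive cost data.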
Baseline Demographics and Clinical Characteristics
The mean (SD) age for the overall sample was 51.3 (8.7) years and 56.1% of patients were female. Most patients underwent surgery in the inpatient setting (89.6%) and were discharged home afterwards (84.3%). The mean (SD) length of stay for inpatients was 3.8 (1.8) days, whereas median (IQR) length of stay was 4 (1–4) days. One-level fusion was more common than two-level fusion (79.4% vs. 20.6%) and over half of the patients had both posterolateral and interbody fusion (55.5%). Most patients had a Functional Comorbidity Index score of 1 to 2 (29.1%) or 3 to 5 (60.3%) and an Elixhauser comorbidity score of 0 (25.6%), 1 to 2 (52.1%), or 3 to 5 (20.2%). Individual Elixhauser comorbidities occurring in >10% of patients included uncomplicated hypertension (43.8%), uncomplicated diabetes (15.6%), depression (14.8%), chronic pulmonary disease (12.6%), hypothyroidism (11.6%), and rheumatoid arthritis/collagen vascular disease (10.2%) (Table 1; Baseline Characteristics by cluster category are available as Supplementary Table 1; http://links.lww.com/BRS/B465).
Four diagnoses comprised 73% of all primary diagnoses at the time of index spinal fusion surgery: displacement of lumbar intervertebral disc without myelopathy (21.2%); degeneration of lumbar or lumbosacral intervertebral disc (18.4%); acquired spondylolisthesis (17.1%); and spinal stenosis lumbar region without neurogenic claudication (15.9%).
The HC method revealed the largest differences (dissimilarity) in presurgical HCRU (Figure 1) between one cluster representing 74% of patients and two clusters accounting for the remaining 26% of the sample: Clust1 (n = 13,987 [74.5%]), Clust2 (n = 4270 [22.7%]), Clust3 (n = 513 [2.7%]). Average silhouette width exceeded 0.25 for two-, three-, and eight-cluster solutions (Figure 2), and within-cluster sample size declined markedly from two to three clusters (Figure 1). Although the eight-cluster solution was plausible considering average silhouette width, this solution represented smaller subgroups of clusters 2 and 3 and posed challenges for meaningful between-group differentiation. Three clusters were therefore retained for exploration within the regression model. Patients within the 3 clusters were relatively homogeneous with respect to the number of levels fused (approximately 80% one-level within each cluster, P = 0.1159) and numerically similar for surgical approach. However, patient comorbidity profiles varied significantly across clusters (Supplementary Table 1; http://links.lww.com/BRS/B465).
With respect to pharmacy claims (Table 2), the largest between-cluster ranges for mean days supplied in the preoperative year were found for antidepressants (Clust1: 97.1 days, Clust2: 175.2 days, Clust3: 287.1 days), opioids (Clust1: 76.7 days, Clust2: 166.9 days, Clust3: 129.7 days), and anticonvulsants (Clust1: 35.1 days, Clust2: 67.8 days, Clust3: 98.7 days). Medication costs for all patients by cluster category are available as Supplementary Table 2; http://links.lww.com/BRS/B465.
For medical services (Table 3), the largest differences in days of care were observed for behavioral health (Clust1: 0.14, Clust2: 0.88, Clust3: 16.3) and nonthoracolumbar office visits (Clust1: 7.8, Clust2: 13.4, Clust3: 13.8). When considering the costliest services in terms of both frequency of care (days of care) and unit costs (payments per procedure) during the presurgical period, several HCRU categories exhibited a broad range across clusters (Supplementary Table 3; http://links.lww.com/BRS/B465). The largest preoperative differences in payment amounts occurred for acute inpatient services unrelated to the TL spine: patients in Clust1 generated mean (SD) non-TL acute inpatient costs of $482 ($3873) versus $4421 ($13,411) for Clust2 and $1518 ($6988) for Clust3.
After controlling for patient age, diagnosis at time of surgery, comorbidities (Elixhauser category), sex, number of spinal levels fused, and surgical approach, preoperative cluster membership was associated with statistically significant differences in 2-year all-cause and TL-related postoperative costs. Mean (95% confidence interval [CI]) adjusted 2-year all-cause postoperative costs were statistically lower (P < 0.0001) for Clust1 versus Clust2 and Clust1 versus Clust3 (Clust1: $34,048 [$33,265–$34,849], Clust2: $52,505 [$50,306–$54,800], Clust3: $48,452 [$43,007–$54,790]; Figure 3). Similarly, mean (95% CI) 2-year TL-related costs were statistically lower (P < 0.0001) for Clust1 versus Clust2 and Clust1 versus Clust3 (Clust1: $13,033 [$12,545–$13,539], Clust2: $18,955 [$17,671–$20,333], Clust3: $20,203 [$16,564–$24,640]; Figure 3).
To our knowledge, this is the first study to use US commercial claims with the intent to explore and define composite (i.e., multidimensional) patterns of HCRU before spinal fusion. Our study revealed that approximately one-quarter of patients (Clust2 + Clust3) had distinct presurgery HCRU profiles characterized by greater use of antidepressants, opioids, and behavioral health services. These distinct profiles were further associated with significantly higher 2-year postsurgical payor costs. Specifically, patients in Clust2 (23% of the sample) had the highest levels of opioid use, moderate antidepressant and anticonvulsant use, moderate levels of behavioral health visits, and the highest overall costs (54.7% higher overall costs and 46.0% higher TL-spine-specific costs than Clust1). Patients in Clust3 comprised only 3% of the sample and were characterized by high levels of antidepressant use, anticonvulsant use, and behavioral health visits, and moderate levels of opioid use and costs (42.4% higher overall costs and 53.4% higher TL-spine-specific costs than Clust1).
Various studies have sought to identify distinct predictive characteristics for clinical and economic outcomes among candidates for spinal surgery, and, consistent with the findings from this present study, psychosocial–behavioral factors are prominently featured. A prospective cohort study by Merrill et al (2018)20 found that 111 patients with depression before lumbar decompression surgery experienced a greater magnitude of improvement in physical function than nondepressed patients, but nonetheless had worse postoperative physical function, depression, and pain. Also utilizing prospectively collected data (n = 127), Walid and Zaytseva (2010)21 found significant differences in the length of stay and hospital cost between lumbar decompression and fusion patients on antidepressants and those not on antidepressants. Opioid consumption as a predictor of postsurgical outcomes has also been evaluated in contemporary studies. In a prospective evaluation of 752 patients with laminectomy and fusion for degenerative lumbar conditions, Sivaganesan et al (2019)22 found that age, body mass index, sex, diagnosis, postoperative imaging, number of operated levels, ASA grade, hypertension, arthritis, preoperative and postoperative opioid use, length of hospital stay, duration of surgery, 90-day readmission, outpatient physical/occupational therapy, inpatient rehabilitation, postoperative healthcare visits, postoperative nonopioid pain medication use, and muscle relaxant use were predictive of 90-day cost. A retrospective analysis of administrative claims (n = 24,610 patients) by Jain et al (2018)23 found that chronic opioid therapy (>6 months) before one- and two-level fusion was a risk factor for 90-day readmissions, emergency visits, and complications. 
An analysis of workers' compensation claims by Anderson et al (2015)24 revealed higher preoperative opioid load and duration of use to be associated with higher postoperative chronic opioid therapy among 1002 patients with lumbar fusion. Within 3 years after fusion, the chronic opioid therapy group had an 11.0% return to work rate, $27,952 higher medical costs per subject, 43.5% rate of psychiatric comorbidity, 16.7% rate of failed back syndrome, and 27.7% rate of additional lumbar surgery.
The previous literature reporting characteristics predictive of clinical and economic outcomes among candidates for spinal surgery suggests that these factors interact in complex ways. Identifying spinal fusion candidates at risk for highest costs based on combinations of risk factors may be more informative than looking at each baseline characteristic separately.10 Cluster analysis can be used to separate a population into homogeneous subgroups based on common statistical patterns and was felt to be particularly useful in spinal fusion given the variability of patients with back pain. Cluster analysis as performed in our study, which does not define such risk factors a priori, was able to shed light on patterns of HCRU and hidden interaction effects that may exist across multiple dimensions (e.g., opioid consumption, depression, and previous surgery status).25 The identification of these characteristic subgroups with distinct HCRU patterns is important as the different groups may warrant different types of clinical interventions to improve care and to optimize healthcare resources. The cluster analysis also provided information regarding the nature and proportions of patients that might be a focus for targeted interventions, given the effect of these high-cost users of healthcare (Pareto principle) and the transition of risk from payors to providers. Clustering is also a way of moderating the effect of Simpson's paradox, wherein a subgroup relationship differs from that of the overall population.26
A key limitation of this study is that administrative claims data are not collected specifically for research purposes and lack critical information (e.g., outcomes such as health-related quality of life), thereby limiting inferences that can be drawn from the data. In addition, billing data can contain inaccuracies that are difficult to identify and eliminate. For example, ascertainment of the number of spinal levels fused requires an accurate representation of add-on CPT codes, which may be absent from some insurance claims. Furthermore, the attribution of resource consumption to distinct categories within this study was based on pre-specified coding rules and provider specifications within the IBM MarketScan Commercial database, rather than through an audit of medical charts or prospective evaluation. Other data sources such as electronic health records (EHRs) and patient and disease registries could provide additional information relevant to HCRU analysis.5 Also, the current methodology does not consider the temporal relationships among different medical events or encounters. Exploration of temporal relationships could provide deeper context and reveal more detailed utilization patterns as well as contextual attributes.5 Finally, as this dataset includes administrative claims data for patients with commercial health insurance, the findings may not be generalizable to patients with other types of healthcare coverage, in particular the elderly.
Our findings have implications for efforts to model, predict, and mitigate negative economic outcomes after spinal fusion. Aspects of care that may get overlooked in routine practice, including quantitative measures of historical interactions with the healthcare system and the intensity of HCRU, may generate needed insights. Use of observed, multivariate combinations of quantitative variables may be more informative than relying solely upon prespecified binary predictive measures (such as those that indicate presence of disease).10 Learnings from studies of this type can augment clinical expertise at the point of care, given that providers can readily access information on utilization of opioids, behavioral health, and antidepressants from electronic medical records. These findings can then help to support decision-making about surgical candidacy, surgical approach (e.g., adoption of opioid-sparing techniques), setting of care, surgical timing, the provision of multidisciplinary care, and the design of postsurgery rehabilitation. Finally, tailored case management interventions aimed at optimizing health and containing costs may be brought to bear for those patients most likely to benefit.10
- The ability to distinguish among distinct subgroups (profiles) of candidates for posterior lumbar spinal fusion could generate insights needed to address unmet needs in care pathways and/or surgical techniques.
- Presurgical healthcare resource use (HCRU) reflects the interactions that patients have with the healthcare system before surgery.
- Cluster analysis was used to explore presurgical HCRU data and identify patient subgroups with characteristic patterns of HCRU.
- Three distinct patient clusters were derived from the presurgical HCRU data.
- Membership in clusters characterized by greater use of antidepressants, opioids, and behavioral health services was associated with significantly higher 2-year postsurgical costs.
The authors thank Natalie Edwards of Health Services Consulting Corporation, Boxborough, MA, USA for editorial assistance with the manuscript.
1. Weinstein JN, Lurie JD, Olson PR, et al. United States’ trends and regional variations in lumbar spine surgery: 1992-2003. Spine (Phila Pa 1976)
2. DeBerard MS, Masters KS, Colledge AL, et al. Presurgical biopsychosocial variables predict medical and compensation costs of lumbar fusion in Utah workers’ compensation patients. Spine J
3. Wheeler AJ, Gundy JM, DeBerard MS. Predicting compensation and medical costs of lumbar fusion patients receiving workers’ compensation in Utah using presurgical biopsychosocial variables. Spine (Phila Pa 1976)
4. Luo X, Pietrobon R, Sun SX, et al. Estimates and patterns of direct health care expenditures among individuals with back pain in the United States. Spine (Phila Pa 1976)
5. Hu J, Wang F, Sun J, et al. A healthcare utilization analysis framework for hot spotting and contextual anomaly detection. AMIA Annu Symp Proc
6. Chechulin Y, Nazerian A, Rais S, et al. Predicting patients with high risk of becoming high-cost healthcare users in Ontario (Canada). Healthc Policy
7. Billings J, Dixon J, Mijanovich T, et al. Case finding for patients at risk of readmission to hospital: development of algorithm to identify high risk patients. BMJ
8. Passias PG, Poorman GW, Bortz CA, et al. Predictors of adverse discharge disposition in adult spinal deformity and associated costs. Spine J
9. Cadarette SM, Wong L. An introduction to health care administrative data. Can J Hosp Pharm
10. Schiltz NK, Warner DF, Sun J, et al. Identifying specific combinations of multimorbidity that contribute to health care resource utilization: an analytic approach. Med Care
11. Liao M, Li Y, Kianifard F, et al. Cluster analysis and its application to healthcare claims data: a study of end-stage renal disease patients who initiated hemodialysis. BMC Nephrol
12. Menendez ME, Neuhaus V, van Dijk CN, et al. The Elixhauser comorbidity method outperforms the Charlson index in predicting inpatient death after orthopaedic surgery. Clin Orthop Relat Res
13. Groll DL, To T, Bombardier C, et al. The development of a comorbidity index with physical function as the outcome. J Clin Epidemiol
14. Ivanova JI, Birnbaum HG, Schiller M, et al. Real-world practice patterns, health-care utilization, and costs in patients with low back pain: the long road to guideline-concordant care. Spine J
15. Saraçli S, Doğan N, Doğan İ. Comparison of hierarchical cluster analysis methods by cophenetic correlation. Journal of Inequalities and Applications
16. Newcomer SR, Steiner JF, Bayliss EA. Identifying subgroups of complex patients with cluster analysis. Am J Manag Care
17. Bouguettaya A, Yu Q, Liu X, et al. Efficient agglomerative hierarchical clustering. Expert Systems with Applications
18. Vendramin L, Campello RJGB, Hruschka ER. Relative clustering validity criteria: a comparative overview. Stat Anal Data Min
19. Struyf A, Hubert M, Rousseeuw P. Clustering in an object-oriented environment. J Stat Software
20. Merrill RK, Zebala LP, Peters C, et al. Impact of depression on patient-reported outcome measures after lumbar spine decompression. Spine (Phila Pa 1976)
21. Walid MS, Zaytseva NV. Prevalence of mood-altering and opioid medication use among spine surgery candidates and relationship with hospital cost. J Clin Neurosci
22. Sivaganesan A, Chotai S, Parker SL, et al. Drivers of variability in 90-day cost for elective laminectomy and fusion for lumbar degenerative disease. Neurosurgery
23. Jain N, Phillips FM, Weaver T, et al. Preoperative chronic opioid therapy: a risk factor for complications, readmission, continued opioid use and increased costs after one- and two-level posterior lumbar fusion. Spine (Phila Pa 1976)
24. Anderson JT, Haas AR, Percy R, et al. Chronic opioid therapy after lumbar fusion surgery for degenerative disc disease in a workers’ compensation setting. Spine (Phila Pa 1976)
25. Axén I, Bodin L, Bergström G, et al. Clustering patients on the basis of their individual course of low back pain over a six month period. BMC Musculoskelet Disord
26. Kievit RA, Frankenhuis WE, Waldorp LJ, et al. Simpson's paradox in psychological science: a practical guide. Front Psychol