Up to 84% of the general population has been found to report low back pain (LBP) symptoms at some point during their lifetime.1 Back pain is the second most common symptom-related reason for clinician visits in the United States.2 Yet, a high level of variability in practice patterns suggests much clinician uncertainty about how to treat LBP. Indeed, despite sharp increases in the use of various treatments, ranging from imaging, opioid pain medications, injections, and complementary/alternative medicine to surgery, there is no clear evidence reported to date of improved functional status and declining work disability related to LBP.3
These developments have raised LBP as a priority for employers focusing on workforce health and productivity (H&P). These employers are seeking to maximize the ratio of outcomes achieved relative to costs incurred (ie, value) for the investments that they are making in their employees (EEs). This search includes ways that preserve the viability of these investments for the future (ie, sustainability).4 It is leading these employers to identify targets for a broadening array of measurement and management initiatives. Low back pain is becoming one such target because of the evidence emerging from recent studies. This work has ranged from estimating the unique contributions of LBP5 to head-to-head comparisons with other conditions.6 Across it, the magnitude of the total health burden linked to LBP has become a source of concern. This burden comprises the health care transactions driving direct costs and the lost productivity, short-term disability, long-term disability, and workers' compensation (WC) driving indirect costs.
For these employers, the previously described picture is serving to reveal their “skin in the game” because clinicians and professional societies disseminate guidelines for improving the effectiveness and reducing the variability and costs of LBP treatment. A key contribution that these employers can make—not only to the continuous quality improvement of these guidelines but also to the refinement of their own H&P approaches—is the cultivation of evidence as to what is actually taking place on the ground.7 What treatment choices do EEs and their clinicians make? Do these decisions lead to differences in outcomes that are traceable to these choices? To what extent does congruence with guidelines alter the effect of LBP management choices on EE/patient outcomes?
An employer with well-developed and proven evidence-gathering and analytical capabilities is uniquely positioned to tackle these questions. Herein, we report a study conducted to obtain initial answers using an integrated database for a major self-insured employer with an accumulated track record of articles addressing a diversity of H&P issues in this journal.8–17
TREATING LOW BACK PAIN
Most patients (greater than 85%) seen in primary care for LBP have “nonspecific low back pain,” which cannot be attributed to pathophysiological causes, while the balance are linked to a specific disease or spinal pathology (eg, radiculopathy).18 Although the long-term prognosis for LBP is generally favorable (the norm being rapid improvement in pain, disability, and return to work in the first month and further improvement that often levels off at the 3-month mark), the duration of the condition and its effect can be substantial. In a large prospective study, although 90% of general practice patients with LBP stopped seeing their doctors within 3 months, only 25% were reported to be completely free of pain and disability after 1 year.19
Consistent with the generally favorable long-term prognosis for LBP, only about a third of patients seek medical care, with the balance apparently improving on their own.3 Of the former, one third are substantially improved after 1 week and two thirds after 7 weeks.20 Even so, recurrences may affect 40% to 50% of patients within 6 months21 and 70% within 12 months.20 For many, the natural history is akin to chronic conditions such as asthma, marked by chronic mild symptoms and intermittent exacerbations. In one prospective study of patients with acute LBP, chronic LBP was diagnosed for 20% of patients within 2 years of their first visit.22
These features have led to the differentiation of three categories for classifying the duration of treatments for back pain in general and LBP specifically. Acute LBP refers to LBP that lasts 4 weeks or less, subacute LBP refers to LBP that lasts between 4 and 12 weeks, and chronic LBP refers to LBP that persists longer than 12 weeks. Wide variations in the use of diagnostic tests, pain medications, injections, and surgery suggest much clinician uncertainty about optimal therapy for each category. Indeed, many patients with chronic LBP may not be receiving evidence-based care. A recent study, for example, linked 732 North Carolina adults with chronic LBP to overuse of unproven interventions (eg, traction), high use of second-line medications (eg, opioids), and underuse of exercise therapy and, for depressed patients, antidepressants.23
Such outcomes have led to growing worries that the condition has become “overmedicalized.”24 Pertinent evidence is emerging in several areas, for example, as follows:
1. Early use of magnetic resonance imaging (MRI) has been linked to prolonged disability, higher medical costs, and greater use of surgery,21,25 and reviews have found that it offers no benefit for health, function, or disability outcomes in LBP.26
2. Patients receiving chiropractic care have been found to have a lower probability of disability recurrence than patients of physicians and physical therapists.27 In addition, chiropractic care of shorter duration has been found to be associated with lower rates of disability recurrence and shorter disability duration.28
3. The frequency and the strength of opioid dosage in LBP treatment have been positively linked to claim duration.29 Likewise, the odds ratios for a catastrophic claim (total cost of $100,000 or more) when spinal surgical procedures were performed increased 10-fold when treatment included opioid use.30 Yet, close provider supervision can apparently have mitigating effects, as suggested by a third study, which found that each additional week between the filling of opioid prescriptions predicted 14% longer disability and time off work.31
To promote effective and efficient care for LBP patients, clinical guidelines have been published in the United States32 and Europe.33 Country-specific guidelines have also surfaced.34–36 These encompass recommendations for diagnosis, screening, care pathways, and treatment algorithms and, in general, specify that the criteria for evaluation (eg, ordering diagnostic studies37) and treatment (eg, performing surgeries38) for patients with possible work-related LBP should not be different from the criteria for patients with nonoccupational LBP.
A synthesized reading of these guidelines suggests that they have elements that can be examined using the criteria pertinent to employers and their capacity to manage the value/sustainability challenge that they face as a stakeholder group. A key issue is the direct and indirect cost outcomes of the care that transpires under EE benefit coverage whose cost they and their EEs share. Although the delivery of care and its comportment with guidelines are beyond the province of employers who look to providers and payers to assume such responsibilities, outcomes are of keen interest. In particular, what is the effect on cost outcomes when care is incongruent with guidelines?
As with other clinical areas, guideline development and implementation for LBP continue to be marked by much flux and lack of consensus. Yet, when evaluating care and total cost outcomes for employer purposes, it matters less that the guidelines selected be the best available in terms of performance on scientific criteria. What matters more is that these guidelines are (1) sufficiently grounded in empirical work that they can serve as a credible foundation for this partner dialogue and collaboration and (2) readily testable in the context of the data and analytical resources to which employers have access. The following four areas taken from the recently published literature on LBP guidelines focus on timing aspects that meet these two criteria.
Barring progressive neurological findings or suspicions of systemic etiology, routine x-rays have been recommended only after 4 weeks postonset if clinical improvement has not occurred.39 Similarly, barring evidence at clinical examination of emergent conditions (eg, cauda equina syndrome), MRIs or computed tomographic (CT) scans may be appropriate after a minimum of 4 weeks of radicular symptoms.39 Otherwise, excluding prior indications of progressive neurological deficits or a high suspicion of cancer or infection, CTs or MRIs have been recommended only after 12 weeks of persistent back pain.2
The use of short-term therapeutic courses of manipulation treatment may likewise be indicated, but guidelines have proposed that physical therapy (PT) referrals should not be made within, at the least, the first 2 weeks of onset.3
For patients with specific LBP etiology (eg, lumbar radiculopathy), a waiting period of at least 6 weeks has been recommended before they are to be considered candidates for surgery. For nonspecific LBP, surgery has been recommended only for patients who have experienced persistent symptoms and associated disability for at least 1 year despite nonsurgical interventions.40
Prescription (Rx) medications are of particular relevance to care for patients with acute LBP, where the focus for therapy is on short-term treatment for temporary symptomatic relief to maximize patient comfort, although they are also applicable to subacute and chronic LBP cases.3 Opioids, nonsteroidal anti-inflammatory drugs (NSAIDs), other pain medications, muscle relaxants, anxiolytics, and sedatives prescribed as monotherapy on a fixed schedule or in appropriate combinations have all been the subject of evidence assessment and recommendations in this regard.41 In general, timeframes used in recent LBP clinical trials have linked “short term” for these pharmacologic therapies to use not exceeding 12 weeks, although for benzodiazepines and muscle relaxants, short term has been linked to 4 weeks.41
Previous work on the integrated database tapped for this study showed its capacity to identify specified patient groups and monitor their direct and indirect cost outcomes during 2001 to 2009. This study built on this precedent by examining the actuarial consequences of clinical treatment choices for back problems, using the previously described guidelines to focus on LBP. The objectives were the following:
1. Identify all active EEs reporting a back problem diagnosis during the study period.
2. Define and classify their initial patterns of medical care and Rx medication use.
3. Track the effect of these patterns on direct and indirect cost outcomes.
4. Further stratify these treatment patterns by measures of congruence with the previously described guideline aspects for LBP care and determine the effect on cost outcomes.
Study Design, Company, and Data Sources
This study was a retrospective time-series analysis of an extract taken from an integrated database focusing on the continental US active EEs of an international heavy manufacturer of transportation equipment. In 2009, this company was ranked 175th on the Fortune 500 list.
This database tracks measures of employee personnel characteristics as well as medical, behavioral health, pharmaceutical, WC, disability, absenteeism, and lost productivity during the 2001 to 2009 period. The sources tapped for these measures include health plan, pharmaceutical benefit manager and behavioral health manager claims obtained through the company's data manager vendor; WC, short-term disability and long-term disability data obtained from the company's WC/disability vendor via the company; payroll and human resource data obtained from the company's human resources department; and three rounds of “special topic” company-wide employee surveys administered by an external vendor using an electronic/manual approach in the spring and fall of 2001 and the fall of 2009. In 2011, this database was approved by the Western Institutional Review Board for compliance with the Health Insurance Portability and Accountability Act.
As reported elsewhere,15 this workforce is largely older, male, and hourly, and averages relatively long company tenures and relatively mid-level annual income. But, it became significantly less so on each of these characteristics from 2001–2002 to 2008–2009: average age dropped from 48.5 to 46.2 years; percentage male, from 83.2% to 77.6%; percentage hourly, from 64.9% to 53.6%; average tenure, from 20.2 to 14.5 years; and average income, from $66,791 to $64,822 (2009 dollars).
All EEs in the extract with at least 6 months of recorded continuous coverage on a company health plan during the study period were considered eligible for selection into the disease sample unless they developed a cancer diagnosis, retired, or died, after which they were dropped from the sample. A total of 21,080 individuals met these initial requirements. Employees with back pain were then identified on the basis of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM)42 diagnoses reported in medical claims. Back pain status was differentiated by the presence of one or more eligible diagnoses on one or more medical claims lines. Those identified were grouped as follows:
#1: LBP with neurologic findings (LBP/neuro): ICD-9-CM codes 721.42, 721.91, 722.73, 722.80, 724.3, 724.4 (ie, diagnoses indicative of specific etiology; n = 1837)
#2: LBP with no neurologic findings (LBP/nonneuro): ICD-9-CM codes 724.2, 724.5, 846.0-9, 847.2, 847.3, 847.9 (ie, diagnoses indicative of nonspecific etiology; n = 8569)
#3: Other back: all ICD-9-CM codes in categories 13.3.-1-3 of the Clinical Classifications Software for the Healthcare Cost and Utilization Project43 (CCS-HCUP) other than those in #1 and #2, including cervical and thoracic back pain diagnoses and all codes in the 720 to 724 range not listed previously (n = 4381).
Of the 21,080 EEs, 8300 (39.4%) had at least one back pain episode and many had more than one. Table 1 describes these three groups on measures ranging from demographic and job characteristics to health and utilization characteristics 6 months prior to disease onset, several of which were assessed by survey in spring 2001 and in fall 2009. All three groups were on average older, more likely to be hourly, and slightly more likely to be female than the aggregate sample. They were also likely to have somewhat longer job tenure but less annual income.
Units of Analysis
The units of analysis were episodes. An episode was defined by a period of medical claims with back pain diagnoses (all types, inclusive), separated from previous episodes by at least 6 months without a back pain claim, and followed by at least 6 months without a back pain claim. After the episode was defined in this way, it was classified into one of the three groups described previously. If any claim in the episode had an LBP/neuro or LBP/nonneuro diagnosis, the episode was classified as such based on the first diagnosis reported. If there was never any such diagnosis in the stream of diagnoses listed, the episode was classified in the other back group.
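The episode rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's actual implementation: the function names are hypothetical, "6 months" is approximated as 183 days (an assumption), and each claim is reduced to a service date and a diagnosis group label.

```python
from datetime import date

GAP_DAYS = 183  # assumption: "6 months" approximated as 183 days
LBP_GROUPS = {"LBP/neuro", "LBP/nonneuro"}

def build_episodes(claim_dates, gap_days=GAP_DAYS):
    """Group one employee's back pain claim dates into episodes.

    A new episode starts whenever at least `gap_days` have passed
    since the previous back pain claim.
    """
    episodes = []
    for d in sorted(claim_dates):
        if episodes and (d - episodes[-1][-1]).days < gap_days:
            episodes[-1].append(d)   # still inside the current episode
        else:
            episodes.append([d])     # gap is long enough: open a new episode
    return episodes

def episode_length_days(episode):
    """Days from the first to the last claim (0 for a single-claim episode)."""
    return (episode[-1] - episode[0]).days

def classify_episode(groups_in_claim_order):
    """The first LBP/neuro or LBP/nonneuro diagnosis decides the episode type;
    episodes with no such diagnosis fall into the other back group."""
    for group in groups_in_claim_order:
        if group in LBP_GROUPS:
            return group
    return "Other back"

# Hypothetical claim dates for one employee
claims = [date(2005, 1, 10), date(2005, 1, 24), date(2005, 3, 1), date(2006, 2, 1)]
episodes = build_episodes(claims)
```

In this hypothetical example, the first three claims fall within 183 days of one another and form one episode, whereas the 2006 claim opens a second, single-claim episode of length 0.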
A total of 14,787 back pain episodes were identified, of which 1837 were LBP/neuro, 8569 were LBP/nonneuro, and 4381 were other back. These counts included episodes that were left censored, whose first observation date was unclear because there was less than 6 months of enrolled history. With left-censored episodes excluded, there were 13,224 back pain episodes, of which 1623 were LBP/neuro, 7650 were LBP/nonneuro, and 3951 were other back. These counts included right-censored episodes, whose observation period ended too early (less than 6 months before the end of enrollment or the first cancer date) to know if the episode had truly ended.
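The two censoring conditions can be expressed as simple date comparisons. The sketch below is illustrative only: the function and argument names are hypothetical, and the 183-day approximation of "6 months" is an assumption.

```python
from datetime import date

WINDOW_DAYS = 183  # assumption: "6 months" approximated as 183 days

def censoring_flags(first_claim, last_claim, enroll_start, observation_end,
                    window_days=WINDOW_DAYS):
    """Return (left_censored, right_censored) for one episode.

    observation_end is the earlier of the end of enrollment and the
    first cancer date, following the description in the text.
    """
    # Too little enrolled history to know whether the episode began earlier
    left = (first_claim - enroll_start).days < window_days
    # Too little follow-up to know whether the episode had truly ended
    right = (observation_end - last_claim).days < window_days
    return left, right
```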
Episode lengths were short. With left- and right-censored episodes included, the median LBP/neuro episode lasted 42 days, whereas the medians for LBP/nonneuro and other back episodes were 6 and 0 days, respectively (the group means were 148, 91.3, and 51.4 days, respectively). The breakdown by episode duration was as follows:
1. LBP/neuro: 44% acute, 18% subacute, and 38% chronic
2. LBP/nonneuro: 61% acute, 12% subacute, and 27% chronic
3. Other back: 70% acute, 13% subacute, and 17% chronic
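The duration breakdown follows the acute, subacute, and chronic cutoffs defined earlier in the text (4 and 12 weeks). A minimal sketch of that mapping is given below; the function name is hypothetical, and assigning an episode of exactly 12 weeks to the subacute category is an assumption about the boundary.

```python
def duration_category(episode_days):
    """Map episode length in days to the duration categories in the text."""
    if episode_days <= 28:      # 4 weeks or less
        return "acute"
    if episode_days <= 84:      # more than 4 and up to 12 weeks (boundary assumed)
        return "subacute"
    return "chronic"            # longer than 12 weeks
```

Under this mapping, an episode of median LBP/neuro length (42 days) would be counted as subacute.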
Episode type was established on the first day for the vast majority of episodes. Of the LBP/neuro episodes, 84.6% were classified on the first day compared with 90.4% for LBP/nonneuro. By construction, all other back episodes were classified on the first day.
Of the LBP/neuro diagnoses, 52.3% were lumbosacral neuritis not otherwise specified, and 44.0% were sciatica. None of the other diagnoses in this category recorded greater than 1.9% prevalence. Of the LBP/nonneuro diagnoses, 52.2% were lumbago, 26.6% were backache not otherwise specified, 12.1% were sprains in the lumbar region, and 5.1% were lumbosacral sprains. None of the other diagnoses in this second category had greater than 2.2% prevalence.
Exploratory tests suggested notable “bleed” across these diagnostic groups over time. For those with an initial LBP/neuro diagnosis, 40% received an LBP/nonneuro diagnosis and 58% received an other back diagnosis by 90 days after the first visit. For those with an initial LBP/nonneuro diagnosis, 6.8% received an LBP/neuro diagnosis and 35% received an other back diagnosis. By definition, all episodes initially classified as other back reported no additional LBP/neuro or LBP/nonneuro diagnoses in the first 90 days.
In this context, only the 40% and 6.8% had substantive meaning because the other back designation included many episodes that started with a noncommittal diagnosis defined only after some investigation or episodes in which the diagnosis associated with some procedures was ambiguous with respect to etiology. Nonetheless, this extent of change was not unexpected. With respect to LBP specifically, a key function of the initial clinical evaluation is to enable patients to be triaged appropriately by the type of diagnosis.2 Yet, the use at initial visits for LBP of commonly recommended “red flag” questions developed from decision rules designed to detect malignancy, infection, cauda equina syndrome, fracture, and inflammatory disorders has been linked to high false-positive rates. One recent study of 1172 patients at the time of first consultation for new episodes of LBP in primary care settings, for example, linked 80% to one or more of 25 “red flag” symptoms when the detected incidence of spinal fracture and malignancy turned out to be 0.7% and 0.0%, respectively.44 Most significant pathologies will tend to become more clinically obvious over time because of either the clinical course or exacerbations stemming from usual care for nonspecific LBP.
Procedures and Medications
Procedures for this study were identified on the basis of the Current Procedural Terminology-4 fields holding either Healthcare Common Procedure Coding System codes or ICD-9-CM procedure codes found in medical claims. These procedures were counted as back-pain related if one of the diagnoses in the corresponding claims line item aligned with one of the diagnoses that made up the previously described definition of back pain. A total of 215,295 procedures were reported as such for the three back pain groups during the 9-year period. For these, the following categories were identified:
1. Physician or staff visit: evaluation and management; staff/technician (eg, laboratory team members like phlebotomists); mental/behavioral health; emergency department visit
2. Surgery and injections: major (invasive) primary low-back-related surgery; nerve blocks; and other injections and palliative procedures
3. Imaging: ultrasound; CT scan; MRI; other x-ray or imaging, including dye contrast
4. Chiropractor or PT visits
5. Other: hospital; ambulance
Medications were identified on the basis of the national drug category and ingredient entries found in pharmaceutical claims. In contrast to procedures, these entries were not attached to back pain diagnoses specifically, and additional tests were, therefore, required to make these linkages. Specifically, these entries were grouped into drug classes that (1) the American Hospital Formulary Service Pharmacologic-Therapeutic Classification System45 has categorized as approved for the treatment of central nervous system conditions (cf American Hospital Formulary Service class number 28:00 Central Nervous System Agents) and (2) were observed as having Rx fill patterns that covaried with the date of initial back pain diagnosis.
In the absence of knowing what was actually written for back pain, tests of this covariation were conducted to examine the timing of Rx fills relative to the episode start date. Consider opioids, for which it was expected that Rx filling would be highly related to episode starts, and topical steroids, for which little such association was expected. During the period from 9 days before to 30 days after the episode start date, 52% of all Rxs for opioids were filled either on or up to 3 days after the start date. The corresponding number for topical steroids was less than 20%. Thus, opioids showed a strong relationship to episode start date and were included as a medication class in the analyses. Topical steroids showed a much weaker association and were discarded as a class for these analyses. (Full results for these assessments are available on request.)
The total number of prescribed medications filled by the three back pain groups across the 9-year study period was 44,100. Of these, the following classes of medications showed sufficient covariation to be included in the analyses:
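The timing statistic described above (the share of fills in the window around the episode start that fall on the start date or within the next 3 days) can be computed as in the following sketch. The function name and the input layout (fill dates expressed as day offsets from the episode start, negative for fills before it) are assumptions made for illustration. No inclusion threshold is encoded because the text does not state one.

```python
def share_near_start(fill_offsets, window=(-9, 30), near=(0, 3)):
    """Share of Rx fills inside `window` that fall inside `near`.

    fill_offsets: days from episode start date to each fill
                  (negative values are fills before the start date).
    Returns None when no fills fall inside the window.
    """
    in_window = [d for d in fill_offsets if window[0] <= d <= window[1]]
    if not in_window:
        return None
    hits = sum(1 for d in in_window if near[0] <= d <= near[1])
    return hits / len(in_window)

# Hypothetical offsets for one drug class, pooled across episodes
offsets = [-5, 0, 1, 3, 10, 40]
share = share_near_start(offsets)  # the fill at day 40 lies outside the window
```

A drug class whose share is high, as reported for opioids, would be read as covarying with episode start, whereas a low share, as reported for topical steroids, would not.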
1. Opioids: short acting; long acting
2. Nonsteroidal anti-inflammatory drugs, including COX-2 selective inhibitors
3. Other pain medications (eg, antiepileptic)
4. Muscle relaxants
5. Selective serotonin reuptake inhibitors (SSRIs; eg, fluoxetine), serotonin norepinephrine reuptake inhibitors (SNRIs; eg, duloxetine), and tricyclics (eg, amitriptyline)
6. Oral steroids (eg, prednisone and methylprednisolone)
7. Anxiolytics, sedatives, and hypnotics (eg, benzodiazepines and barbiturates)
Table 2 gives the percentage of episodes reporting the use of these procedures (with a pain diagnosis) and medications (with or without a diagnosis) by episode type within timeframes pertinent to LBP guidelines: by the 14th day (ie, from episode start date through day 14); by the 28th day; by the 90th day; by the 365th day; and after the 365th day. As shown, the more costly, complex procedures like major surgery, injections, and MRIs tended to be used at greater rates earlier on in LBP/neuro episodes. Rates of emergency department use were highest among LBP/nonneuro episodes. Rates of medication use for both LBP types tended to be on par and greater than those for the other back group. Otherwise, rates across the three episode types tended to be comparable.
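The timeframe percentages of the kind reported in Table 2 could be tabulated along the following lines. This is a sketch under stated assumptions: the function name and data layout are hypothetical, days are counted from the episode start date (day 0), and an episode is counted in a timeframe if it has any use within it.

```python
CUTOFFS = (14, 28, 90, 365)  # timeframes named in the text

def use_rates(use_days_by_episode, cutoffs=CUTOFFS):
    """Percentage of episodes with any use by each cutoff day,
    plus the percentage with any use after the last cutoff.

    use_days_by_episode: one list per episode holding the day offsets
                         (from episode start) on which use was recorded.
    """
    n = len(use_days_by_episode)
    if n == 0:
        return {}
    rates = {}
    for c in cutoffs:
        hits = sum(1 for days in use_days_by_episode
                   if any(0 <= d <= c for d in days))
        rates[f"by day {c}"] = 100.0 * hits / n
    last = cutoffs[-1]
    hits = sum(1 for days in use_days_by_episode if any(d > last for d in days))
    rates[f"after day {last}"] = 100.0 * hits / n
    return rates

# Hypothetical MRI use days for four episodes (an empty list means no use)
mri_days = [[3], [20, 400], [], [100]]
rates = use_rates(mri_days)
```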
The ultimate outcome for this study was annualized total costs per EE, which summed the following direct and indirect cost components (all expressed in 2009 dollars):
1. Direct: Inpatient admits, outpatient visits, emergency department visits, behavioral visits, laboratory tests, equipment orders, doctor's office injections (from medical claims), and prescriptions filled (from Rx claims). All components included the shares borne by the company and EEs, with the latter including all EE payouts except plan premiums. Dental care was excluded.
2. Indirect: WC, short- and long-term disability, and “controllable absenteeism” (ie, time away from work for individual health reasons). The dollar amounts lost to absenteeism were derived from the company's paycheck and job history records.
The total costs reported in a previous study on this database, it can be noted, included lost productivity because of presenteeism. Presenteeism was not included in this study because of the complexities that it would have introduced (eg, the claims-based episode start dates spanning 2001 through 2009 vs survey-based estimates pertaining to 2001 and 2009 only). With presenteeism excluded, this previous report recorded annual total costs of $7859 and $5869 per employee (unadjusted in 2009 dollars) for the 2001–2002 and 2008–2009 timeframes, respectively.15
Analyses were first undertaken to classify the three types of back pain episodes by approach to medical treatment on the basis of claims activity. This classification was predicated on which providers were seen and what procedures were received during the first 6 weeks after episode start date. To investigate the extent to which the resulting taxonomy was merely a tautological reflection of case mix, the five initial treatment approaches thus identified were examined in relation to the previous treatment patterns with controls for episode type and duration. The relationships between these approaches and initial medication use were also explored.
Then, guidelines were brought into the mix with measures defined in terms of incongruence vis-à-vis aspects drawn from the aforementioned literature synthesis. These guideline measures focused on LBP episodes and were likewise examined in relation to initial treatment patterns. Last, the effect of initial treatment pattern (for all three back problem groups) and guideline incongruence (for the two LBP groups only) on cost outcomes was assessed. The longitudinal data set enabled these probes to be conducted across years 1, 2, and 3 of episodes.
To make adjustments, logistic regression was used to calculate probability scores on the basis of the regression of binary treatment variables on predictors that might make the selection of treatments nonrandom. Specifically, propensity-scoring techniques that others46–48 have developed to adapt Rubin's approach to multiple treatment choice scenarios were used to adjust for the following: age, sex, episode type and start date year, comorbidities, the total number of claims lines, presence of a prior episode, and left-censored episode status (details available on request).
Although discrepancies between the unadjusted and adjusted results were of interest, the adjusted version was chosen as the basis for hypothesis testing because of its presumed greater accuracy in this context. These tests focused on two general hypotheses. First, greater rates of guideline incongruence will be associated with greater total cost outcomes over time. Second, the positive relationship between initial treatment patterns that are more medically complex in orientation and total costs will be exacerbated by greater rates of guideline incongruence. All analyses used the STATA 9 software program (STATA Corp, College Station, TX).
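To illustrate the basic idea of a propensity score, the sketch below fits a logistic regression of a binary treatment indicator on two covariates and returns each episode's predicted probability of treatment. It is a simplified, binary-treatment illustration and not the multiple-treatment procedure or the STATA code used in the study. The covariates, the toy data, and the plain gradient-ascent fitting routine are all assumptions made for the example.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    X: list of covariate rows (without an intercept column)
    y: list of 0/1 treatment indicators
    Returns weights with the intercept in position 0.
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = _sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def propensity(w, xi):
    """Predicted probability of receiving the treatment given covariates xi."""
    return _sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

# Toy data (hypothetical): [age scaled to 0-1, prior episode flag]
X = [[0.2, 0], [0.4, 0], [0.5, 1], [0.6, 0], [0.7, 1], [0.9, 1]]
treated = [0, 0, 1, 1, 0, 1]
weights = fit_logistic(X, treated)
scores = [propensity(weights, xi) for xi in X]
```

Scores of this kind summarize how likely an episode was to receive a given treatment on the basis of its observed characteristics, which is what allows cost comparisons to be adjusted for nonrandom treatment selection.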
Initial Treatment Patterns
The first 6 weeks of medical claims were searched for patterns that reflected the presence or absence of clear treatment choices. Counts of the various procedures found were used to establish the following five overall patterns (overall sample percentage in parentheses):
1. Information and Advice (TalkInfo): Episodes whose procedures during the first 6 weeks reflected information gathering or advice seeking but otherwise no overriding pattern: simple office visits, laboratory tests, emergency department or hospital visits, talk therapy, or visits involving imaging (x-ray, ultrasound, CT, or MRI) but no other procedures (59%).
2. Complex Medical Management (Complex MM): Episodes in which the number of physician visits for nerve blocks, surgeries, or comparable procedures was greater than 1 and comprised at least the plurality of treatments. Any ties with other categories went to this category (2%).
3. Chiropractic (Chiro): Episodes in which the number of visits to a chiropractor was greater than 1 and comprised the plurality or greater of procedures. This included cases involving manipulation billed as PT if the manipulation occurred on the same day (11%).
4. Physical therapy (PT): Episodes in which the number of visits to a PT was greater than 1 and comprised the plurality or greater of procedures. Physical therapy by itself (no chiropractor) sometimes included devices or other palliative treatments (11%).
5. Dabble: Episodes with at most one visit for physician, chiropractic, or PT care, or at most one visit to two or more of these categories (17%).
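The five-pattern taxonomy can be read as a set of ordered rules applied to visit counts from the first 6 weeks. The sketch below is one possible reading, not the study's classification code. The function name and the four count categories are hypothetical. The handling of ties (only Complex MM wins ties, as the text states) and the fallback of all remaining episodes to TalkInfo are assumptions where the text leaves the rule open.

```python
def classify_pattern(complex_n=0, chiro_n=0, pt_n=0, info_n=0):
    """Assign an initial treatment pattern from 6-week visit counts.

    complex_n: physician visits for nerve blocks, surgeries, or comparable procedures
    chiro_n:   chiropractor visits
    pt_n:      physical therapy visits
    info_n:    office, laboratory, emergency, hospital, talk therapy, or imaging visits
    """
    # Complex MM: more than one visit and at least the plurality; ties go here
    if complex_n > 1 and complex_n >= max(chiro_n, pt_n, info_n):
        return "Complex MM"
    # Chiro and PT: more than one visit and a strict plurality (assumption)
    if chiro_n > 1 and chiro_n > max(complex_n, pt_n, info_n):
        return "Chiro"
    if pt_n > 1 and pt_n > max(complex_n, chiro_n, info_n):
        return "PT"
    # Dabble: some hands-on care, but at most one visit in each category
    hands_on = (complex_n, chiro_n, pt_n)
    if any(hands_on) and all(n <= 1 for n in hands_on):
        return "Dabble"
    # Everything else: no overriding pattern (assumed fallback)
    return "TalkInfo"
```

For example, an episode with two nerve block visits and two chiropractic visits would be assigned to Complex MM under the tie rule, whereas one chiropractic visit plus one PT visit would be assigned to Dabble.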
Across the five patterns, when selected, the Complex MM, PT, and Chiro approaches were most likely to be initiated in LBP/neuro cases, whereas the TalkInfo approach was least likely to be initiated in LBP/neuro cases (Table 3). In contrast, the TalkInfo approach was most likely to be initiated in LBP/nonneuro cases, whereas the Dabble approach was slightly more likely to be initiated in other back cases. Breakouts by episode duration showed little change in these patterns across acute, subacute, and chronic cases within each episode type.
Of the five patterns, the Complex MM approach was linked to the highest rates of Rx fills for four of the seven drug classes—opioids, other pain medications, SSRI/SNRI/tricyclics, and anxiolytics/sedatives/hypnotics (Table 4). The PT group was highest in NSAIDs, muscle relaxants, and oral steroids, whereas the Chiro group had the lowest Rx rates in all seven classes.
Treatment Choices and Costs
These differences in rates of medication use and the longitudinal differences in rates of use of complex procedures noted previously—both of which linked the more intensive treatment approaches to LBP/neuro cases—raised the question, to what extent was the fivefold taxonomy of treatment approaches a result of the type and duration of episodes? If the approaches were merely a reflection of the presenting diagnosis and length of the subsequent treatment incurred, then the taxonomy could be seen as a construction of circular reasoning, entirely reducible to the ICD-9 and the Current Procedural Terminology-4 codes used to construct it and of little explanatory value in its own right.
Yet, this turned out not to be the case. When those EEs with more than one episode were examined, their previous initial treatment choice was highly predictive of their current initial treatment choice. In fact, they tended to start out by choosing the same approach that they had previously—those who had selected the chiropractic approach previously were most likely to again select the chiropractic approach for the current episode, those who had previously selected the PT and Complex MM approaches again selected the PT and Complex MM approaches, etc (Table 5). The selection effects remained intact even when episode type and duration were controlled. The initial treatment choices made by EEs and their providers exerted systematic influences above and beyond the presenting clinical features of the episodes that ensued.
These choices, in turn, had cost consequences. The Complex MM approach recorded the highest total unadjusted costs for the three back pain groups (Table 6), most markedly in year 1 but also in years 2 and 3. The Chiro approach, in contrast, was consistently linked to the lowest costs. Its per-employee total stayed at the lowest level across all three groups for each year. Physical therapy was associated with relatively high costs in year 1 but tapered off in years 2 and 3.
Similarly, medication choices had cost consequences. With rare exception, medication use within 4 weeks of episode start date was linked to higher unadjusted cost totals than no use. Subtracting the “no” row from the “yes” row for each drug class measure in Table 6 gives the differentials. These differentials were the largest in the first year but generally stayed intact through year 3 for all three groups. The greatest differentials tended to occur in the other pain class, although the differentials for SSRI/SNRI/tricyclics and opioids were nearly equivalent.
The analyses of guideline aspects were conducted on the LBP/neuro and LBP/nonneuro groups only (n = 10,406) because the guidelines dealt with care for LBP specifically. The 11 measures selected for assessment are given in Appendix 1 (see Supplemental Digital Content, available at http://links.lww.com/JOM/A162). Six focused on procedures—three on imaging, one on providers, and two on surgeries—whereas five focused on medications, with the guidelines for two procedures (MRI use and surgeries) distinguishing between LBP types. Each of these measures addressed a timing issue for the procedure or medication referenced, with incongruence scored high and congruence low. In each case, the intent was to set a relatively conservative criterion that could be applied broadly with little fine-tuning.
The rates of guideline incongruence across both LBP groups proved substantial. Across each of the six procedure aspects, the rate of incongruent “hits” was significantly greater for low back/neuro cases than for low back/nonneuro cases (Table 7), especially with respect to imaging. At more than 24%, for example, the LBP/neuro group's rate of receiving MRIs at or before the 28-day mark doubled LBP/nonneuro's rate. Breakouts by episode duration showed differentials in the same direction and of similar magnitude in this pattern vis-à-vis acute, subacute, and chronic cases across both LBP episode types (Table 7).
Comparable differences between the two types—all in the 40% to 60% range and statistically significant—were found on the “PT/Chiro visits within 2 weeks” and surgery measures. Splits by duration found that 13.3% of chronic LBP/neuro episodes reported surgery on or before the 365-day mark compared with 6.6% for chronic LBP/nonneuro cases. Of note, these latter cases were incongruent with the surgical guideline.
Rates of incongruent medication use, in contrast, were more similar across the two LBP types. LBP/neuro's rate of incongruent Rx filling was modestly, though significantly, greater for opioids, NSAIDs, and other pain medications, whereas LBP/nonneuro's rate on muscle relaxants was slightly greater. None of the group differences between the two types exceeded 3%, with one exception: a 5% difference in the rate of incongruent opioid use among subacute cases. Roughly a fifth of opioid and NSAID use continued beyond the 84-day mark across both types.
Initial Treatment Pattern Effects
Across the five patterns, the greatest rates of incongruence were posted by the Complex MM approach (Table 8). Notwithstanding the small number of cases on which these rates for the Complex MM approach were calculated, the consistency of their greater magnitude relative to that for the other approaches was striking.
The 70% of Complex MM cases reporting their first x-ray on or before the 28-day mark and the nearly 52% reporting their first MRI or CT on or before the 28-day mark were both a highly significant 19+ percentage points greater than the next highest approach, PT. Checks across type and duration of LBP episode replicated the direction and size of these differentials. Similarly, the 58% of Complex MM cases reporting their first MRI or CT on or before the 84-day mark were a highly significant 22 percentage points greater than PT. This same difference held when the test included only LBP/nonneuro cases, for whom the 84-day period was specifically set.
Especially striking were the surgical results, where the Complex MM approach's rates of guideline incongruence far exceeded those of the other approaches (Table 8). Of note, 43% of LBP/nonneuro cases linked to Complex MM reported major back surgeries on or before the 365th day, the waiting period set for this type of back pain. Similarly, almost 24% of eligible LBP/neuro cases reported major back surgeries on or before the 42nd day, the waiting period set for this other LBP type. By contrast, the surgical rates for the other approaches were in the low single digits.
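The timing rule behind the surgical measures can be sketched as follows. This is a minimal illustration, not the study's actual scoring procedure. The two waiting periods (42 and 365 days) come from the text above; the function names, the data layout, and the example episodes are hypothetical, and the study's eligibility rules for the rate denominators are not reproduced.

```python
# Waiting periods in days, as stated in the text; the names are hypothetical.
SURGERY_WAITING_PERIOD_DAYS = {
    "LBP/neuro": 42,
    "LBP/nonneuro": 365,
}


def surgery_incongruent(episode_type, days_to_surgery):
    """Flag a major back surgery as guideline-incongruent.

    days_to_surgery is the episode day on which surgery occurred, or None
    if no surgery was recorded. Surgery on or before the last day of the
    waiting period is scored as incongruent (True).
    """
    if days_to_surgery is None:
        return False
    return days_to_surgery <= SURGERY_WAITING_PERIOD_DAYS[episode_type]


def incongruence_rate(episodes):
    """Percentage of (episode_type, days_to_surgery) pairs flagged incongruent."""
    if not episodes:
        return 0.0
    hits = sum(surgery_incongruent(kind, days) for kind, days in episodes)
    return 100.0 * hits / len(episodes)


# Hypothetical episodes for illustration only (not study data).
example_episodes = [
    ("LBP/neuro", 30),       # before day 42: incongruent
    ("LBP/neuro", 60),       # after day 42: congruent
    ("LBP/nonneuro", 200),   # before day 365: incongruent
    ("LBP/nonneuro", None),  # no surgery recorded: congruent
]

rate = incongruence_rate(example_episodes)
print(rate)
```

The same threshold comparison could be applied to the imaging and medication measures by swapping in their respective day marks.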
The Complex MM approach also recorded the greatest incongruent rates in the medication area. More than 58% of Complex MM cases reported a timeline-excessive course of therapy on at least one of the five measures, with opioids and NSAIDs being the two classes with the greatest rates. This compared to more than 48% of cases reporting at least one timeline-excessive course of therapy for the PT approach, which otherwise reported a similar medication profile.
The one guideline aspect where the Complex MM approach posted the lowest score was the rate of Chiro/PT visits—the measure on which Chiro and PT posted the highest rates. More than 95% of all Chiro cases and nearly half of all PT cases reported their first such visit within 14 days of the start date. Equally striking were the results at the other end of the spectrum. TalkInfo and Complex MM approaches, and to a lesser extent the Dabble approach, all tended to record lower rates.
Similarly, with one exception, higher rates of guideline incongruence were linked to greater costs (Table 9). For both LBP groups, considerable increments differentiated the average total costs of episodes reporting incongruence with one or more of the 11 guideline aspects relative to those episodes reporting no incongruence at year 1. In years 2 and 3, these increments tapered off somewhat but remained substantial.
Incongruence on the major surgery measures led the way in generating these cost differentials for both LBP groups, although both imaging measures produced notable differentials in the same direction (Table 9). Incongruence on the medication measures reported similar cost differentials, consistently at levels that proved even more sustained over time. The lone exception occurred on the “PT/Chiro within 2 weeks” measure, where episodes that reported the first visit within the 2-week mark posted, on average, notably lower costs than those episodes that refrained from having the first visit within the 2-week mark. On this measure as well, comparable differentials were recorded for both types of episodes across all 3 years.
The propensity-derived adjustments for hypothesis testing were framed to yield two types of comparisons (Table 10). For initial treatment pattern predictions, each pattern other than TalkInfo was assessed relative to the TalkInfo approach (the holdout group). For guideline aspect predictions, the comparisons assessed incongruent relative to congruent episodes.
Considered altogether, the adjusted results were highly similar to the unadjusted results for both sets of predictors (Table 6). Comparisons with the previously described corresponding results indicated that the adjustments for age, sex, episode type and start date year, claims lines, comorbidities, prior episode, and left-censored episode status via the propensity techniques made little difference in the direction and magnitude of the estimates relative to the unadjusted totals.
Of the initial treatment patterns, the total costs linked to the Complex MM approach far and away outstripped the total costs for the other approaches, particularly in year 1 but also in years 2 and 3 (Table 10). In year 1, the total costs of the average Complex MM approach episode exceeded the average TalkInfo episode by just more than $14,000 for LBP/neuro cases and more than $17,000 for LBP/nonneuro cases. The corresponding total costs for the next-ranked approach, PT, were approximately $5000 and $3600, respectively. In year 3, the average Complex MM episode still incurred the highest total cost, exceeding the average TalkInfo episode by nearly $2000 for LBP/neuro and nearly $1800 for LBP/nonneuro. Physical therapy again ranked second for both types.
At the other end of the cost continuum, the Chiro approach alone averaged lower total costs than the TalkInfo approach for both types of LBP across all 3 years. Deficits for Chiro relative to the TalkInfo approach ranged from just more than $1700 in year 1 to just more than $1300 in year 3 per episode for LBP/neuro and from $600 to nearly $200 per episode of LBP/nonneuro cases. The Dabble approach posted totals in the mid-ranges across all 3 years for both types.
The adjusted cost totals for the guideline comparisons confirmed one set of clues to these differences (Table 10)—the considerable costs linked to incongruence on one or more of the 10 imaging, surgical, and medication guideline aspects selected for this analysis. The year 1 average total for LBP/neuro episodes reporting incongruence with one or more of the 10 imaging, surgical, or medication aspects was nearly $5500 more per episode than that for LBP/neuro episodes reporting no such incongruence. This differential remained intact, with episodes incongruent on 1+ aspects averaging approximately $1600 more in years 2 and 3. The corresponding totals for LBP/nonneuro episodes were just more than $3200, $1200, and $920 per incongruent episode in years 1, 2, and 3, respectively. The incongruent use of surgeries, imaging, and medications each contributed significantly to these totals in the first year across both types of episodes and in many cases across years 2 and 3 as well. The “PT/Chiro within 2 weeks” aspect again proved to be the lone exception, with differentials linking strikingly lower total costs to incongruent use being recorded on both LBP/neuro and LBP/nonneuro episodes across all 3 years. Rerunning these analyses to test 2 refinements—1) examining first episodes only and 2) lengthening the “before” and “after” intervals from 6 to 12 months—did not alter these conclusions (results available on request).
The other set of clues for the initial treatment pattern cost differences came from the comparisons above linking initial treatment patterns with guideline aspect incongruence rates (Table 8). As noted previously, the Complex MM approach was the most likely to record incongruent imaging, surgery and medication episodes, followed by the PT approach. Across guideline aspects, the lone exception came on the “Chiro/PT visits within 2 weeks” measure, which had the Chiro approach and then the PT approach recording the highest rates.
Combined, these results strongly affirmed both hypotheses. Not only did incongruence on 10 of 11 guideline aspects each show appreciable levels of prevalence and strikingly positive, substantial and lingering relationships with total costs, as per hypothesis #1 (summarized in Fig. 1); the treatment approaches most likely to exhibit incongruent use of imaging, surgery, and medications—Complex MM, then PT—were, in turn, associated with the greatest total costs per hypothesis #2. As summarized in Figures 2 and 3, these treatment approach effects stayed intact across breakouts for acute, subacute, and chronic episodes. The lack of sensitivity to propensity controls underscored the robustness of these findings.
This study, in effect an exploration of the actuarial consequences of clinical decision making, suggests several conclusions for those proactive employers and occupational health personnel that are turning to LBP to improve its care:
1. Back pain is costly, regardless of the type of back pain EEs have or whether they seek treatment for it. It matters little whether the problem with which EEs present is in the low back or cervical or thoracic spine or of neurological or nonneurological LBP origin. And, it matters little whether their care starts out by simply seeking information and advice, or by launching into a sustained course of treatment from physicians, chiropractors, or physical therapists, or by exploring two or more of these types of providers. On average, their direct health care and indirect lost productivity costs will total in excess of $4000 by the end of the episode's first year and continue—albeit at levels that taper off—across years 2 and 3.
2. The initial treatment choices that back pain EEs/patients make in concert with their providers matter. Previous treatment choices they have made will influence their current treatment choice, above and beyond the type of back pain diagnosis they receive or the length of the episode that ensues. These choices, in turn, can have a major differential effect on the total cost of their episode, particularly when these treatment choices are not congruent with certain timing-related aspects of current guidelines for the use of imaging, surgeries, and medications. This finding aligns with the Choosing Wisely initiative of the ABIM Foundation, which encourages physicians, patients, and other health care stakeholders to discuss and utilize the available evidence when determining an individual patient's treatment choices.49
3. Of the initial treatment patterns, the Complex MM approach will, on average, be associated with the greatest total cost—not only in the first year of episodes but also in years 2 and 3—in significant part because the providers practicing the approach are more likely to deploy the expensive medical options of the clinical repertoire in ways that are guideline-incongruent. Those EEs who access the PT approach will likewise be associated with higher costs if they persist with treatment, although a first visit to a PT within the initial 14 days may have cost-reducing effects. The EEs accessing the Chiro approach will tend to be the least expensive because they are less likely to be prescribed medications or end up with complex medical procedures and because they are less likely to record guideline-incongruent use of imaging, procedures, and medications when the latter are delivered.
4. Of all treatment components for LBP, surgery exerts the most potent cost effects. These costs not only are concentrated in year 1 but also can linger into years 2 and 3. Surgeries stand out especially in this regard when conducted prior to the end of the waiting periods recommended by the pertinent guidelines. Of all the procedures assessed in relation to guideline aspects pertaining to LBP/neuro episodes that were examined in this study, surgeries prior to the 6-week mark accumulated the greatest total cost, averaging $33,519 across years 1, 2, and 3 combined. Similarly, of all the procedures examined in relation to guidelines pertaining to LBP/nonneuro episodes, surgeries on or before 365 days accumulated the greatest total cost, averaging $41,580 across the 3 years.
5. Individuals prescribed pain-related medications are more costly, and these expenses can linger into the future if prescription filling is not congruent with recommended guidelines for short-term use. Although much has been written recently about opioids in this regard, this pattern holds for all five Rx classes assessed in relation to guidelines in this study. The potential for improved management of this pattern and its costs holds for all three types of back pain examined here.
6. Although guidelines are structured to reflect the latest thinking and evidence on which processes are most efficient and effective at delivering on average the best patient outcomes at the lowest cost, it pays to subject their implementation to empirical scrutiny. Not all the expectations generated may prove accurate. In 1 of the 11 instances examined here, where the recommendation is that the first visit to PT or chiropractor should be made at least 2 weeks after the episode start date, incongruent care actually led to significantly less total cost. Although this result may not necessarily warrant a revisiting of this particular guideline, it merits the attention of all stakeholder groups with an interest in optimizing its usefulness.
The process used to generate the evidence for these conclusions requires some caveats. Consider, for example, the breakouts by episode duration reported in Table 6, which found notable percentages of guideline-incongruent medication use reported beyond the timeframes specified for acute and subacute episodes. For instance, opioid Rx filling both before and after the 84-day mark was linked to just more than 14% of acute LBP/neuro cases, which by definition last up to 28 days, and just more than 23% of subacute LBP/nonneuro cases, which by definition last up to 84 days.
Such results point to the need to strengthen the longitudinal paradigm used for this study. Take the issue of the same individual incurring more than one episode of back pain. While bringing simplicity to the current work that helped reduce the complexity of the results, the approach used here to treat episodes as independent units of analysis did not delve into the effects associated with serial episodes by the same patient. Or, take the classification of treatment strategies based on just the first 6 weeks of claims activity alone. While also helping reduce the complexity of results, this second feature precluded delving into what happens when initial diagnoses are subsequently modified for the same patient either within or across episodes. Or, take the absence of any controls for changes in clinical context over time—new procedures, new drugs (and marketing), new modalities, etc. Although the variations in rates of use from the first to the second year of episodes reported in Table 2 hint at these contextual changes, they do not reflect the evolution in practice patterns and modalities that other exploratory analyses indicate occurred across the 9-year study period (results available on request). Any of these adjustments in longitudinal approach would likely shed new light on the results reported here.
A second caveat stems from a timing issue. The bulk of the work that led to release of the guidelines selected for this study occurred during the 2007 to 2012 period, whereas the claims activity examined in this study spanned 2001 to 2009. For many of the episodes selected for this analysis, it is reasonable as such to assume that the providers involved either were not aware of the guidelines or probably had no possibility of even knowing that they were under development. For this reason, the characterizations of congruence deployed here have been carefully crafted to refrain from the use of any standard harboring the explicit expectation that the guideline aspects should somehow have been “followed” per se. Future work is needed that establishes context where provider behavior can be fairly assessed as such.
A third caveat concerns the sorting of EEs/patients into different treatment strategies and time frames for the delivery and filling of procedures and medications. It is likely that this sorting is not random. Those with more severe back pain could self-select into more medically intensive strategies. More severe back pain can also prompt earlier imaging and surgeries or other guideline-incongruent choices. The propensity adjustments used in this study each provided statistical control for an extraneous factor arguably directly or indirectly related to disease severity. Yet, in the tests here, self-reported symptom severity was not one such factor. As shown elsewhere,16 future work would benefit from inclusion of such self-reports as a control.
A fourth concern focuses on the likely underestimates of the true health burden of back pain that stem from the use of the study's total cost measure. This measure did not include presenteeism or job performance impaired by back pain. Much previous work has found that presenteeism is, in fact, the largest component of total health burden, not only on the indirect side but also on the direct side.15,50,51 The mix of complexities due to data availability and analytic requirements that led to presenteeism not being included here is a priority issue for future research in this area.
A final concern is less a limitation and more a direction for future research. The comparisons underlying the results reported here pitted episodes in which care was congruent with selected aspects of guidelines versus episodes in which care was not congruent. The comparisons did not take the additional step of directly pitting incongruent episodes versus congruent episodes per se. The trade-off rendered by this decision meant that analytic complexities deemed better left for future exploration in this line of research could be bypassed, but at the expense of curtailing the capacity to report not just losses averted when care was not congruent but also savings achieved when care was congruent. Future work making this distinction will provide a fuller picture of guideline impact.
This said, this study not only replicated the considerable burden of back pain in this workforce in a framework conducive to upgrading the management of care for LBP but also juxtaposed current LBP care with aspects of guidelines for this care and found that care not congruent was associated with much added cost burden. The evidence here would seem to merit a new outreach to company health plans and providers. The focus of this outreach would be the applicability of these findings for improving the health care and productivity loss outcomes of LBP EEs/patients.
To be effective, the process by which guidelines are implemented is complex and requires the input of many parties.52 The employer perspective merits being considered as one such input. Echoing what has been observed elsewhere in contexts for guideline development and adherence (eg, the National Committee for Quality Assurance's HEDIS rules for LBP imaging for health plan performance assessment53 and the Choosing Wisely initiative49), noteworthy rates of x-rays, MRIs/CTs, surgeries, and medications are being provided for both types of LBP during timeframes that are guideline-incongruent. Much added questionable expense to the employer and ultimately its EEs seems to be occurring as a result. In each instance, the implication is that closer adherence to the guideline will lead to considerable cost savings. In contrast, incongruent behavior relative to the “first PT/Chiropractor visit within 2 weeks” guideline is also occurring at an appreciable rate but seems instead to be a significant deterrent for long-term total costs. All else held equal, it seems that the less that care is congruent with this latter guideline, the greater the cost savings that will result.
These considerations are not necessarily more important than the improved clinical efficiencies, coordination of care, or other nonfiscal factors that might result from the incorporation of these guidelines. But, the priorities that employers as health care purchasers are increasingly placing on long-term H&P outcomes are arguably no less substantive or relevant than the priorities of providers, patients, health plans, and other affected stakeholders. Evidence like that produced here, which bears on any group's capacity to manage its value/sustainability challenge in the marketplace, belongs in the discussion.
Coaxing Change in Context
Efforts to promote guideline-induced change to incorporate these implications do not, of course, take place in a vacuum. Any such effort in fact would be well served by taking into account the larger context where possible. At the employer for this study, the aggregate-level reductions in total costs per active employee from 2001–2002 to 2008–2009 noted previously encompassed widespread drops across healthy and disease groups. One such area was back problems, for which a drop of just more than $3500 per EE/patient (excluding presenteeism and absenteeism) was recorded.
Among the broad array of disease management, benefit, and prevention initiatives that contributed to these cost reductions for back problems specifically was a staged musculoskeletal educational intervention.54 During 2002 to 2004, the company targeted physicians providing musculoskeletal care at a major company facility with this program. Its features included the assignment of a carefully selected physician/nurse team on-site to each new disability case, usually within 1 to 3 days of the injury. Tests of program effect at the intervention site found that mean days lost per work-related injury dropped from 35.1 to 27.6 and that mean annual indemnity per work-related injury decreased from $9327 to $4493. During the next 5 years, much effort was made to broaden the program to other sites throughout the company. The sharp drops in disability incidents and costs from 2001–2002 to 2008–2009 subsequently found for the aggregate workforce across the company's US sites15 reflect its apparent success in this regard.
Yet, systematic efforts to implement and refine guidelines for LBP care were not a particular focus for this educational intervention or, for that matter, the broad array of management, benefit, and preventive initiatives undertaken by the company from 2001 through 2009. A trend that would seem to register this lack of focus was the little movement shown by the incongruent “hit” rates on the guideline aspects. From the 18-month period from July 1, 2001, to December 12, 2002, to the 18-month period beginning January 1, 2007 (both selected to maximize the observations for pre-post comparisons in this data set), the only aspect to post a significant drop in incongruent rate was the time-excessive NSAID Rx measure (for LBP/neuro: 28.0% [2001–2002] vs 13.6% [2008–2009]; t = 4.1, P = 0.00; for LBP/nonneuro: 23.0% [2001–2002] vs 12.4% [2008–2009]; t = 6.9, P = 0.00). All of the 16 other one-sided t tests for reductions across the set of guidelines applicable to each LBP group found either no change or a modest increase of three percentage points or less. Much potential for promoting well-informed guideline adherence would seem possible in this setting.
Although employers and their occupational health personnel are not necessarily looking to test the accuracy or appropriateness of guidelines for LBP care per se, they are seeking to better manage the long-term total cost outcomes of their LBP EEs/patients. Do guidelines hold much promise for advancing this objective? The bottom-line answer from this study is an empirically based yes.