The Spine Blog
Friday, January 30, 2015

While the accuracy of administrative billing databases may not seem like the most exciting spine-related topic, the proliferation of literature based on these databases makes it relevant. Level 1 RCTs will always represent the gold standard in evidence-based medicine; however, these studies are very expensive to perform and usually answer only one specific question in a highly selected population. As such, spine researchers have turned to administrative billing databases in order to capture large numbers of “real-world” patients at relatively low cost. In order for the results of such analyses to be valid, the coding needs to accurately reflect the clinical situation. To address this question of the accuracy of administrative billing data, Dr. McGuire and his colleagues from Boston used an administrative coding algorithm to classify surgical indication and technique and then compared the findings of the algorithm to the indications and technique documented in the medical record. They identified 477 patients who had undergone lumbar spine surgery at one institution and classified the indication for surgery as degenerative disk disease, disk herniation, spinal stenosis, spondylolisthesis, or scoliosis. Surgical technique was classified as fusion or no fusion, and fusion patients were subclassified as combined anterior-posterior vs. single approach, instrumented vs. uninstrumented, and one or two level vs. three or more level fusion. Using the chart review as the gold standard, they found the coding algorithm had a sensitivity and specificity of over 80% for every diagnosis other than spinal stenosis (sensitivity 33%) and degenerative disk disease (sensitivity 72%). The accuracy was excellent (sensitivity/specificity > 95%) for determining if a fusion was performed and for surgical approach (i.e. combined AP vs. single approach). However, the accuracy dropped off substantially in identifying the use of instrumentation and number of levels fused.


This is a helpful study as it lets the reader get a sense of which aspects of administrative database studies are likely valid and which ones should be questioned. The algorithm was relatively accurate for diagnoses other than spinal stenosis. This is in contrast to a similar study that evaluated the algorithm compared to the diagnosis assigned to patients in the Spine Outcomes Research Trial (SPORT), which found a sensitivity of 88% for spinal stenosis.1 The cause of the discrepancy between the two studies is not clear, but the algorithm used a “hierarchical” approach that assumed fusion should be relatively uncommon for spinal stenosis. While this held true for SPORT, in which 11% of stenosis patients underwent fusion,2 58% of the stenosis cohort in the current study underwent fusion. Another possibility is that patients who were documented as having stenosis in the chart actually had listhesis or scoliosis that could have been detected had the radiographs been reviewed. The algorithm was much more accurate in assigning a diagnosis to spondylolisthesis or scoliosis patients, where the hierarchical model assumed that fusion would be performed on most of these patients. The poor specificity in determining the use of instrumentation was surprising. There are straightforward CPT codes for lumbar instrumentation, and one would assume that these would be coded accurately. However, only 28 fusion patients in this cohort underwent uninstrumented fusion, and the algorithm inappropriately classified 12 of these patients as instrumented cases. It is possible that the low numbers involved make this finding less robust. The poor accuracy for capturing multilevel fusions is also surprising given that CPT codes for numbers of levels fused are also unambiguous. It is possible that the use of ICD-9 codes, which are less specific, may have reduced the accuracy. This paper indicates that the accuracy of administrative billing databases varies according to what is being coded.
The discrepancy with the findings of the study using the SPORT data also indicates that coding is a local phenomenon, and the accuracy and precision of coding likely vary substantially across hospitals. This study reminds us to be cautious when evaluating administrative database studies and echoes concerns voiced previously about the accuracy of ICD-9 coding.3
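
As a rough illustration of why the small number of uninstrumented cases makes the instrumentation finding fragile, the figures quoted above (28 uninstrumented fusions, 12 of them coded as instrumented) can be turned into a specificity estimate with a confidence interval. This sketch is not taken from Dr. McGuire's paper. The function names are made up for the example, "instrumented" is assumed to be the positive class, and the Wilson score interval is simply one common choice for a proportion from a small sample.

```python
import math


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of truly negative cases that the algorithm also labels negative."""
    return true_negatives / (true_negatives + false_positives)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Figures quoted in the post; "instrumented" is treated as the positive class.
uninstrumented_total = 28    # chart review: fusion without instrumentation
coded_as_instrumented = 12   # false positives from the coding algorithm
correctly_coded = uninstrumented_total - coded_as_instrumented  # true negatives

spec = specificity(correctly_coded, coded_as_instrumented)
low, high = wilson_interval(correctly_coded, uninstrumented_total)
print(f"specificity = {spec:.2f}, approx. 95% CI {low:.2f} to {high:.2f}")
```

With only 28 patients in the denominator, the point estimate of 16/28 (about 57%) comes with an interval spanning several tens of percentage points, which is consistent with the caution above about low numbers.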


Please read Dr. McGuire’s study in the January 15 issue. Does this article change how you view papers based on administrative billing data? Let us know by leaving a comment on The Spine Blog.


Adam Pearson, MD, MS

Associate Web Editor



1.            Martin BI, Lurie JD, Tosteson AN, et al. Indications for spine surgery: validation of an administrative coding algorithm to classify degenerative diagnoses. Spine (Phila Pa 1976) 2014;39:769-79.

2.            Weinstein JN, Tosteson TD, Lurie JD, et al. Surgical versus nonsurgical therapy for lumbar spinal stenosis. The New England journal of medicine 2008;358:794-810.

3.            Golinvaux NS, Bohl DD, Basques BA, Fu MC, Gardner EC, Grauer JN. Limitations of administrative databases in spine research: a study in obesity. Spine J 2014;14:2923-8.


Friday, January 23, 2015

There have been relatively few long-term comparative studies evaluating surgical and non-operative outcomes for lumbar spinal stenosis (SpS). Dr. Lurie and his colleagues published the 8 year data from the Spine Patient Outcomes Research Trial (SPORT) SpS cohort in the January 15 issue. Most readers are familiar with the design of the trial, which included a simultaneous RCT and observational study comparing surgical and non-operative outcomes for SpS patients. Analysis of the RCT with the traditional intent to treat (ITT) methodology was hampered by the high rate of crossover:  only 70% of patients assigned to surgery actually had surgery within 8 years, while 52% of patients assigned to non-operative treatment did have surgery. As such, the primary analysis was an as-treated (AT) analysis, which was performed separately for the RCT and observational groups. In the randomized group, surgery patients had significantly better patient-reported outcomes through 4 years, though the benefit of surgery was no longer significant after 5 years. In contrast, the surgery patients in the observational cohort were still doing significantly better than their non-operative counterparts even at 8 years. Both analyses controlled for baseline differences. Overall, there was an 18% re-operation rate, with 10% of surgical patients undergoing an additional operation for recurrent stenosis or progressive spondylolisthesis. There was a high loss to follow-up, with only 55% of the RCT and 52% of the observational cohort patients included in the 8 year analysis.

This report reflects the difficulty of long-term studies comparing surgery to non-operative treatment. Not surprisingly, loss to follow-up was nearly 50%, and those who were lost were significantly different from those who did follow up; they were older, sicker, less educated, and had worse outcomes at 2 years. The authors suggested that this likely led to an overestimate of the absolute degree of improvement at 8 years, but that the treatment effect estimate was likely accurate given that those lost to follow-up did worse with both surgical and non-operative treatment. Of the randomized patients assigned to non-operative treatment, 52% had surgery within 8 years, indicating that patients are unlikely to remain compliant with randomization if non-operative treatment fails. Even among the observational cohort patients who initially chose non-operative treatment, 25% had gone on to surgery within 3 years. The combination of loss to follow-up and crossover, which was inevitable in this study design, makes interpreting the outcomes difficult, especially since the results were substantially different in the RCT AT analysis and the observational cohort. The authors suggested that the RCT AT analysis might be less affected by selection bias than the observational cohort, as baseline differences between the surgical and non-operative groups were less pronounced (though still significant) in the RCT AT analysis. This view led them to conclude that the benefit of surgery likely decreases beyond 5 years. Similar results were reported in the Maine Lumbar Spine Study, which showed diminution of the advantage of surgery from 5 to 10 years, though some significant differences did persist in the long term.1 On the whole, the literature does suggest that surgery results in better outcomes out to 5 years but that the benefits of surgery likely erode over time as patients age and develop recurrent stenosis, spondylolisthesis, and adjacent segment degeneration. In a perfect world, an RCT would be performed with no crossover and 100% follow-up for twenty years. In reality, SPORT has likely done as well as is possible in a study of human subjects, and the authors and the participating patients should be congratulated on their efforts to produce long-term data.


Please read Dr. Lurie’s article in the January 15 issue. Does this change your views of long-term outcomes in spinal stenosis? Let us know by leaving a comment on The Spine Blog.

Adam Pearson, MD, MS

Associate Web Editor  




1.            Atlas SJ, Keller RB, Wu YA, Deyo RA, Singer DE. Long-term outcomes of surgical and nonsurgical management of lumbar spinal stenosis: 8 to 10 year results from the Maine Lumbar Spine Study. Spine 2005;30:936-43.



Saturday, January 17, 2015

Interspinous devices such as X-Stop were designed to provide a less invasive alternative to laminectomy for elderly stenosis patients at increased risk for perioperative complications. While initial industry-sponsored trials comparing X-Stop to non-operative treatment were promising, subsequent independent studies comparing X-Stop to laminectomy showed high reoperation rates for the interspinous devices.1,2 Dr. Lonne and his colleagues from Norway designed an RCT in which 180 patients were to be randomized to X-Stop or direct decompression using a less invasive technique in which the midline structures were preserved. Patients were between 50 and 85 years old, presented with neurogenic claudication that was relieved with flexion, and had one to two level stenosis. Patients with Grade 1 degenerative spondylolisthesis were allowed to be included. Their power analysis indicated 180 patients should be enrolled. After enrolling 96 patients, the midpoint analysis revealed that the odds of re-operation in the X-Stop group were over 6-fold greater than in the direct decompression group, so the study was stopped early. Twenty-five percent of X-Stop patients had reoperation for persistent or recurrent symptoms compared to 5% in the direct decompression group. Patient-reported outcomes between the two groups were not significantly different out to two years, though the comparisons were underpowered due to stopping the study at 50% enrollment. Five percent (2/41) of the direct decompression group had an incidental durotomy, while 7% (3/41) had an epidural hematoma requiring re-operation. One of the latter patients had persistent cauda equina symptoms 2 years post-operatively. Two X-Stop patients had spinous process fractures and one had dislocation of the device.
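
For readers who want to see how a 25% versus 5% re-operation rate translates into odds "over 6 fold greater," here is a back-of-the-envelope sketch. It uses only the rounded percentages quoted above rather than the trial's patient counts, so it will not reproduce the exact estimate in Dr. Lonne's paper, and the function names are invented for this illustration.

```python
def odds(probability: float) -> float:
    """Convert a probability into odds: p / (1 - p)."""
    return probability / (1 - probability)


def odds_ratio(p_group_a: float, p_group_b: float) -> float:
    """Odds of the event in group A divided by the odds in group B."""
    return odds(p_group_a) / odds(p_group_b)


# Rounded re-operation rates quoted in the post (not the raw trial counts).
xstop_reop = 0.25
decompression_reop = 0.05

ratio = odds_ratio(xstop_reop, decompression_reop)
relative_risk = xstop_reop / decompression_reop

print(f"odds ratio ~ {ratio:.1f}, relative risk ~ {relative_risk:.1f}")
```

The odds ratio works out to (0.25/0.75) / (0.05/0.95), a little over 6, while the simple ratio of the two rates is 5. The two measures diverge more as the event becomes more common, which is worth keeping in mind when a paper reports one and a reader thinks in terms of the other.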


The authors of this well-designed RCT made the appropriate decision to stop the study when the exceedingly high re-operation rate in the X-Stop group was detected. The re-operation rates were very similar to those in a prior RCT comparing X-Stop to laminectomy (26% and 6%, respectively).1 While most X-Stop patients improve initially, the rate of recurrent symptoms is quite high and seems likely due to subsidence of the device or spinous process fracture in a population with high rates of osteopenia. The current study can be criticized for including patients with low grade degenerative spondylolisthesis (DS), as a prior paper has reported an increased risk of spinous process fracture in the DS population.3 Additionally, decompressing DS patients without performing a fusion is controversial, and the study would have been cleaner had this group been excluded. The 7% rate of epidural hematoma leading to re-operation seems high compared to the 1% rate reported in the Spine Patient Outcomes Research Trial.4 Medical complications such as myocardial infarction, pneumonia, and DVT were not reported, and these may have been less common in the X-Stop group. This study adds to the growing literature indicating that X-Stop has an unacceptably high reoperation rate compared to laminectomy without much demonstrable benefit.


Please read Dr. Lonne’s article on this topic in the January 15 issue. Does this change how you view interspinous devices? Let us know by leaving a comment on The Spine Blog.

Adam Pearson, MD, MS

Associate Web Editor






1.            Stromqvist BH, Berg S, Gerdhem P, et al. X-Stop Versus Decompressive Surgery for Lumbar Neurogenic Intermittent Claudication: Randomized Controlled Trial With 2-Year Follow-up. Spine 2013;38:1436-42.

2.            Zucherman JF, Hsu KY, Hartjen CA, et al. A prospective randomized multi-center study for the treatment of lumbar spinal stenosis with the X STOP interspinous implant: 1-year results. Eur Spine J 2004;13:22-31.

3.            Kim DH, Shanti N, Tantorski ME, et al. Association between degenerative spondylolisthesis and spinous process fracture after interspinous process spacer surgery. The spine journal : official journal of the North American Spine Society 2012;12:466-72.

4.            Weinstein JN, Tosteson TD, Lurie JD, et al. Surgical versus nonsurgical therapy for lumbar spinal stenosis. The New England journal of medicine 2008;358:794-810.


Friday, January 09, 2015

There has been a greater focus on cost-effectiveness analysis (CEA) across medical disciplines lately due to a push to consider the value of care. The most rigorous and widely accepted form of CEA is that in which the incremental cost and benefit of an intervention are compared to those of another less costly, less effective intervention in order to determine the incremental cost-effectiveness ratio (ICER). This allows for comparison of ICERs for interventions across different fields and is ultimately used by some centralized health systems to guide decisions about which treatments will be provided. Dr. Nwachukwu and his colleagues from New York reviewed the available CEA literature for spine surgery in order to assess the state of knowledge as well as the quality of the studies. They identified 20 studies that performed CEA evaluating various aspects of spine surgery, with 14 being published after 2010. The vast majority focused on lumbar interventions. The quality of studies varied substantially; only 6 of the 14 lumbar studies included a comparison treatment, and only 4 computed the ICER of the intervention compared to non-operative treatment. In general, most studies found spine surgery to be cost-effective when compared to non-operative treatment. The one major exception was decompression and fusion for spinal stenosis without listhesis, with the SPORT CEA reporting an ICER of $258,200 for this intervention (the ICER for decompression alone was $47,900). Short time horizons were the major limitation across studies, which likely resulted in underestimating the cost-effectiveness of interventions as their benefit likely persisted beyond the typical 2 to 4 year period that was studied.
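
For readers less familiar with the arithmetic behind an ICER, a minimal sketch follows. The ratio is the extra cost of the more effective option divided by the extra benefit it provides. Benefit is commonly measured in quality-adjusted life years (QALYs), which is an assumption here since the post does not name the unit. The dollar and QALY figures in the example are entirely hypothetical and are not drawn from SPORT or from Dr. Nwachukwu's review.

```python
def icer(cost_new: float, cost_old: float,
         effect_new: float, effect_old: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra unit of benefit."""
    delta_effect = effect_new - effect_old
    if delta_effect <= 0:
        # The ratio is only meaningful when the new option is more effective.
        raise ValueError("new intervention must be more effective than the comparator")
    return (cost_new - cost_old) / delta_effect


# Hypothetical numbers for illustration only (not from any study).
surgery_cost, nonop_cost = 30_000.0, 10_000.0   # dollars per patient
surgery_qalys, nonop_qalys = 1.50, 1.00         # QALYs over the study horizon

example = icer(surgery_cost, nonop_cost, surgery_qalys, nonop_qalys)
print(f"ICER = ${example:,.0f} per QALY gained")
```

The same formula shows why short time horizons matter: if the benefit of surgery keeps accruing after the study ends while most of the cost is incurred up front, the denominator grows and the ICER falls.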


The field of spine CEA is relatively immature, with the majority of studies having been performed in the last 5 years. Few studies have looked at the cost-effectiveness of surgery compared to non-operative treatment, which is the big-picture question about which most payers are concerned. The studies that did compare surgery to non-operative treatment were for lumbar disk herniation, lumbar stenosis, and cervical radiculopathy. Isthmic spondylolisthesis, cervical myelopathy, and deformity have not been well studied from a cost-effectiveness standpoint. Many analyses have focused on technical variations (e.g. comparing bone graft materials, fusion techniques, or MIS vs. open surgery), which are important to consider but are probably less pressing for healthcare systems than the larger cost issues such as whether to operate or to include fusion with lumbar decompression. This review raises the question of whether or not CEA is relevant in the United States’ fee-for-service system. Given that no effort has been made to incentivize cost-effective care, CEA is unlikely to directly affect treatment decisions in the US now or in the near future. The current fee-for-service model tends to incentivize the most expensive procedures. Until reimbursement policies take cost-effectiveness into account, CEA will likely remain an academic exercise.


Please read Dr. Nwachukwu’s article in the January 1 issue. Does this change your view of CEA in spine surgery? Let us know by leaving a comment on The Spine Blog.

Adam Pearson, MD, MS

Associate Web Editor

Thursday, January 01, 2015

Much has been written about cord signal changes (CSC) and outcomes following surgery for cervical spondylotic myelopathy (CSM), with some authors suggesting that they indicate more severe disease and portend worse surgical outcomes compared to patients without CSC. However, the relationship between CSC and physical exam findings has not been well explored. Dr. Nemani and colleagues from St. Louis and New York examined the topic by reviewing physical exam findings of 43 patients with CSC who had undergone surgery. The majority (72%) had a diagnosis of CSM, though patients with OPLL, DISH, herniated disk, and trauma were also included. They evaluated patients for the presence of concordant reflexes (defined as normal reflexes cranial to the CSC, hyporeflexia at the levels with CSC, and hyperreflexia caudal to the CSC), clonus, Hoffman sign, Romberg sign, and gait dysfunction. They found that only 26% of patients had concordant reflexes, though two thirds had some degree of hyperreflexia. Patients with CSC cranial to C4 were significantly more likely to have concordant reflexes than patients with CSC caudal to C4. Two thirds had a positive Hoffman sign, and 60% had gait dysfunction. Far fewer had clonus or a positive Romberg sign. Hyperreflexia was correlated with Hoffman sign, clonus, and gait dysfunction.


This novel study was most remarkable for what it failed to show: contrary to the authors' hypothesis, reflex changes were generally not concordant with CSC. Given their strict definition of concordant reflexes, it is not surprising that the “textbook” combination of normo-, hypo-, and hyperreflexia was rarely seen. Indeed, at C5-6, the only level for which concordance requires normo-, hypo-, and hyperreflexia, only one of twelve patients was found to have reflexes concordant with the CSC. Likewise, concordance was more common when the CSC was cranial to C4, most likely because concordance at those levels only required general hyperreflexia throughout the upper extremities. While the majority of patients with CSC did have Hoffman sign, clonus, gait dysfunction, or hyperreflexia, a substantial minority of patients with CSC did not exhibit those findings. This study could have been strengthened by including a group of CSM patients without CSC to see if the abnormal exam findings were more common in the presence of CSC. Also, a group of asymptomatic patients with CSC would have been interesting to evaluate. This paper is unlikely to directly affect how we evaluate and treat patients with CSC; however, it does serve as a strong reminder that the diagnosis of CSM is based on a combination of history, physical exam findings, and advanced imaging.


Please read Dr. Nemani’s article in the January 1 issue. Does this affect how you see the relationship between CSC and physical exam findings? Let us know by leaving a comment on The Spine Blog.

Adam Pearson, MD, MS

Associate Web Editor

About the Blog

Spine Journal
This Blog provides a forum for discussion about high impact articles published in Spine, including the bi-annual publication of "Evidenced-Based Recommendations for Spine Surgery." Website users can use this forum to discuss how the articles have affected their practice and query the authors about their findings and recommendations.