A randomized trial of vertebroplasty for painful osteoporotic vertebral fractures.
Buchbinder R, Osborne R, Ebeling P, et al. N Engl J Med 2009;361:557–68.
Osteoporotic vertebral body compression fractures (VCFs) are extremely common among older individuals and are associated with significant axial pain and disability. The vast majority of these injuries will resolve over time with noninvasive measures, including pain control, bed rest, and brace immobilization. However, frail patients or those with significant medical comorbidities may develop adverse health effects during convalescence, and others may experience persistent symptoms that ultimately prove to be refractory to these types of methods. In these instances, cement augmentation procedures, such as vertebroplasty and kyphoplasty, have gained widespread acceptance as treatments for VCF, which may not only bring about significant pain relief but also facilitate functional recovery. A number of prospective studies have advocated vertebroplasty as a safe and effective strategy for addressing VCF, which may give rise to improved early outcomes relative to conservative care1–4; these favorable results were also corroborated by a recent meta-analysis, which concluded that there is evidence suggesting that vertebroplasty may be superior to medical management for these injuries, at least in the short term.5 Although these reports characterized the potential advantages of vertebroplasty, all of these investigations were subject to certain methodologic flaws common in surgical trials, including the lack of blinding and placebo controls. In an attempt to overcome these inadequacies, Buchbinder et al performed a multicenter, prospective, randomized, double-blind, placebo-controlled clinical trial involving a series of patients who underwent either vertebroplasty or a sham intervention to further elucidate the benefits of this technique during the first 6 months of follow-up.6
Subjects with <12 months of axial pain secondary to 1 or 2 painful osteoporotic VCF confirmed by magnetic resonance imaging (MRI) were randomly assigned to undergo either vertebroplasty or a similar procedure without the use of cement. At the completion of the study, both treatment arms exhibited considerable reductions in overall pain. Nevertheless, vertebroplasty did not give rise to any significant advantages in any of the measured outcomes at any of the time points up to 6 months after treatment; similar improvements were reported by both groups in terms of pain scores, physical functioning, quality of life, and subjective assessments. There were also no differences between the relative incidences of subsequent fractures during the follow-up period. The authors concluded that vertebroplasty did not confer any obvious benefits compared with a simulated intervention, thereby calling into question the value of this therapy.
Although this is a multicenter, prospective, randomized, double-blind, placebo-controlled clinical trial, there are some striking limitations to the experimental design, which significantly downgrade the impact of the study. The majority of the problems revolve around the inclusion and exclusion criteria and recruitment issues frequently encountered in surgical randomized clinical trials. For instance, this investigation included patients with up to 1 year of pain; however, the majority of VCF would be expected to heal within several months; so, the efficacy of vertebroplasty is likely diminished for individuals whose fractures may have begun to consolidate or whose pain may be coming from another source. Over 60% of patients had experienced pain for >6 weeks before the intervention, and there are insufficient data regarding the degree of fracture union; so, it is difficult to determine whether these authors examined the efficacy of vertebroplasty for treating acute fractures or simply assessed its utility for alleviating persistent pain in united fractures.
Furthermore, clinical findings, such as tenderness to palpation over the spinal column, may be indicative of a fracture that has not fully resolved, but the role of physical examination was not clearly elucidated in the protocol. Although MRI may certainly provide important information about the status of VCF, its diagnostic accuracy for establishing the age of these injuries is still a matter of considerable debate; so, it is unclear whether all of these fractures were actually “acute.”
Of the 219 individuals who were found to be eligible to participate in this investigation, only 78 (36%) were enrolled and underwent randomization, which raises concerns regarding a selection bias and its overall generalizability. Given the relatively low rate of enrollment, it is conceivable that patients with partially healed VCF who were in less pain may have been more willing to enter these investigations, which could have reduced the therapeutic effect of vertebroplasty observed in this analysis. Furthermore, only 71 subjects were successfully followed up until the 6-month time point, which is not necessarily a large sample for comparison. All patients in the study were accounted for during follow-up; however, there was no specific mention of the number of patients who crossed over from 1 group to the other. The most concerning issue is that the authors did not report any data for the eligible patients who declined to participate, which is an essential safeguard for any randomized, controlled clinical trial. This information may have shed some light on any type of selection or volunteer biases that may have occurred.
Another controversial point is related to the considerable ambiguity regarding the etiology of low back pain. Although billed as a “sham” procedure, the injection of local anesthetic in close proximity to the zygapophyseal joints may serve to reduce back pain secondary to facet arthritis, which could have also contributed to the equivalent results exhibited by the 2 treatment arms. Furthermore, the trial compares a sham procedure with vertebroplasty but does not assess the natural history of these fractures; conversely, considering a sham treatment and natural history to be one and the same would certainly contravene the theory behind a placebo response.
A final concern is the authors' handling of the primary outcome measure, which was the amount of pain relief present 3 months after the procedure. Sample size calculations were performed before the study, which was initially designed to detect a change of 2.5 on a 10-point scale. However, with these types of scoring instruments, it has been suggested that the minimally clinically important difference is actually 1.5 (reference 7); as such, this analysis may not have been adequately powered to distinguish small but still important differences between the 2 cohorts. The effect size observed in this study was very small at 3 months (2.6 visual analog scale [VAS] score improvement in the vertebroplasty patients and 1.9 for the control group) compared with that observed by Wardlaw et al 8 (4.1 with kyphoplasty and 2.3 for nonoperative therapies) or Rousing et al 9 (6-point improvement in both the vertebroplasty and control cohorts). This discrepancy may largely be attributed to the disparate patient samples and the age of the fracture at the time of enrollment. The duration of symptoms in the Rousing study was <2 weeks in 80% of patients; in contrast, in the Wardlaw study, the mean age of the fracture was estimated to be 6 weeks, whereas 68% of patients in the Buchbinder investigation complained of pain that had lasted anywhere from 6 weeks to 1 year. As suggested by the favorable outcomes of the control group in the Rousing article, we can conclude that when left alone, patients with VCF will almost certainly exhibit improvements in their VAS scores. If patients are enrolled after a longer duration of symptoms, they will have less opportunity to develop any further improvements in their VAS score after the intervention.
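The power concern raised above can be made concrete with the standard normal-approximation sample size formula for comparing two means. The following is a minimal sketch and not the trial's actual calculation: the standard deviation of 3.0 points, the two-sided alpha of 0.05, the 80% power, and the function name `n_per_group` are all illustrative assumptions that do not come from the article. It compares the number of patients per arm needed to detect the 2.5-point difference the trial was designed for with the number needed for a smaller difference of roughly 1.5 points.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect a mean difference `delta`.

    Normal-approximation formula for two equal-sized groups:
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


ASSUMED_SD = 3.0  # hypothetical SD of pain scores; not reported in this review

# Difference the trial was designed to detect (2.5 points on a 10-point scale)
n_design = n_per_group(delta=2.5, sd=ASSUMED_SD)
# Smaller, still clinically important difference (about 1.5 points)
n_mcid = n_per_group(delta=1.5, sd=ASSUMED_SD)

print(f"per arm for 2.5 points: {n_design}; per arm for 1.5 points: {n_mcid}")
```

Whatever standard deviation is assumed, the required sample size scales with the inverse square of the target difference, so aiming for 1.5 rather than 2.5 points requires roughly (2.5/1.5)^2, or about 2.8 times, as many patients per arm.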
Recommendation on Impact to Clinical Practice
Despite its significant shortcomings, this multicenter, prospective, randomized, double-blind, placebo-controlled clinical trial still offers a higher degree of evidence regarding the treatment of osteoporotic VCF with vertebroplasty although it does not provide level 1 data. Although Buchbinder et al have certainly cast some doubts about the efficacy of this technique for the particular patient groups included in their study, these findings are not generalizable to all VCF. Therefore, we conclude that there is not sufficient justification to completely abandon vertebroplasty as a method for addressing osteoporotic VCF. We believe that there is now a weak recommendation for incorporating these results into clinical practice such that it seems as if certain individuals will not benefit from vertebroplasty but others may still be appropriate candidates based on personal preference, surgeon experience, and the best available literature.
A randomized trial of vertebroplasty for osteoporotic spinal fractures.
Kallmes D, Comstock B, Heagerty P, et al. N Engl J Med 2009;361:569–79.
Vertebroplasty is an intervention that is intended to alleviate the pain associated with VCFs and enhance the functional recovery of these patients. Although the value of this technique for osteoporotic VCF has been preliminarily established by multiple case series, a number of observational and cohort studies, as well as a randomized, controlled clinical trial, there continues to be a paucity of high-quality evidence supporting the use of vertebroplasty for this application.1–4 Certainly, the relatively favorable natural history of these injuries underscores the need to use a control group to more accurately characterize the efficacy of this intervention.10 Similar to Buchbinder et al 6 who recently evaluated the short-term effects of vertebroplasty, Kallmes et al 11 also published the interim results of a multicenter, prospective, randomized, blinded, placebo-controlled clinical trial comparing vertebroplasty with a sham procedure as part of the Investigational Vertebroplasty Safety and Efficacy Trial.
In this study, subjects with 1 to 3 painful osteoporotic VCF were randomly assigned to treatment consisting either of vertebroplasty or a simulated injection without cement. At 1 month, several different outcome measures were assessed to quantify any improvements in the pain and disability related to their fractures. Of note, participants were permitted to cross over to the other cohort after 1 month. Although there was a trend toward more clinically meaningful pain relief among vertebroplasty patients, there were no significant differences between any of the clinical scores of the 2 groups at the 1-month time point. Nevertheless, the crossover rate was significantly higher for individuals who had undergone the sham procedure compared with those managed with vertebroplasty.
A priori calculations were performed and subsequently modified after recruitment difficulties were encountered to ensure that this study was sufficiently powered to address the primary outcomes. Unfortunately, this investigation shares many of the same methodologic shortcomings as the Buchbinder et al investigation, such as the inclusion of patients with up to 1 year of symptoms and eliminating subjects that required hospitalization for pain control. However, the most glaring shortcoming was a very low rate of enrollment, which led to the exclusion of a large number of otherwise appropriate candidates. In fact, the authors screened >1800 individuals to identify 431 potential subjects but only 131 (30%) of them were actually enrolled into the study, which one must assume introduced a selection bias that probably confounded the results observed in the analysis. If the authors had provided the clinical characteristics of the eligible patients who did not elect to participate, the actual degree of selection bias could have been better elucidated. For example, it is conceivable that the group declining randomization exhibited different amounts of pain or possessed pathology that may have been either more or less likely to respond to vertebroplasty. This cohort represents the majority of subjects who were considered for this trial; so, how can any reliable conclusions regarding VCF be established from the minority sample?
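The enrollment figures quoted for the two vertebroplasty trials can be checked with simple arithmetic. The short sketch below uses only the counts stated in this review (78 of 219 eligible patients for Buchbinder et al, 131 of 431 for Kallmes et al); the function and variable names are illustrative.

```python
def enrollment_rate(enrolled, eligible):
    """Share of eligible patients who were actually randomized, in percent."""
    return 100 * enrolled / eligible


# (enrolled, eligible) counts as quoted in the text of this review
trials = {
    "Buchbinder et al": (78, 219),
    "Kallmes et al": (131, 431),
}

rates = {name: enrollment_rate(e, n) for name, (e, n) in trials.items()}

for name, rate in rates.items():
    enrolled, eligible = trials[name]
    not_randomized = eligible - enrolled
    print(f"{name}: {enrolled}/{eligible} enrolled ({rate:.0f}%), "
          f"{not_randomized} eligible patients not randomized")
```

In both trials, the eligible patients who were not randomized outnumber those who were, which is the basis for the selection bias concern.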
Once again, the importance of physical examination findings, such as the presence or absence of pain on palpation over the affected levels, is not discussed in their protocol. Similarly, the “sham” treatment that was performed also involved the injection of local anesthetic into the posterior elements, which certainly could have given rise to unanticipated therapeutic consequences and blunted the benefits of vertebroplasty. Aside from being subject to the lack of consensus regarding the radiographic definitions of an “acute” injury, with this protocol, the decision to obtain either an MRI or bone scan for the purpose of determining the age of the fractures was left solely to the discretion of the evaluating practitioner. Unlike Buchbinder et al, these authors did not use prognostic stratification according to the age of the fracture, which is clearly a relevant limitation of this study because the pain arising from these types of bony injuries generally diminishes over time.
Recommendation on Impact to Clinical Practice
Although Kallmes et al raise a number of concerns about the efficacy of percutaneous cement augmentation as a method for treating osteoporotic VCF, we believe that the findings of this investigation do not substantiate their conclusion that the clinical outcomes after vertebroplasty are equivalent to those of a sham treatment. Randomized, controlled clinical trials in surgery are extremely challenging on many fronts, and these investigators must be applauded for completing this investigation. Although this study provides level 1 data based on its methodologic design, its limitations necessitate that its results be downgraded to level 2 evidence. Without question, additional well-designed clinical trials must be completed on a larger scale with more relevant patients to corroborate the validity of this and the accompanying New England Journal of Medicine investigation; until these data become available, we maintain that a weak recommendation to consider changes to current clinical practice is warranted.
Tubular diskectomy versus conventional microdiskectomy for sciatica: a randomized controlled trial.
Arts MP, Brand R, van den Akker ME, et al. JAMA 2009;302:149–58.
Lumbosacral radiculopathy syndromes are some of the most common conditions encountered by spine surgeons worldwide. Although they often carry a favorable prognosis even without surgical intervention, a minority of patients will require surgical decompression. Surgical techniques for nerve root decompression and subtotal diskectomy have evolved since Walter Dandy's12 first report in 1929. Mixter and Barr13 compiled all available surgical case reports in 1934, describing a procedure with generous laminectomy and transdural removal of the offending disc herniation. Since that time, surgical progress has been predicated on technologies that have helped reduce invasiveness. Yasargil14 and Caspar15 independently published the use of the operating microscope for lumbar disc surgery in 1977. Although initially met with scepticism, this technique has become a standard operative technique for many in the treatment of herniated lumbar discs, and numerous series have since described surgical success rates to be in the range of 88% to 98.5%. In 1997, Foley and Smith introduced the technique of transmuscular tubular diskectomy, incorporating a muscle-splitting technique that was expected to reduce tissue trauma, reduce postoperative pain, and accelerate recovery without compromising effectiveness.16 Although there have been 4 randomized controlled trials comparing tubular diskectomy with conventional microdiskectomy, this study represents the first multicenter, double-blind randomized trial evaluating outcome and recovery time for patients undergoing one of these interventions.17
This double-blind randomized controlled effectiveness superiority trial was performed at 7 general hospitals in the Netherlands from January 2005 to October 2006. Three-hundred and twenty-eight patients aged 18 to 70 years who had persistent leg pain (>8 weeks) because of lumbar disc herniation were included in the study. One-hundred and sixty-seven subjects were randomized to receive tubular diskectomy and 161 were randomized to conventional microdiskectomy.
Although patients in both groups enjoyed improvement in all measures with surgery, this study suggests that outcomes were generally better for conventional microdiskectomy. Roland-Morris Disability Questionnaire (RDQ) scores for sciatica were not significantly different in aggregate over the entire 52-week follow-up period; however, at the 52-week time point, RDQ scores were lower for the conventional diskectomy group (3.4) than the tubular diskectomy group (4.7; P = 0.05). Over the entire 52-week follow-up period, VAS scores for leg pain were significantly better for the conventional diskectomy group (14.1 mm vs. 18.3 mm, P = 0.01), but at the 52-week prespecified time point, they were not different (11.6 mm vs. 16.0 mm, P = 0.08). Similarly, VAS score for back pain over the 52-week study period was better in the conventional microdiskectomy group (19.7 mm vs. 23.2 mm, P = 0.04) but not different at the 52-week time point (17.5 mm vs. 22.5 mm, P = 0.06). Finally, although more patients in the conventional microdiskectomy group perceived a good recovery at the final 52-week evaluation (79% vs. 69%, P = 0.05), Kaplan-Meier estimates for time until complete recovery yielded comparable results (2.1 weeks for conventional microdiskectomy vs. 2.0 weeks for tubular diskectomy). The authors note that intraoperative and postoperative complication rates were comparable between the 2 groups, as were length of stay, days to mobilization, and recurrent disc herniation rates.
Use of tubular diskectomy compared with conventional microdiskectomy did not result in a statistically significant improvement in the RDQ score. Tubular diskectomy resulted in less-favorable results for patient self-reported leg pain, back pain, and recovery. The authors concluded that there is no difference in RDQ scores in patients who received tubular diskectomy compared with conventional microdiskectomy.
Arts et al should be commended for performing a thoughtful and meticulous comparison between tubular diskectomy and conventional microdiskectomy. In essence, this is a comparison between 2 different surgical techniques that are designed to achieve the same technical result but that may have different clinical results because of theoretically less tissue destruction with the tubular diskectomy. The authors have tried to address the difficult issue of generalizability in surgical trials by having multiple centers and surgeons involved, therefore improving on the previous studies. We must acknowledge, based on learning curves and technical preference, that not all surgeons perform microdiskectomies or tubular diskectomies in the same way. Accordingly, the relative benefits of each technique will vary by surgeon. The authors themselves note that their multi-institution design introduced these disparities in practice into their data. For example, these authors used a microscope in nearly all of their tubular diskectomies (98%), but used one in only 27% of their conventional microdiskectomies, using loupes in 64% of them and no visual aid at all in the remaining 9%. They tended to remove more disc material in the conventional group than the tubular group, though the difference was not statistically significant (6.9 g vs. 6.1 g, P = 0.08). These facts all lend themselves to better generalizability of the study results.
This is a superiority trial; the power and sample size calculations were performed to find a difference. We cannot assume equivalence or noninferiority of 1 technique versus the other. Operative time for tubular diskectomy was significantly longer (47 minutes vs. 36 minutes, P < 0.001). They report a durotomy rate of 8.4% for tubular diskectomy and 4.4% for conventional microdiskectomy (P = 0.18). These are all interesting findings, but they may be influenced by surgeon familiarity, although all surgeons were apparently facile with both techniques.
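The distinction between "no significant difference" and "equivalence" can be illustrated with a confidence interval for the difference in durotomy rates. The sketch below is illustrative only: it assumes that the reported percentages (8.4% and 4.4%) apply to the full randomized groups (167 and 161 patients), and it uses a simple Wald interval, which is not necessarily the method used by Arts et al. The function name `wald_ci_diff` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci_diff(p1, n1, p2, n2, level=0.95):
    """Wald confidence interval for the difference of two proportions (p1 - p2)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


# Durotomy rates as reported; denominators are ASSUMED to be the randomized groups
diff, lo, hi = wald_ci_diff(p1=0.084, n1=167, p2=0.044, n2=161)

print(f"difference = {diff:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

Under these assumptions the interval includes zero, consistent with the nonsignificant P value, but it also extends to an excess of roughly 9 percentage points for the tubular technique. An interval this wide cannot rule out a clinically relevant difference, which is why a nonsignificant result in a superiority trial does not establish equivalence or noninferiority.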
The authors have done a great job in planning and executing this trial. They presented a detailed log of subject recruitment; their methods section was very well written; appropriate statistics were used throughout; and they clearly stated the limitations of the study.
Recommendation on Impact to Clinical Practice
The tubular diskectomy technique on its own, which presumably reduces the degree of muscle trauma by using a muscle-splitting exposure, does not seem to confer significant benefit in terms of functional disability, perceived recovery or back and leg pain at 1 year after surgery in the setting of a single-level, unilateral decompression. Although this study is well-performed, its conclusion should not be extrapolated to more complex pathology, including multilevel or bilateral disease. For the patient population included in this study, namely, sciatica secondary to 1-level disc protrusion, a strong recommendation to incorporate these findings into clinical practice should be made. Thus, the tubular technique is not superior and should not be used if costs, burden, or adverse events favor the conventional diskectomy technique. If these factors are neutral, then the clinical experience of the surgeon and, to some degree, patient preference should probably dictate the procedure chosen.
A prospective, cohort study comparing translaminar screw fixation with transforaminal lumbar interbody fusion and pedicle screw fixation for fusion of the degenerative lumbar spine.
Grob D, Bartanusz V, Jeszenszky D, et al. J Bone Joint Surg Br 2009;91:1347–53.
Posterior lumbar instrumentation methods have evolved over the last 40 years to better effect immobilization and enhance fusion rates when clinically indicated. The spine instrumentation industry has seen explosive growth over the last 20 years, creating a vast array of implantable hardware. It is essential that spine surgeons use this emerging technology judiciously, constantly referencing innovation against biomechanical and clinical objectives, balancing creativity with responsibility. Additionally, issues like cost-containment and comparative effectiveness further mandate that we constantly examine the clinical efficacy of novel strategies. Pedicle screw instrumentation has become the dominant method used for establishing posterior fixation, yet alternatives offer promise in terms of biomechanical equivalence and lower complication rates. Translaminar facet screw fixation was described by Montesano et al 18 in 1988 as an option for posterior lumbosacral instrumented fusion. This technique has been evaluated extensively in the literature and has been noted to have a lower neurologic complication rate and infection rate.19–22 Moreover, it has been demonstrated in vitro to be biomechanically equivalent to pedicle screw fixation in all planes except extension.21 Certainly, as minimally invasive spine surgery expands, techniques like translaminar facet screw fixation may be attractive alternatives to pedicle screw fixation. On the other hand, there are a number of limitations to translaminar facet fixation. For example, the indications for this intervention are more limited, both clinically and structurally. Also, this procedure has been associated with higher nonunion rates than pedicle screw fixation in longer-term studies.23 To further examine the relative merits of these 2 techniques, Grob et al have undertaken a thoughtful, prospective study comparing translaminar fixation with transforaminal lumbar interbody fusion and pedicle screw fixation.24
This prospective, observational cohort study was conducted at a spine center in Switzerland. A total of 120 subjects met the inclusion criteria: 57 in the translaminar (TS) group and 63 in the transforaminal lumbar interbody fusion (TLIF) group. Each surgeon used his or her preferred instrumentation technique. After 2 years, there was no difference between the 2 groups in the mean reduction in the Core Outcome Measures Index score, ratings of global outcome, or satisfaction with treatment. They noted a comparable reoperation rate at a mean 3.4 years after index surgery (17.5% TS vs. 12.7% TLIF, P = 0.62). This study showed trends for higher pseudarthrosis rates in the TS group (5.3% TS vs. 1.6% TLIF) and higher adjacent segment problems in the TLIF group (7% TS vs. 10% TLIF). The authors concluded that the 2 fusion techniques, TS and TLIF, were associated with almost identical patient-oriented outcomes.
Although the authors indicated that patients were equally distributed among the surgeons, the lead author performed 57 of 120 procedures (all TS) and the other 4 authors performed the remaining 63 (all TLIF). Once patients were enrolled, surgeons were allowed to choose the procedure with which they felt more comfortable; graft choice was also not standardized, allowing surgeons to choose graft as they deemed appropriate. Additionally, the patients in the TS group were significantly older (P < 0.001) and had significantly more comorbidities than the patients in the TLIF group (P < 0.001). These differences may introduce significant bias into the study and are readily used to explain differences in blood transfusion rates and hospital lengths of stay. Furthermore, the table comparing baseline characteristics of the 2 groups is lacking important prognostic variables, such as smoking, baseline outcome scores, leg or back pain predominant, and indications for surgery.
The most fundamental weakness with this study relates to the indications used for surgery. The authors themselves describe heterogeneity in the success of surgery depending on the primary indication, although they did not provide this subgroup analysis. In a separate study, this same group described that patients with disc height loss of >20% and patients without anterolisthesis fared better with TS.7 In this study, however, the authors broadened the inclusion criteria to include degenerative spondylolisthesis, potentially reducing their reported success rate.
Recommendation on Impact to Clinical Practice
In summary, although Grob et al have provided a thoughtful, prospective comparison between translaminar facet screw fixation and TLIF with pedicle screw fixation, the inherent limitations in this study significantly compromise the conclusions that they draw. Moreover, the power and sample size calculations were performed to find a difference; we cannot assume equivalence or noninferiority as we would be committing a type 2 error. A large sample size is needed to prove equivalence or noninferiority. Thus, no changes to current clinical practice are recommended based on the evidence provided by this study.
Does treatment (nonoperative and operative) improve the two-year quality of life in patients with adult symptomatic lumbar scoliosis: a prospective multicenter evidence-based medicine study.
Bridwell KH, Glassman S, Horton W, et al. Spine 2009;34:2171–8.
The optimal treatment of patients with adult lumbar scoliosis remains difficult to define. This group of spine patients has demonstrated significant disability but also a high risk for perioperative morbidity.25 Historically, studies have focused predominantly on radiographic outcomes. Although a number of retrospective studies have demonstrated a clinical impact of surgical treatment, a prospective comparison of operative and nonoperative treatment has not been previously reported.
The Adult Deformity Outcomes section of the Spinal Deformity Study Group reports the results of a multicenter prospective nonrandomized comparative study of patients 40 to 80 years old with symptomatic lumbar scoliosis who underwent operative or nonoperative treatment for their spinal disorder. Two-hundred and fifty-six patients were enrolled in the study; 2-year follow-up data were available for 52% of nonoperative patients and 96% of the operative group. Seventy-five patients underwent nonoperative management, and 85 had operative management. Each patient, in consultation with the surgeon, decided on his or her treatment. Patient-derived outcomes were assessed using standard questionnaires, including the Scoliosis Research Society quality of life instrument, the Oswestry Disability Index, and the numerical rating scale back and leg pain scores.
An overall analysis comparing all patients in both groups (“unmatched analysis”) was performed. In an effort to minimize the impact of selection bias in this nonrandomized study, propensity score matching was performed to adjust for baseline differences between the treatment groups (“matched analysis”). A clinically and statistically significant difference between the groups in the primary outcome, the Scoliosis Research Society subscore, was found at the 2-year follow-up. Patients who underwent surgery had, on average, a significant improvement in their clinical outcome measures whereas those patients who underwent nonoperative treatment did not. This was true for both the matched and unmatched analyses.
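To make the idea of propensity score matching concrete, the sketch below pairs operative and nonoperative patients whose estimated probabilities of receiving surgery are close. The article as summarized here does not describe the matching algorithm, so greedy 1:1 nearest-neighbor matching with a caliper is an assumption chosen for illustration; the patient identifiers, the scores, and the caliper of 0.1 are entirely hypothetical.

```python
def greedy_match(treated, controls, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `controls` map patient id -> propensity score.
    A pair is kept only if the two scores differ by at most `caliper`.
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores (estimated probability of choosing surgery)
operative = {"op1": 0.82, "op2": 0.52, "op3": 0.31}
nonoperative = {"non1": 0.30, "non2": 0.50, "non3": 0.60, "non4": 0.10}

pairs = greedy_match(operative, nonoperative)
print(pairs)
```

In this toy example one operative patient has no nonoperative counterpart within the caliper and is dropped, which shows how matching trades sample size for comparability. Matching of this kind balances only the covariates that enter the score and only among patients who remain in the analysis.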
Although a multicenter, prospective study was performed, the methodologic limitations are very significant. The fundamental weaknesses of this study relate to the characterization of the patient population. The inclusion criteria are broad and likely include both degenerative and adult scoliosis, 2 distinctly different deformities. The patient selection criteria are unclear, and there is no information about the proportion of eligible patients who were enrolled. Although the patients were “consecutively enrolled,” specific criteria for enrolment that can be applied in other settings are not described, and it is impossible to know how many patients were eligible but either were not offered enrolment or declined to enroll. Ninety percent of the enrolled patients were from 5 of the 15 participating centers. This means that between 7 and 10 patients were enrolled annually from those centers; the other centers enrolled, on average, 2 patients each during the entire study period. Without further information, it is impossible to know how this enrolled subgroup compares with the overall population of adults with symptomatic lumbar scoliosis.
A further limitation of the study, as the authors acknowledge, is that the majority of patients in the nonoperative arm (86 of 167) were lost to follow-up. Although the demographics and baseline variables of these patients were not significantly different from the patients who were retained in the study, that does not mean that the patients were similar in other ways, such as their response to treatment. Fifty percent follow-up is not acceptable in an elective patient population study.
For those nonoperative patients who remained in the study, the nature of nonoperative care is not defined beyond being actively or nonactively managed. The details of treatment, physical therapy, injections, and medications, are not reported. Clinically, nonoperative care is directed by the patients' symptoms (e.g., back pain, leg pain, and neurogenic claudication). To not define these parameters produces a heterogeneous cohort group for comparison with surgical management.
The fate of the patients in the nonoperative group remains uncertain, and interpretation of the data can vary widely. The fact that patients in the nonoperative group did not significantly worsen may suggest that adult degenerative scoliosis is not a progressive disease process. It may also be interpreted that nonoperative treatment can prevent disease progression. Lastly, the data could suggest that the patients lost to follow-up had more significant disease and, therefore, either crossed over or sought treatment elsewhere, or their symptoms resolved and they lost incentive for follow-up. Both the internal validity and the clinical applicability (external validity) of the study are, therefore, significantly limited.
Recommendation on Impact to Clinical Practice
It is not advisable to draw any conclusions about the relative effectiveness of operative and nonoperative treatment for adult lumbar scoliosis based on this study. Achieving a high degree of exchangeability of operative and nonoperative cohorts in comparative studies of benign, painful spinal conditions is extraordinarily difficult, whether the study is randomized or nonrandomized. Study design elements and statistical techniques, such as matching and propensity scores, can help to adjust for differences between groups but only for specified factors among patients who remain in the study and who have a similar diagnosis. No changes to current clinical practice can be recommended based on this article.
Degenerative spondylolisthesis: does fusion method influence outcome? Four-year results of the Spine Patient Outcomes Research Trial.
Abdu WA, Lurie JD, Spratt KF, et al. Spine 2009;34:2351–60.
The Spine Patient Outcomes Research Trial (SPORT) study has added invaluable evidence to the treatment of lumbar degenerative spondylolisthesis. Both short (2-year) and intermediate (4-year) outcomes have been reported, demonstrating a benefit to surgical management over nonsurgical treatment.26,27 The optimal surgical management of degenerative spondylolisthesis remains uncertain. Fusion techniques vary widely without validated measurements of the relative effectiveness of 1 technique over another. This SPORT subgroup analysis attempts to define the outcomes of 3 different fusion techniques and to provide a direct comparison of their effectiveness.
Abdu et al present an analysis of the degenerative spondylolisthesis arm of the SPORT. In that multicenter study, patients were either followed as part of a cohort study or, if they elected enrolment in the randomized arm, assigned to undergo operative or nonoperative treatment. The procedure performed for any operative patient was left to the surgeon's discretion. The authors' objective was to determine whether the procedure type had a significant effect on patient outcomes measured at regular intervals up to 4 years after surgery. Outcomes assessed included patient-oriented measures of health-related quality of life (Short Form-36 and Oswestry Disability Index), disease-specific measures (Stenosis and Back Pain Bothersome scales), and an assessment of likely fusion status made by the treating surgeon based on radiographs. Statistical adjustments were made for baseline differences between the treatment groups, and tests were performed to determine the presence of differences in outcome between procedures.
Of the 380 operative patients (combined observational and randomized cohorts) for whom there were records, 23 (6%) had a laminectomy alone, 80 (21%) underwent a laminectomy with posterolateral arthrodesis, 213 (56%) had a decompression with an instrumented (pedicle screw) arthrodesis, and 63 (17%) had a “360°” procedure, consisting of a decompression, posterior instrumented arthrodesis, and an interbody fusion performed through either a posterior (transforaminal or posterior lumbar interbody) or an anterior approach. At baseline, patients undergoing a 360° procedure were younger and were less likely to have osteoporosis, central stenosis, or severe stenosis.
Although there were some statistically significant differences in outcomes in the early (≤2-year) follow-up period, these differences were not maintained at later evaluations. There was some inconsistency in the nature of the outcome differences, with the 360° procedure demonstrating better Short Form-36 bodily pain and physical function scores at 2 years. The noninstrumented fusion cohort was not favored at any time point for any measure and demonstrated statistically significantly worse Oswestry Disability Index scores at 1 year after surgery. No statistically significant differences were found between groups at 3- or 4-year follow-up. All groups showed improvements from baseline that were maintained at final follow-up.
This study has several limitations. The study was not designed to test for differences in outcomes between procedures and, therefore, may not have sufficient power to detect a small difference with statistical significance. Although the overall study included a randomized clinical trial arm, patients were not randomized to receive specific procedures. The decision regarding specific surgical treatment was made by the physician and the patient based on disease-specific factors (location of stenosis and levels of involvement), patient-specific factors (age, bone quality, etc.), and physician experience. Furthermore, this study combined patients from both the observational and randomized arms. There is likely, therefore, to be selection bias in patient allocation such that factors associated with undergoing a particular procedure might also be associated, positively or negatively, with outcome. Confirmation of selection bias is found in the baseline differences between the treatment cohorts noted above (age, extent of stenosis, and location of stenosis). This selection bias limits our ability to determine the true efficacy of each fusion technique; however, the variables that influence surgeon and patient preference for a fusion technique are numerous, and, perhaps, a cohort design is the best method to determine effectiveness and enhance generalizability. Longer follow-up may demonstrate differences in outcomes. As noted by the authors, Kornblum et al 28 reported a detrimental effect of lumbar pseudarthrosis on patient outcomes, albeit in a smaller study. The impact of pseudarthrosis may be seen when the SPORT studies report 5- to 10-year outcomes.
Lastly, the determination of fusion status is another limitation of the study. As acknowledged by the authors, surgeons documented the fusion status of their own patients in an unblinded, unverified manner using plain radiographs with no consistent fusion criteria. Although a significant limitation, this approach again mimics routine clinical care in evaluating patients; only with specific clinical and radiographic findings are more diagnostically accurate methods of determining fusion status undertaken. This shortcoming may, therefore, limit the internal validity of the study more than its generalizability.
Recommendation on Impact to Clinical Practice
The study represents the largest cohort and best available evidence to date of patients with lumbar degenerative spondylolisthesis. It does effectively address its first question: all 3 surgical treatments resulted in sustained clinical improvements over pretreatment health status. The study does not adequately address the second question: “Do different fusion approaches result in different patterns of change across the follow-up interval?” The study highlights the need for further prospective, comparative, and cost-effectiveness study of different fusion techniques in the management of lumbar degenerative spondylolisthesis. Based on the data in this study, all 3 arthrodesis techniques have similar beneficial effects on clinical outcomes for patients with lumbar spinal stenosis and degenerative spondylolisthesis. For patients with degenerative spondylolisthesis, a strong recommendation to incorporate these findings into clinical practice should be made. That is, any of the 3 fusion techniques can be used for the treatment of this disease, and the choice of technique should be based on surgeon expertise and patient preference.