Clinical Scenarios for Which Spinal Mobilization and Manipulation Are Considered by an Expert Panel to be Inappropriate (and Appropriate) for Patients With Chronic Low Back Pain

Herman, Patricia M., ND, PhD*; Hurwitz, Eric L., DC, PhD; Shekelle, Paul G., MD, PhD*; Whitley, Margaret D., MPH*; Coulter, Ian D., PhD*

doi: 10.1097/MLR.0000000000001108
Original Articles

Background: Spinal mobilization and manipulation are 2 therapies found to be generally safe and effective for chronic low back pain (CLBP). However, the question remains whether they are appropriate for all CLBP patients.

Research Design: An expert panel used a well-validated approach, including an evidence synthesis and clinical acumen, to develop and then rate the appropriateness of the use of spinal mobilization and manipulation across an exhaustive list of clinical scenarios which could present for CLBP. Decision tree analysis (DTA) was used to identify the key patient characteristics that affected the ratings.

Results: Nine hundred clinical scenarios were defined and then rated by a 9-member expert panel as to the appropriateness of spinal mobilization and manipulation. Across clinical scenarios, more were rated appropriate than inappropriate. However, the number of patients presenting with each scenario is not yet known. Nevertheless, DTA indicates that all clinical scenarios that included major neurological findings, and some others involving imaging findings of central herniated nucleus pulposus, spinal stenosis, or free fragments, were rated as inappropriate for both spinal mobilization and manipulation. DTA also identified the absence of these imaging findings and no previous laminectomy as the most important patient characteristics in predicting ratings of appropriate.

Conclusions: A well-validated expert panel-based approach was used to develop and then rate the appropriateness of the use of spinal mobilization and manipulation across the clinical scenarios which could present for CLBP. Information on the clinical scenarios for which these therapies are inappropriate should be added to clinical guidelines for CLBP.

*RAND Corporation, Santa Monica, CA

Office of Public Health Studies, University of Hawaii, Honolulu, HI

This work has been funded by a cooperative agreement (U19) from the National Center for Complementary and Integrative Health (NCCIH) (Grant No. 1U19AT007912-01).

The authors declare no conflict of interest.

Reprints: Patricia M. Herman, ND, PhD, RAND Corporation, 1776 Main Street, P.O. Box 2138, Santa Monica, CA 90403. E-mail:

The ultimate goal of all medical research is to ensure that patients receive care that is appropriate, or “suitable or proper in the circumstances.”1 Appropriate care has been defined as: “Health care in which the expected clinical benefits (eg, improved symptoms) of care outweigh the expected negative effects (eg, adverse drug effects) to such an extent that the treatment is justified.”2 It is estimated that 20% of health care costs are wasteful, that is, going to inappropriate or useless care.3 Although most people would agree that all patients should get appropriate care, the challenges are defining appropriateness, determining which therapies are appropriate, and ensuring delivery of this care.

In response to wide variations in clinical practice patterns, the RAND Corporation and the University of California, Los Angeles, (UCLA) pioneered a method to study the appropriateness of care.4–8 This RAND/UCLA Appropriateness Method (RUAM) takes advantage of the available evidence base but also draws on the clinical acumen and experience of practitioners. The RUAM has been the most widely used and studied method for defining and identifying appropriate care. The estimates of appropriateness generated by the RUAM have been found to be reliable with test-retest reliability >0.9 using the same panelists 6-8 months later.9 Moreover, the results across several panels with similar discipline compositions for the same procedure are reproducible with kappa statistics (0.5–0.7) similar to those of some common diagnostic tests.10,11 The RUAM estimates have also been found to be valid. Panelists’ ratings of appropriateness were consistent with the literature, and follow a logical clinical rationale.9 When sets of appropriateness results from the RUAM and another approach were applied to a sample of patients who received a procedure, the RUAM clinical scenarios accounted for all patients in the sample, whereas the other approach only addressed 70% of patients and mainly missed patients for which the procedure was inappropriate.12 Of importance, the RUAM results have favorable predictive ability—that is, patients treated in accordance with the criteria have better outcomes than those who receive another or no treatment. 
Favorable predictive ability has been found for coronary angiography,13,14 carotid endarterectomy,15 and coronary revascularization.16,17 It has also been found that later clinical trials targeting specific patient types have validated panelists’ ratings made before that (or much other) evidence existed.15 Finally, the sensitivity and specificity of the RUAM for identifying inappropriate overuse and underuse of health care have been estimated at 68%–99% and 94%–97%, respectively.18

Chronic low back pain (CLBP) is the most common type of chronic pain,19,20 and is costly to the health care system and employers.21–25 According to numerous systematic reviews and meta-analyses,26–28 a number of nonpharmacologic therapies have been found effective for chronic back pain and are now included in guidelines.29–31 Spinal mobilization and manipulation, 2 of the recommended therapies,30–33 are most commonly delivered by chiropractors, osteopaths, and physical therapists.34 In the United States 30%–50% of those with spinal pain have seen a chiropractor,23,24,35 and 15%–34% have used physical therapy.23,24 Therefore, it is reasonable to believe that a large number of those with CLBP are receiving spinal mobilization and manipulation. Although systematic reviews have shown these therapies to be generally safe and effective, the question remains whether they are appropriate for all patients with CLBP.

This study used the RUAM to determine the appropriateness of spinal mobilization and manipulation for different types of patients with CLBP, and the key patient characteristics associated with appropriateness. A separate paper will apply these results to determine the prevalence of appropriate and inappropriate care in chiropractic practice.

Methods

The RUAM6,36,37 used a modified Delphi panel of clinical and content experts, and their knowledge and clinical acumen, to translate the available evidence on a therapy into ratings of appropriate, inappropriate, or equivocal for each patient type (clinical scenario) considered. Nine panelists were chosen based on their clinical expertise across different specialties and disciplines, and diversity of geographic location. Our panel included: 1 orthopedist, 1 osteopath, 1 internist, 2 chiropractors, 1 physical therapist, 1 radiologist, and 2 health services researchers. Each received a $1000 honorarium for their participation.

Panelists were first presented with the latest evidence on the effectiveness and safety of each therapy in terms of a detailed systematic review,38 and then asked to rate, using a 1–9 scale, the extent to which the benefits of the therapy outweigh its risks for each clinical scenario. Ratings of 7–9 (ie, the therapy is appropriate) were to be given if: “The expected health benefit (eg, increased life expectancy, relief of pain, reduction in anxiety, improved functional capacity) exceeds the expected negative consequences (eg, mortality, morbidity, anxiety, pain, work time lost) by a sufficiently wide margin that the procedure is worth doing, exclusive of cost.”5 The instructions given to panelists and definitions of terms used in the rating process are found in a detailed, publicly available RAND report.39

The clinical scenarios (patient types) to rate were organized into sections for ease of rating.39 Once one (the first) clinical scenario in a section was rated, the others only differed by 1 or 2 patient characteristics and could be evaluated quickly. The project staff compiled the list of clinical scenarios using the literature review, clinical expert advice, and the clinical scenario list used for an earlier study on spinal manipulation for acute low back pain.40,41 These scenarios categorized patients in terms of their history, symptoms, physical and radiographic findings, and response to prior treatment. The list of clinical scenarios to rate needed to be comprehensive enough to capture all types of patients with CLBP, detailed enough that the procedure would be equally appropriate or inappropriate for all patients in a scenario, and manageable so that all scenarios could be rated within a reasonable amount of time. Scenarios deemed implausible were dropped. On average, once used to the process, a panelist can rate about 150–200 indications per hour.6 The RUAM was applied to 900 clinical scenarios for spinal mobilization and 900 for spinal manipulation. In each case, 450 clinical scenarios were rated under 2 conditions: (1) there has been no other adequate conservative care for this episode or (2) nonmanipulative conservative care for this episode had failed.

For this study, we defined manipulation of the low back as a controlled, judiciously applied dynamic thrust (adjustment), which could include extension and rotation of the lumbar region, of high or low velocity and low-amplitude force directed to spinal joint segments within patient tolerance. Mobilization of the low back was defined as a controlled, judiciously applied force of low velocity and variable amplitude directed to spinal joint segments. Mobilization procedures usually do not take joints beyond the passive range of motion and do not result in joint cavitation.

Panelists rated each clinical scenario twice: (1) first individually at home; and (2) then during a one day face-to-face meeting and after discussion with the other panelists. At the beginning of the face-to-face meeting each panelist was given a personalized printout showing their at-home ratings and the distribution, but not the identities, of all other panelists’ ratings. The home and face-to-face rating sessions occurred in February and March of 2015, respectively.

Panelists were asked to make their ratings using their own best clinical judgement and content knowledge (rather than their perceptions of what other experts might say) and considering an average patient currently presenting to an average North American practitioner who performs this procedure in an average care-providing facility. Consensus is reported, but not required. More detail on the RUAM is found in the RUAM manual.6

Statistical Analyses

The 1–9 appropriateness ratings given by each panelist after the face-to-face meeting were analyzed to generate 1 of 3 overall ratings for spinal mobilization and manipulation for each clinical scenario: appropriate, equivocal, or inappropriate. The first analysis determined whether there was disagreement across the panelists’ appropriateness ratings for any clinical scenario. For a classic 9-member panel, agreement was defined as having at least 7 of the ratings in any 3-point region of the scale, and disagreement was defined as having at least 3 panelists’ ratings in the 1–3 range and at least 3 in the 7–9 range. If there was no disagreement and the median value of the ratings across the panel was 1–3, then the therapy was rated as inappropriate for that clinical scenario. If there was no disagreement and the median value of the ratings was 7–9, the therapy was rated as appropriate. The appropriateness of a therapy for a clinical scenario was rated as equivocal if: (1) most panelists gave a rating of 4, 5, or 6 (ie, most believed that benefits generally equaled risks); (2) panelists gave widely polarized ratings (ie, there was disagreement); or (3) panelists’ ratings were scattered across the scale (ie, there was substantial uncertainty as to appropriateness) and the median value was in the 4–6 range. The last 2 of these identify potential targets for future research.
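The rating rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Excel and Java code used in the study. The function names are hypothetical, and reading "any 3-point region" as any 3 consecutive scale points is an assumption.

```python
from statistics import median


def has_disagreement(ratings):
    """Disagreement: at least 3 ratings in 1-3 and at least 3 in 7-9."""
    low = sum(1 for r in ratings if 1 <= r <= 3)
    high = sum(1 for r in ratings if 7 <= r <= 9)
    return low >= 3 and high >= 3


def has_agreement(ratings):
    """Agreement: at least 7 ratings inside one 3-point region.

    Assumption: a region is any 3 consecutive points (1-3, 2-4, ..., 7-9).
    """
    return any(
        sum(1 for r in ratings if start <= r <= start + 2) >= 7
        for start in range(1, 8)
    )


def classify_scenario(ratings):
    """Return the overall rating for one scenario from a 9-member panel."""
    if len(ratings) != 9 or any(r not in range(1, 10) for r in ratings):
        raise ValueError("expected 9 integer ratings on the 1-9 scale")
    if has_disagreement(ratings):
        return "equivocal"  # polarized panel
    mid = median(ratings)
    if mid <= 3:
        return "inappropriate"
    if mid >= 7:
        return "appropriate"
    return "equivocal"  # median in the 4-6 range
```

Agreement is reported separately from the overall rating in this sketch, which mirrors the statement that consensus is reported but not required.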

The amount of agreement and disagreement, the dispersion of the ratings measured by the mean absolute deviation from the median, and the proportions of clinical scenarios rated as appropriate, equivocal and inappropriate were compared between the at-home and in-meeting ratings. Calculations of agreement and appropriateness were conducted using Microsoft Excel and Java.
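As a rough illustration of the dispersion measure named above, the mean absolute deviation from the median can be computed as follows. The function name and example ratings are hypothetical and are not taken from the study data.

```python
from statistics import median


def mean_abs_deviation_from_median(ratings):
    """Average absolute distance of each panelist's rating from the panel median."""
    mid = median(ratings)
    return sum(abs(r - mid) for r in ratings) / len(ratings)


# Hypothetical panels: one tightly clustered, one spread across the scale.
clustered = [7, 7, 7, 8, 8, 8, 9, 9, 9]
scattered = [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A smaller value indicates that panelists' ratings sit closer together, which is the direction of change reported between the at-home and face-to-face rounds.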

We used decision tree analysis (DTA) to see if simplified rules could be identified with regard to the elements (patient characteristics) of the clinical scenarios that predict the appropriateness of spinal mobilization and spinal manipulation.42 DTA looks for the smallest number of patient characteristics or combinations of characteristics that can provide an accurate prediction of appropriate or inappropriate ratings. These simplified rules can provide information that is not always obvious from individual ratings across hundreds of clinical scenarios. We identified the set of patient characteristics (Appendix, Supplemental Digital Content 1) that variously made up the clinical scenarios, defined each scenario as the presence or absence (or for some such as pain, a particular level) of each characteristic, and included all as predictor variables in the DTA.

Twenty-six clinical scenarios (Chapter 11 of the RAND report39) were not included in the DTA because they were each made up of single patient characteristics not included in any other scenario. These single characteristics (eg, grade IV spondylolisthesis) were each included in a scenario “that would otherwise be rated as appropriate,” and their ratings are described separately. As the clinical scenarios did not always mention all patient characteristics, we assumed that if a characteristic was not mentioned it was absent in that scenario. When predicting a rating of inappropriate (appropriate) we compared clinical scenarios with that rating to those without that rating (ie, to those with ratings of either appropriate [inappropriate] or equivocal).

The DTA was conducted using the C4.5 algorithm43 in the R statistical package. Tree branches are formed based on the characteristic that provides the most information gain at each step, and the algorithm ends by going back to prune branches that are no longer useful.
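To illustrate what "most information gain" means at a single split, the following Python sketch computes entropy-based information gain for 2 binary patient characteristics. It is not the R code used in the study: the toy data, variable names, and function names are hypothetical, and it omits the gain-ratio normalization and pruning that the published C4.5 algorithm also applies.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on one characteristic."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: 8 scenarios, rated inappropriate (I) or other (O).
ratings = ["I", "I", "I", "O", "O", "O", "O", "O"]
major_neuro = [True, True, True, False, False, False, False, False]
imaging_finding = [True, False, True, False, True, False, True, False]

gain_neuro = information_gain(major_neuro, ratings)
gain_imaging = information_gain(imaging_finding, ratings)
```

In this toy example the first characteristic separates the ratings perfectly, so its gain equals the entropy of the ratings, and a tree builder would choose it for the first split.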

The project was reviewed and determined to be exempt by the RAND Human Subjects Protection Committee.

Results

The panelists reported varying times for the ratings performed at home, but 3 hours was roughly the norm. Table 1 compares the initial at-home ratings to the final face-to-face ratings in terms of the ratings given, their dispersion (mean absolute deviation) across panelists, and the number of clinical scenarios where there was agreement, a spread of ratings (uncertain), and clear disagreement. Ratings and agreement increased, and dispersion and disagreement decreased between the 2 sets of ratings. For both sets and therapies, appropriateness ratings were significantly higher when nonmanipulative conservative care for this episode had failed than when it had not been tried (paired t tests P<0.01).



Table 2 gives the number of clinical scenarios rated as appropriate, inappropriate or equivocal for the final face-to-face ratings. More clinical scenarios were rated appropriate than inappropriate. However, about two-thirds of clinical scenarios were rated equivocal, and most of these cases were a result of a spread of ratings (uncertain) with a median rating in the 4–6 range. Details on the ratings given to each clinical scenario are found in the published RAND report.39



Figures 1 and 2 show the results of the DTAs identifying the patient characteristics that best predict a clinical scenario being rated as inappropriate (vs. appropriate or equivocal) for mobilization and manipulation, respectively. As can be seen, the decision trees for each therapy are very similar. They only differ by the addition of a question about the presence of minor neurological findings (ie, at least one of the following: asymmetrically decreased reflexes in lower extremity; documented dermatomal or peripheral nerve sensory changes which may include deficit, paresthesia, and hyperesthesia; nonprogressive unilateral muscle weakness and/or paresthesia that follows a radicular pattern) in the mobilization flowchart.





For both therapies the presence of major neurological findings (ie, at least one of the following: neurological signs of lumbar myelopathy; progressive unilateral muscle weakness and/or motor loss documented by repeat examination over time; sensory deficits other than related to dermatomes or peripheral nerves; and/or electrodiagnostic findings of acute and/or progressive radiculopathy) was the best predictor of a clinical scenario being rated inappropriate. Some clinical scenarios without major neurological findings but with imaging findings of central herniated nucleus pulposus, spinal stenosis, or free fragments, and no physical findings of joint dysfunction—that is, no clear indication for manipulation or mobilization—were also rated as inappropriate. These cases occurred for mobilization when the patient also had a laminectomy and minor neurological findings; or no laminectomy, but biomechanical or psychosocial stress and did not have a favorable response to prior manipulation or mobilization. The cases where manipulation was rated inappropriate were similar with the exception that minor neurological findings were not required.

Some of the patient characteristics that helped define the clinical scenarios (Appendix, Supplemental Digital Content 1; eg, sciatic nerve irritation, whether spine radiographs were performed, current pain, previous conservative care) were not important in terms of a scenario being rated as inappropriate in the DTA. The DTA was also fairly accurate. Only 7 clinical scenarios of 874 (900−26) for mobilization were misclassified as appropriate or equivocal when they were actually rated as inappropriate, and only one clinical scenario was misclassified as inappropriate when it was actually rated as appropriate or equivocal; an overall error rate of 0.9%. The same numbers for manipulation were 6, 3, and 1.0%. The main differences between the DTA and actual ratings seem to relate to whether there were physical findings of joint dysfunction.

We also performed DTA predicting a rating of appropriate, versus inappropriate or equivocal, for both therapies. It turns out to be more complex to predict appropriateness than to predict inappropriateness as the decision trees were less accurate and they required inclusion of all patient characteristics. Seventeen clinical scenarios of 874 for mobilization were misclassified as appropriate when they were actually rated as inappropriate or equivocal, and 13 clinical scenarios were misclassified as inappropriate or equivocal when they were actually rated as appropriate; a 3.4% error rate. The same numbers for manipulation were 26, 9, and 4.0%. In any case, it is likely more important for patient safety that inappropriate care be identified.
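The error rates reported for both predictions follow directly from the misclassification counts and the 874 scenarios included in the DTA. The short Python check below reproduces that arithmetic; the function and variable names are hypothetical.

```python
def error_rate_percent(missed, false_alarms, n_scenarios):
    """Percent of scenarios the decision tree placed in the wrong class."""
    return 100 * (missed + false_alarms) / n_scenarios


N_DTA = 900 - 26  # scenarios included in the DTA

# Predicting a rating of inappropriate
inappropriate_mobilization = error_rate_percent(7, 1, N_DTA)
inappropriate_manipulation = error_rate_percent(6, 3, N_DTA)

# Predicting a rating of appropriate
appropriate_mobilization = error_rate_percent(17, 13, N_DTA)
appropriate_manipulation = error_rate_percent(26, 9, N_DTA)
```

Rounded to 1 decimal place, these values match the reported 0.9%, 1.0%, 3.4%, and 4.0%.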

Table 3 shows the percent of clinical scenarios that each patient characteristic helped classify and the direction of its influence for both therapies and both predictions. As can be seen, the presence of major neurological findings, as the first split in the tree, was involved in the accurate prediction of an inappropriate rating for all (100%) of clinical scenarios. Findings on imaging provided information on inappropriateness for 92.6% of scenarios—that is, all those without major neurological findings that made it past the first filter. The patient characteristics most useful to the accurate prediction of an appropriate rating were no or minor findings on imaging (100%), followed by previous laminectomy (associated with a lower likelihood of appropriateness). The complexity of predicting appropriateness is illustrated by the spread and substantial size of the percentages across all patient characteristics for those analyses.



As mentioned above, 26 clinical scenarios were excluded from the DTA because they were each rated as being added to an unspecified scenario “that would otherwise be rated as appropriate.” Both therapies were rated inappropriate for 3 types of patients (those with possible abdominal aortic aneurysms suspected by physical examination, with definite abdominal aortic aneurysm by history or imaging, or with radiographic contraindications to spinal mobilization or manipulation) and manipulation was rated as inappropriate for patients with grade IV spondylolisthesis. Both therapies were rated as appropriate for grades I and II spondylolisthesis. All other grades of spondylolisthesis, clotting disorders with various results with regard to prothrombin time, and possible abdominal aortic aneurysm with vascular calcifications on radiography, but not suspected by physical examination, were all rated equivocal.

Discussion

This study applied the internationally recognized and well-validated RUAM to obtain expert panel ratings of the appropriateness of spinal mobilization and manipulation for CLBP. Nine hundred clinical scenarios were developed to cover the full range of patients presenting with CLBP. Of these 178 were rated as appropriate, 98 were rated as inappropriate, and 624 were rated as equivocal (agreement that benefits roughly equal risks, or uncertainty as to whether benefits are greater or less than risks but with a median rating of their being roughly equal) for spinal mobilization. The numbers for spinal manipulation were 173 as appropriate, 106 as inappropriate, and 621 as equivocal. Between the initial (at-home) and final (face-to-face) ratings agreement and the number rated appropriate increased, and the dispersion of ratings and disagreement decreased. DTAs indicated that the main contributor to a rating of inappropriate for both therapies was the presence of major neurological findings.
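As a quick consistency check on the counts above, the following Python sketch confirms that each therapy's ratings sum to 900 and converts them to percentages. The dictionary layout and function name are hypothetical.

```python
counts = {
    "mobilization": {"appropriate": 178, "inappropriate": 98, "equivocal": 624},
    "manipulation": {"appropriate": 173, "inappropriate": 106, "equivocal": 621},
}


def shares_percent(rating_counts):
    """Percent of scenarios in each rating category, to 1 decimal place."""
    total = sum(rating_counts.values())
    return {k: round(100 * v / total, 1) for k, v in rating_counts.items()}


mobilization_shares = shares_percent(counts["mobilization"])
manipulation_shares = shares_percent(counts["manipulation"])
```

For both therapies the equivocal share is a little over two-thirds of scenarios, and appropriate ratings outnumber inappropriate ones.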

Over half of the clinical scenarios received a rating of equivocal because of a lack of agreement in ratings across panelists and a median rating in the 4–6 range (benefits roughly equal to risks). These ratings may indicate that these scenarios require further research.

One feature of the RUAM is that panelists rate a comprehensive array of potential clinical presentations (scenarios) for the procedure of interest. This is performed to meet one goal of the RUAM: to enable classification of all possible patients. Studies have shown that the RUAM was far better than informal methods at including descriptions of all patients and especially those for whom the procedure was inappropriate,12 and that panel disagreement was not concentrated on rarely seen scenarios.44 Nevertheless, there is some evidence that panel reproducibility is higher for clinical scenarios regularly seen in practice.10

Along these lines, note that the numbers of clinical scenarios for which a therapy is rated appropriate, equivocal, and inappropriate provide no indication of the actual number of patients affected. A future article from this study will present results from the examination of the medical records of a representative sample of chiropractic patients with CLBP to determine the proportion of patients that present with each of these scenarios and the proportion of patients receiving appropriate and inappropriate chiropractic care. Our constructed scenarios are intended to provide a category for each type of patient with CLBP and do not relate to any data about the frequency with which these scenarios might be encountered in practice. Nevertheless, information on the clinical scenarios rated inappropriate is still useful for guideline development.

This study had the advantages of using an internationally recognized and well-validated method to translate available evidence and expert clinical acumen into ratings of the appropriateness of 2 therapies for an exhaustive list of 900 clinical scenarios which could present as CLBP. The approach, however, is not without its limitations. Panelists were presented with a full synthesis of all available evidence on the safety and effectiveness/efficacy of spinal mobilization and manipulation for CLBP. However, the available evidence has gaps, including the fact that clinical trials only include a subset of patients with CLBP and the analyses of trial data do not often present results by distinct clinical scenarios. Therefore, panelists’ clinical acumen was essential to the process and not without its own biases. Panelists were also asked to rate the appropriateness of 900 clinical scenarios for each therapy, and it is difficult to perform so many ratings without error. However, the ability of the DTA to predict so accurately was one indication of at least the internal consistency of the ratings. Finally, our results relate to appropriateness but not to medical necessity, which would have required another rating step in the RUAM.

A well-validated expert panel-based approach was used to develop and then rate the appropriateness of the use of spinal mobilization and manipulation across an exhaustive list of clinical scenarios which could present for CLBP. For both therapies, more clinical scenarios were rated appropriate than inappropriate, but the majority were rated equivocal either due to agreement that the benefits of the therapy are roughly equivalent to its risks, or due to a range of ratings whose median lies in the range of rough equivalence. If the scenarios with a range of ratings turn out to represent a substantial number of patients seen, they could be a fruitful target for future research. Nonetheless, all clinical scenarios that included major neurological findings, and some others involving imaging findings of central herniated nucleus pulposus, spinal stenosis, or free fragments, were rated as inappropriate for both spinal mobilization and manipulation. This information should be added to clinical guidelines recommending these therapies for patients with CLBP.

Acknowledgments

The authors would like to acknowledge and thank the 9 members of the panel for spinal manipulation and mobilization for chronic low back pain. The authors also acknowledge the contributions of Katharina Best and Seifu Chonde who ran the DTA.

References

1. Definition of Appropriate in US English Oxford Dictionaries. Oxford, UK: Oxford University Press; 2018.
2. Segen’s Medical Dictionary. Appropriate care. 2011. Accessed February 22, 2018.
3. Berwick DM, Hackbarth AD. Eliminating waste in US health care. JAMA. 2012;307:1513–1516.
4. Brook R. Appropriateness: the next frontier. Br Med J. 1994;308:218–219.
5. Brook RH, Chassin MR, Fink A, et al. A method for the detailed assessment of the appropriateness of medical technologies. Int J Technol Assess Health Care. 1986;2:53–63.
6. Fitch K, Bernstein SJ, Aguilar MD, et al. RAND/UCLA Appropriateness Method User’s Manual. Santa Monica, CA: RAND Corporation; 2001.
7. McClellan M, Brook RH. Appropriateness of care: a comparison of global and outcome methods to set standards. Med Care. 1992;30:565–586.
8. Shekelle P. The appropriateness method. Med Decis Making. 2004;24:228–231.
9. Merrick NJ, Fink A, Park RE, et al. Derivation of clinical indications for carotid endarterectomy by an expert panel. Am J Public Health. 1987;77:187–190.
10. Shekelle PG, Kahan JP, Bernstein SJ, et al. The reproducibility of a method to identify the overuse and underuse of medical procedures. New Engl J Med. 1998;338:1888–1895.
11. Tobacman JK, Scott IU, Cyphert S, et al. Reproducibility of measures of overuse of cataract surgery by three physician panels. Med Care. 1999;37:937–945.
12. Kahn KL, Park RE, Vennes J, et al. Assigning appropriateness ratings for diagnostic upper gastrointestinal endoscopy using two different approaches. Med Care. 1992;30:1016–1028.
13. Selby JV, Fireman BH, Lundstrom RJ, et al. Variation among hospitals in coronary-angiography practices and outcomes after myocardial infarction in a large health maintenance organization. New Engl J Med. 1996;335:1888–1896.
14. Normand S-LT, Landrum MB, Guadagnoli E, et al. Validating recommendations for coronary angiography following acute myocardial infarction in the elderly: a matched analysis using propensity scores. J Clin Epidemiol. 2001;54:387–398.
15. Shekelle PG, Chassin MR, Park RE. Assessing the predictive validity of the RAND/UCLA appropriateness method criteria for performing carotid endarterectomy. Int J Technol Assess Health Care. 1998;14:707–727.
16. Kravitz RL, Laouri M, Kahan JP, et al. Validity of criteria used for detecting underuse of coronary revascularization. JAMA. 1995;274:632–638.
17. Hemingway H, Crook AM, Feder G, et al. Underuse of coronary revascularization procedures in patients considered appropriate candidates for revascularization. New Engl J Med. 2001;344:645–654.
18. Shekelle PG, Park RE, Kahan JP, et al. Sensitivity and specificity of the RAND/UCLA Appropriateness Method to identify the overuse and underuse of coronary revascularization and hysterectomy. J Clin Epidemiol. 2001;54:1004–1010.
19. Johannes CB, Le TK, Zhou X, et al. The prevalence of chronic pain in United States adults: results of an internet-based survey. J Pain. 2010;11:1230–1239.
20. Institute of Medicine. Relieving Pain in America: A Blueprint for Transforming Prevention, Care, Education, and Research. Washington, DC: The National Academies Press; 2011.
21. Davis MA. Where the United States spends its spine dollars: expenditures on different ambulatory services for the management of back and neck conditions. Spine. 2012;37:1693–1701.
22. Gaskin DJ, Richard P. The economic costs of pain in the United States. J Pain. 2012;13:715–724.
23. Gore M, Sadosky A, Stacey BR, et al. The burden of chronic low back pain: clinical comorbidities, treatment patterns, and health care costs in usual care settings. Spine. 2012;37:E668–E677.
24. Ivanova JI, Birnbaum HG, Schiller M, et al. Real-world practice patterns, health-care utilization, and costs in patients with low back pain: the long road to guideline-concordant care. Spine J. 2011;11:622–632.
25. Smith M, Davis MA, Stano M, et al. Aging baby boomers and the rising cost of chronic back pain: secular trend analysis of longitudinal Medical Expenditures Panel Survey data for years 2000 to 2007. J Manipulative Physiol Ther. 2013;36:2–11.
26. Chou R, Deyo R, Friedly J, et al. Nonpharmacologic therapies for low back pain: a systematic review for an American College of Physicians Clinical Practice Guideline. Ann Intern Med. 2017;166:493–505.
27. Chou R, Atlas SJ, Stanos SP, et al. Nonsurgical interventional therapies for low back pain: a review of the evidence for an American Pain Society Clinical Practice Guideline [Review]. Spine. 2009;34:1066–1077; 1078–1093.
28. Agency for Healthcare Research and Quality. Noninvasive Nonpharmacological Treatment for Chronic Pain: A Systematic Review Effective Health Care Program. Rockville, MD: Agency for Healthcare Research and Quality; 2018.
29. Brosseau L, Wells GA, Poitras S, et al. Ottawa Panel evidence-based clinical practice guidelines on therapeutic massage for low back pain. J Bodywork Movement Ther. 2012;16:424–455.
30. The Diagnosis and Treatment of Low Back Pain Work Group. VA/DoD Clinical Practice Guideline for Diagnosis and Treatment of Low Back Pain, Version 2.0. Washington, DC: The Office of Quality, Safety and Value, VA, & Office of Evidence Based Practice, US Army Medical Command; 2017.
31. Qaseem A, Wilt TJ, McLean RM, et al. Noninvasive treatments for acute, subacute, and chronic low back pain: a clinical practice guideline from the American College of Physicians. Ann Intern Med. 2017;166:514–530.
32. Chou R, Qaseem A, Snow V, et al. Diagnosis and treatment of low back pain: a joint clinical practice guideline from the American College of Physicians and the American Pain Society. Ann Intern Med. 2007;147:478–491.
33. Wenger HC, Cifu AS. Treatment of low back pain. JAMA. 2017;318:743–744.
34. Hurwitz EL. Epidemiology: spinal manipulation utilization. J Electromyography Kinesiol. 2012;22:648–654.
35. Martin BI, Gerkovich MM, Deyo RA, et al. The association of complementary and alternative medicine use and health care expenditures for back and neck problems. Med Care. 2012;50:1029–1036.
36. Brook RH. Assessing the appropriateness of care—its time has come. JAMA. 2009;302:997–998.
37. Nair R, Aggarwal R, Khanna D. Methods of formal consensus in classification/diagnostic criteria and guideline development. Semin Arthritis Rheum. 2011;41:95–105.
38. Coulter ID, Crawford C, Hurwitz EL, et al. Manipulation and mobilization for treating chronic low back pain: a systematic review and meta-analysis. Spine J. 2018;18:866–879.
39. Coulter ID, Whitley MD, Hurwitz EL, et al. Determining the Appropriateness of Spinal Manipulation and Mobilization for Chronic Low Back Pain Indications and Ratings by a Multidisciplinary Expert Panel. Santa Monica, CA: RAND Corporation; 2018.
40. Shekelle PG, Adams AH, Chassin MR, et al. The Appropriateness of Spinal Manipulation for Low-Back Pain: Indications and Ratings by an All-Chiropractic Expert Panel. Santa Monica, CA: RAND Corporation; 1992.
41. Shekelle PG, Adams AH, Chassin MR, et al. The Appropriateness of Spinal Manipulation for Low-Back Pain: Project Overview and Literature Review. Santa Monica, CA: RAND Corporation; 1991.
42. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer; 2016.
43. Quinlan JR. C4.5: Programs for Machine Learning. Burlington, MA: Morgan Kaufmann Publishers; 1993.
44. Park RE, Fink A, Brook RH, et al. Physician ratings of appropriate indications for three procedures: theoretical indications vs indications used in practice. Am J Public Health. 1989;79:445–447.

Key Words: chronic low back pain; appropriateness of care; RAND/UCLA Appropriateness Method; decision tree analysis; spinal mobilization and manipulation

Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved.