For the PROBAST “predictors” domain, the largest proportion of study samples was rated as having a low risk of bias (40.5%). Smaller proportions were rated as having a moderate (19.0%) or high risk of bias (11.9%). For 28.6% of the study samples, the risk of bias was rated as unclear (Table 3) because the presented information was insufficient to evaluate whether differences occurred in the assessment of the screening tools either across participants or compared with the development study. The reasons for increasing the risk of bias related to differences in the assessment of the screening tools across participants and differences in the assessment of screening tools compared with the development study.
Study samples that validated the ALBPSQ, the OMPSQ, and the PICKUP—tools that include work-related questions—sometimes did not report information on employment status. This could mean that participants were all employed, or that some of the participants were unemployed, but it was not reported.50 Furthermore, those studies that did report on employment status did not always administer these tools in a similar way across participants. For example, Hurley et al. instructed participants to fill out ALBPSQ work-related questions as best they could, even when they were unemployed.43,44 When these questions were left blank, the mean score of the other questions was used as replacement. In the study by Grotle et al.,33 it is noted that for participants who were unemployed, OMPSQ work-related questions were replaced by the mean score of the other questions.
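The replacement rule described above (substituting the mean of the answered items for unanswered work-related items) can be illustrated with a minimal sketch. The function name and the item scores are hypothetical and are not taken from any of the included studies; the sketch only shows the arithmetic, assuming that unanswered items are recorded as `None`.

```python
from statistics import mean


def total_with_mean_replacement(item_scores):
    """Sum item scores after replacing unanswered items (None)
    with the mean of the answered items.

    Hypothetical helper for illustration; not the scoring code of any study.
    """
    answered = [score for score in item_scores if score is not None]
    if not answered:
        raise ValueError("at least one item must be answered")
    fill_value = mean(answered)
    return sum(fill_value if score is None else score for score in item_scores)


# Hypothetical respondent: the third item (a work-related question) was left blank.
example_items = [4, 6, None, 2]
example_total = total_with_mean_replacement(example_items)
```

Under this rule, a blank item contributes the respondent's average item score, so the total score is scaled up as if all items had been answered. Whether that is appropriate for unemployed participants is exactly the open question raised above.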
Furthermore, across the included studies, substantial variation was observed in the screening tool cutoff points used to categorize patients. Selective reporting of results based only on cutoff values other than those specified in the original development study for the screening tool was considered a risk for underestimation or overestimation of the screening tool's predictive accuracy. Moreover, variable use of cutoffs makes it impossible to estimate the influence of a given setting on the performance at the recommended (original) threshold. For example, for the ALBPSQ, the standard cutoff originates from Linton and Halldén,62 who used 105 as their cutoff score for detecting poor prognosis in the form of sick leave. Hurley et al.44 and Vos et al.124 only reported results using a cutoff of 112 and 72, respectively, for the outcome sick leave. In addition, a few studies treated screening tool scores as continuous without additionally reporting results for the cutoff values from the screening tool's development study38 (Table 4).
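To illustrate why results obtained at different cutoffs are not directly comparable, the following minimal sketch computes sensitivity and specificity for the same sample at two cutoffs. The scores and outcomes are invented for illustration, and the convention that a score at or above the cutoff counts as high risk is an assumption rather than a rule taken from the development study.

```python
def sensitivity_specificity(scores, events, cutoff):
    """Dichotomize scores at `cutoff` and return (sensitivity, specificity).

    Assumption: a score >= cutoff is classified as high risk.
    `events` holds 1 for participants with the outcome and 0 otherwise.
    """
    tp = fp = fn = tn = 0
    for score, event in zip(scores, events):
        flagged = score >= cutoff
        if flagged and event:
            tp += 1
        elif flagged and not event:
            fp += 1
        elif not flagged and event:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores and outcomes (1 = outcome event occurred).
hypothetical_scores = [70, 90, 104, 106, 110, 115, 130]
hypothetical_events = [0, 0, 0, 1, 0, 1, 1]

at_105 = sensitivity_specificity(hypothetical_scores, hypothetical_events, 105)
at_112 = sensitivity_specificity(hypothetical_scores, hypothetical_events, 112)
```

In this invented sample, raising the cutoff trades sensitivity for specificity, so a study reporting only the higher cutoff says little about performance at the original threshold.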
For the PROBAST “outcomes” domain, the majority of the study samples were assigned an unclear risk of bias (40.5%), mainly due to insufficient information to evaluate blinding, or a moderate risk of bias (42.9%). None of the study samples were rated as having low risk of bias. For 16.7% of the study samples, the risk of bias was not rated because no performance measures were reported for the outcomes of interest (Table 3). The reasons for increasing the risk of bias related to the validity of the outcome, overlap between predictors and outcomes, differences in the assessment of outcomes across participants, differences in the assessment of outcomes compared with the development study, and blinding.
Outcome measures that mixed outcome domains were rated as inadequate. Also, composite outcomes that combined outcome measures or outcome domains were considered inadequate.28 For example, the 10-item modified version of the Oswestry Disability Index contains items that assess activity limitations and participation restrictions.22 Mixed or composite outcomes have the potential to increase the event rate and thus the statistical power. However, they may be misleading when the outcome domains included in the outcome differ in importance to patients, the number of events in the outcome domains of greater importance is small, and the magnitude of effect differs markedly across the outcome domains.72
Next, overlap between predictor and outcome assessment was frequently observed and considered problematic. Several studies used items of the investigated screening tool, measured at follow-up, as the primary outcome. For instance, Linton and Boersma61 used the OMPSQ in its entirety during the outcome assessment, selecting the items on pain, activity limitations, and sick leave. Studies also often included outcomes that showed overlap with domains assessed by the screening tool items. In the study by Grotle et al.,32 both the activity items of the ALBPSQ and the items of the Roland–Morris Disability Questionnaire (RMDQ) outcome measure address activity limitations. This overlap may lead to overestimation of the predictive performance of the screening tool.91,117
For all studies, outcomes were defined and determined in a similar way across participants. However, they were not always defined and determined similarly to those in the development studies. Indeed, although different outcomes most probably have different predictors, a number of studies targeted outcome domains (eg, pain intensity through OMPSQ items and activity limitations through the RMDQ and not participation restrictions through accumulated sick leave)64 which differed from the development study. Other studies focused on similar outcome domains, but used other measures (eg, activity limitations through a NRS and not the RMDQ due to the large amount of missing data).50
In addition, some studies focused on similar outcome domains and used the same outcome measures as the development study, but applied cutoff points for the outcome measures that differed from those used in the development study. For example, large differences were observed for sick leave. Vos et al.124 defined long-term sick leave as >7 days off work, while Linton and Halldén62 initially defined long-term sick leave as being sick listed for >30 days (Table 4).
Information on blinding was most often not reported, which could either mean that the outcome assessment was not blinded or that it was blinded but not described. In cases where studies reported on blinding of outcome assessment, researchers usually applied blinding.24
Sample sizes of the validation studies varied considerably at follow-up, ranging from <100 participants,18,25,38,43,44,64,81 through 500 to 1000 participants,41,53,77,78,80 to >1500 participants.115 The number of outcome events also differed widely between studies, ranging from 14 to 291. The most frequently observed follow-up time intervals were 3, 6, and 12 months92 (see Table 4 for an overview).
The number of events (ie, the number of individuals with the outcome event) was not reported in a large number of studies5,14,21–25,33,38,64,66,80 and considered inappropriate in 5 studies.28,31,44,63,81 These studies reported <20 events, raising the issue of overfitting (ie, the probability of an event is typically underestimated in low-risk patients and overestimated in high-risk patients).4,85
Studies sometimes performed multiple follow-ups but reported results on predictive validity for only one or a selection of them (eg, follow-ups at 2- and 4-week intervals until discharge or study completion at 6 months, with results reported for the 6-month follow-up).24 Time between screening and outcome assessment was considered inappropriate when results were reported only for follow-ups of <3 months (eg, six weeks),66 as chronic pain is defined as pain ≥3 months. Follow-ups >12 months were also considered inappropriate, as people's (mental) health status changes during the follow-up period and the baseline information becomes increasingly less accurate as time passes (this applied to none of the studies). In addition, follow-ups that varied across participants (eg, at treatment discharge, dependent on the number of therapy treatments)43 were deemed inappropriate. Surprisingly, most studies did not present any theoretical considerations underpinning the choice of a specific follow-up timeframe (Table 4).
Dropout (attrition) is often poorly reported or presented in a way that prevents readers from fully understanding the risk of attrition bias. Studies often limit themselves to reporting the dropout rate. We considered dropout inappropriate when >20%96 of the participants were lost to follow-up.5,20,21,28,44,53,80,92 However, dropout can occur for a number of reasons that may lead to differential dropout, such as motivation (participants lost interest), mobility (participants moved and are no longer able to continue participation), morbidity (participants experience illness preventing their participation), or mortality (participants die before study completion). For example, a low psychosocial risk group may lose more unmotivated participants (who in turn may have different outcomes because they are unmotivated) than a high psychosocial risk group, and this differential dropout may lead to differences in outcomes measured among the remaining participants. Reasons for dropout are, however, rarely specified among the included studies. Furthermore, although characteristics of dropouts (ie, baseline characteristics: eg, age, sex, pain intensity, and pain duration) should be available to examine whether systematic differences exist between those who completed a study and those who dropped out,36 only a few studies reported on the differences between completers and noncompleters.5,28,44,50,53,80,81,98 Of these studies, some provided a detailed tabulation of the characteristics and statistical comparison,50 whereas other studies only reported the characteristics for which differences were found.5 Further, numerous studies do not mention whether differences were examined, which could either mean that differences were examined for all or some baseline characteristics but none were found, or that no differences were tested.55
Finally, studies often did not report on missing values or how they were or would have been handled,78 which could either mean that there were no missing data or that missing data were present but not described. Missing values were considered inappropriately handled when complete-case analysis was applied.92 They were judged as appropriately handled when multiple imputation was used.74 For example, Karran et al.50 used Little's Missing Completely at Random test to determine whether values were missing completely at random and used a maximization algorithm to impute missing values.
For the PROBAST “analyses” domain, the majority of study samples were assigned a moderate risk of bias (76%), and only a few study samples were rated as low risk of bias (9.5%). For 14.3% of the study samples, no risk of bias label was assigned because no performance measures were reported for the outcomes of interest. The reason for increasing the risk of bias related to the poor use of the performance measures.
Statistical analyses were found appropriate when they reflected both calibration (ie, agreement between predicted and observed event rates) and discrimination (ie, the screening tool's ability to distinguish between patients developing and not developing the outcome of interest) components of predictive validity for pain and related outcomes.74 This was only the case in 2 studies.50,115 These studies also reported more recently introduced performance measures (eg, net benefit). Moreover, not all studies reported performance measures for pain and related outcomes despite assessing those outcomes. Some studies reported on the course of particular pain and related outcomes. For example, Grotle et al.31 reported the course of pain intensity, disability, and sickness absence from baseline across follow-ups, but reported no information on the predictive validity of the ALBPSQ for those outcomes, except for disability where odds ratios were provided. Other studies reported differences in mean scores on the screening tool for particular outcomes, used change scores for particular outcomes, or reported on composite outcomes. For example, Dunstan et al.18 reported differences in mean ALBPSQ scores between those who did and did not return to work. Dagfinrud et al.14 assessed functional limitations at baseline and follow-up; however, the predictive validity of the OMPSQ was examined for functional improvement, and the categorization of those that were improved and those that were not was based on change scores. Finally, George and Beneciuk28 assessed pain intensity and disability; yet, discriminative validity was only examined for recovery, a composite pain intensity and disability outcome. Still others assessed pain and related outcomes, but only reported performance measures related to outcomes that were not within the scope of the current review. 
For instance, Heneweer et al.38 assessed pain intensity, disability, work absenteeism, and self-reported recovery, but only reported area under the curve values for the ALBPSQ total and subscale scores in predicting recovery or nonrecovery at final follow-up (Table 4).
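The two components of predictive validity named above can be made concrete with a minimal sketch. It computes discrimination as the area under the ROC curve, obtained by comparing every event with every non-event, and a simple calibration summary, calibration-in-the-large, as the observed event rate minus the mean predicted risk. Function names and data are hypothetical. The calibration function assumes that predicted risks (probabilities) are available, which raw screening tool scores do not provide by themselves.

```python
def auc(scores, events):
    """Discrimination: probability that a randomly chosen participant with the
    event has a higher score than one without it (ties count as 0.5)."""
    with_event = [s for s, e in zip(scores, events) if e == 1]
    without_event = [s for s, e in zip(scores, events) if e == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in with_event:
        for n in without_event:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


def calibration_in_the_large(predicted_risks, events):
    """Calibration summary: observed event rate minus mean predicted risk.
    A value near 0 means predictions are, on average, neither too high nor too low."""
    observed_rate = sum(events) / len(events)
    mean_predicted = sum(predicted_risks) / len(predicted_risks)
    return observed_rate - mean_predicted


# Hypothetical data: screening scores, predicted risks, and observed outcomes.
hypothetical_scores = [100, 90, 90, 80]
hypothetical_risks = [0.8, 0.6, 0.4, 0.2]
hypothetical_events = [1, 1, 0, 0]

discrimination = auc(hypothetical_scores, hypothetical_events)
calibration = calibration_in_the_large(hypothetical_risks, hypothetical_events)
```

A tool can discriminate well while being poorly calibrated, and vice versa, which is why reporting only one of the two components was rated as poor use of performance measures.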
Seven screening tools were identified, all developed for use in primary care settings to predict chronic pain (HKF-R10, PICKUP) or chronic disability (ALBPSQ/OMPSQ, OMPSQs, OMSQs, PBSI, and STarT Back) in patients with back pain. Notably, we found no tools for the prediction of pain-related distress, a key indicator of health, or for the prediction of acute pain onset, including postoperative pain. These appear to be significant gaps in the literature.101
We assessed the quality of the evidence of 32 studies including 42 study samples aiming to validate the predictive value of identified screening tools. Overall, studies showed a moderate risk of bias, which varied considerably from domain to domain. Here, we discuss the most notable methodological problems.
The success of initial studies revealing the value of psychosocial risk factors in predicting chronic pain problems has boosted research in this area. However, some of the original studies were designed with specific (clinical) groups in mind. An example is the ALBPSQ, which was designed to target a working population. Some items that are directly related to work (eg, “If you take into consideration your work routines, management, salary, promotion possibilities, and workmates, how satisfied are you with your job?”) are therefore inapplicable to a nonworking population. Authors of validation studies have addressed this problem in various ways. Some replaced the missing scores for those items by the mean for nonworking patients.33 Others asked patients to fill out those questions related to either current paid or unpaid work.43,44 Likewise, screening tools were developed for patients with musculoskeletal pain, in particular back pain, but studies have also investigated the value of the tools in other patient groups (eg, neck pain).124 Sometimes, items have been adapted accordingly and/or left out. There is a lack of evidence, however, to suggest that these changes are appropriate for the populations in question.
Some of the identified screening tools were developed to screen for psychosocial risk factors (“yellow flags”), or, at least, are presented as such in studies. Some cautionary notes are warranted. First, all screening tools also include items that could be categorized otherwise (eg, pain duration and disability compensation). Second, screening tools often contain items that could equally well be the primary outcomes (pain intensity, disability, and days off work). Although this may be less of a problem when simply aiming to predict, it is premature to explain the predictive power of these instruments in terms of psychosocial processes. Indeed, given that it is generally known that the best predictor of events in the future is their occurrence in the present or past, it remains to be investigated whether the predictive validity of screening tools is due to the overlap between predictor and outcome.91,117 To address this problem, one may examine whether tools are able to predict outcomes, beyond the predictive power of baseline pain and pain-related disability.
Most studies are not in line with the current guidelines for reporting measures of performance.110,111 In fact, there is a large disparity in reported performance measures. Many studies reported conventional performance measures, often reporting either calibration (ie, how close predictions are to observed outcomes) or discrimination (ie, the screening tool's ability to correctly distinguish the 2 outcome classifications of event vs nonevent). However, the reporting of both performance measures is crucial. Furthermore, most studies do not consider the clinical consequences of decisions made using a screening tool. They thereby implicitly assume that false-positive (ie, patient being treated unnecessarily) and false-negative (ie, patient not getting a treatment that (s)he would benefit from) predictions are equally harmful (ie, equally weighted). More recent studies50,115 do consider the relative harms or benefits of these alternative clinical outcomes. They apply novel performance measures such as net benefit (ie, the expected utility of a decision to treat patients at some threshold, compared with a decision based on an alternative policy such as treating nobody)75,110,111,120,121 (see also www.decisioncurveanalysis.org).
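As an illustration of how net benefit weights false positives against true positives, the following minimal sketch uses the standard decision-curve formula: net benefit equals the proportion of true positives minus the proportion of false positives multiplied by the odds of the threshold probability. The function name, data, and threshold are hypothetical and are not drawn from the included studies.

```python
def net_benefit(flagged, events, threshold):
    """Net benefit of treating the flagged participants.

    flagged:   1 if the tool classifies the participant as high risk, else 0
    events:    1 if the outcome event occurred, else 0
    threshold: risk threshold (0 < threshold < 1) at which treatment is judged
               worthwhile; threshold / (1 - threshold) is the weight given to
               a false positive relative to a true positive.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    true_pos = sum(1 for f, e in zip(flagged, events) if f and e)
    false_pos = sum(1 for f, e in zip(flagged, events) if f and not e)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


# Hypothetical sample of five participants.
hypothetical_events = [1, 0, 0, 0, 1]
tool_flags = [1, 1, 0, 0, 1]      # classification by the screening tool
treat_all = [1, 1, 1, 1, 1]       # alternative policy: treat everybody
treat_none = [0, 0, 0, 0, 0]      # alternative policy: treat nobody

nb_tool = net_benefit(tool_flags, hypothetical_events, threshold=0.2)
nb_all = net_benefit(treat_all, hypothetical_events, threshold=0.2)
nb_none = net_benefit(treat_none, hypothetical_events, threshold=0.2)
```

A screening tool is clinically useful at a given threshold only if its net benefit exceeds that of both default policies. In this invented sample the tool outperforms treating everybody and treating nobody, but the ordering can change with the threshold.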
There are some limitations to our review. First, we used a strict search strategy. We excluded batteries of questionnaires and tools that were not originally developed in the context of pain. This may have resulted in missing instruments that are potentially valuable. For example, the Amsterdam Preoperative Anxiety and Information Scale (APAIS) was originally developed to evaluate patients' preoperative anxiety and need for preoperative information regarding the scheduled surgery and anesthesia.70 Subsequently, this tool was used to predict postoperative pain.46,48 Second, we focused on multidimensional screening tools. Alternatively, one may use unidimensional questionnaires assessing single psychosocial risk factors to investigate the predictive power of unique psychosocial variables (eg, Pain Catastrophizing Scale113 and Tampa Scale for Kinesiophobia69) for poor pain outcomes. For screening purposes, however, one should aim to minimize the burden of filling out questionnaires for participants. The use of large questionnaire batteries should therefore be avoided. Third, this research field is quickly evolving, with new validation studies appearing at a fast pace. Since our search, new instruments have been validated in an independent study. For instance, the Optimal Screening for Prediction of Referral and Outcome cohort yellow flag assessment tool was developed in a cross-sectional cohort in 2016.58 Recently, a validation study was published.29 Fourth, clinical prediction modelling is a dynamic and evolving field15,47,56,94,108–111 (see also progress-partnership.org). One should keep in mind that the present review is an exploratory mapping of this rapidly evolving field. Assessment of the quality of evidence in the included studies was based on a prepublication version of the PROBAST. This version did not yet provide a guideline for scoring the questions. We therefore constructed our own coding system. PROBAST has since been published, with some minor changes from the prepublication version (eg, the signaling questions of the domain “Sample size and participants flow” are now included in the domain Outcomes and the domain Analysis).76,128 Despite these minor changes, the resulting mapping fulfills the primary goal of providing an entry point to reduce risk of bias in this field. Fifth, we did not perform a meta-analysis. Several meta-analyses are available that synthesize the predictive value of screening tools. They indicate that (1) the predictive value of these screening tools is highly variable depending on the pain outcome of interest (eg, pain and disability) and (2) substantial heterogeneity between studies exists.49,99 Taking into account methodological differences and quality criteria is therefore crucial to further our understanding of the predictive value of screening tools. Our insights have the potential to improve research in this area and decision-making based on this research.
The authors have no conflict of interest to declare.
Preparation of this article was supported by funding from the European Union's Horizon 2020 research and innovation program (Grant 633491).
. Abegglen S, Hoffmann-Richter U, Schade V, Znoj HJ. Work and Health Questionnaire (WHQ): a screening tool for identifying injured workers at risk for a complicated rehabilitation. J Occup Rehabil 2016;27:268–83.
. Althof JE, Beasley BD. Psychosocial management of the foot and ankle surgery patient. Clin Podiatr Med Surg 2003;20:199–211.
. Andersen JH, Haah JP, Frost P. Risk factors for more severe regional musculoskeletal symptoms: a two-year prospective study of a general working population. Arthritis Rheum 2007;56:1355–64.
. Austin PC, Steyerberg EW. Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models. Stat Methods Med Res 2017;26:796–808.
. Beneciuk JM, Bishop MD, Fritz JM, Robinson ME, Asal NR, Nisenzon AN, George SZ. The STarT back screening tool and individual psychological measures: evaluation of prognostic capabilities for low back pain
clinical outcomes in outpatient physical therapy settings. Phys Ther 2013;93:321–33.
. Blyth MF, March ML, Nicholas KM, Cousins JM. Chronic pain
, work performance and litigation. PAIN
. Breivik H, Collett B, Ventafridda V, Cohen R, Gallacher D. Survey of chronic pain
in Europe: prevalence, impact on daily life, and treatment. Eur J Pain
. Broadbent E, Wilkes C, Koschwanez H, Weinman J, Norton S, Petrie KJ. A systematic review and meta-analysis of the Brief Illness Perception Questionnaire. Psychol Health 2015;30:1361–85.
. Cella D, Riley W, Stone A. The Patient-Reported Outcomes Measurement Information System (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005–2008. J Clin Epidemiol 2010;63:1179–94.
. Chiarotto A, Boers M, Deyo RA, Buchbinder R, Corbin TP, Costa L, Foster NE, Grotle M, Koes BW, Kovacs FM, Lin CC, Maher CG, Pearson AM, Peul WC, Schoene ML, Turk DC, van Tulder MW, Terwee CB, Ostelo RW. Core outcome measurement instruments for clinical trials in nonspecific low back pain
. Chiarotto A, Ostelo RW, Turk DC, Buchbinder R, Boers M. Core outcome sets for research and clinical practice. Braz J Phys Ther 2017;21:77–84.
. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Inter Med 2015;162:55–63.
. Crombez G, Eccleston C, Van Damme S, Vlaeyen JWS, Karoly P. Fear-avoidance model of chronic pain
: the next generation. Clin J Pain
. Dagfinrud H, Storheim K, Magnussen LH, Odegaard T, Hoftaniska I, Larsen LG, Ringstad PO, Hatlebrekke F, Grotle M. The predictive validity of the Örebro Musculoskeletal Pain
Questionnaire and the clinicians' prognostic assessment following manual therapy treatment of patients with LBP and neck pain
. Man Ther 2013;18:124–9.
. Debray TPA, Vergouwe Y, Koffijberg H, Nieboer D, Steyerberg EW, Moons KGM. A new framework to enhance the interpretation of external validation studies of clinical prediction models. J Clin Epidemiol 2015;68:279–89.
. Debray TPA, Damen JAAG, Snell KIE, Ensor J, Hooft L, Reitsma JB, Riley RD, Moons KGM. A guide to systematic review and meta-analysis of prediction model performance. BMJ 2017;356:i6460.
. den Boer JJ, Oostendorp RA, Evers AW, Beems T, Borm GF, Munneke M. The development of a screening instrument to select patients at risk of residual complaints after lumbar disc surgery. Eur J Phys Rehabil Med 2010;46:497–503.
. Dunstan DA, Covic T, Tyson GA, Lennie IG. Does the Orebro Musculoskeletal Pain
Questionnaire predict outcomes following a work-related compensable injury? Int J Rehabil Res 2005;28:369–70.
. Eccleston C, Crombez G. Advancing psychological therapies for chronic pain
. F1000Res 2017;6:461.
. Field J, Newell D. Relationship between STarT Back Screening Tool and prognosis for low back pain
patients receiving spinal manipulative therapy. Chiropr Man Therap 2012;20:17.
. Foster NE, Mullis R, Hill JC, Lewis M, Whitehurst DG, Doyle C, Konstantinou K, Main C, Somerville S, Sowden G, Wathall S, Young J, Hay EM; IMPaCT Back Study team. Effect of stratified care for low back pain
in family practice (IMPaCT Back): a prospective population-based sequential comparison. Ann Fam Med 2014;12:102–11.
. Fritz JM, Benecuik JM, George SZ. Relationship between categorization with the STarT back screening tool and prognosis for people receiving physical therapy for low back pain
. Phys Ther 2011;91:722–32.
. Gabel CP, Burkett B, Melloh M. The shortened Örebro Musculoskeletal Screening Questionnaire: evaluation in a work-injured population. Man Ther 2013;18:378–85.
. Gabel CP, Melloh M, Burkett B, Osborne J, Yelland M. The Örebro Musculoskeletal Screening Questionnaire: validation of a modified primary care musculoskeletal screening tool in an acute work injured population. Man Ther 2012;17:554–65.
. Gabel CP, Melloh M, Yelland M, Burkett B, Roiko A. Predictive ability of a modified Örebro Musculoskeletal Pain
Questionnaire in an acute/subacute low back pain
working population. Eur Spine J 2011;20:449–57.
. Gatchel RJ, Polatin PB, Kinney RK. Predicting outcome of chronic back pain
using clinical predictors of psychopathology: a prospective analysis. Health Psychol 1995;14:415–20.
. Gatchel J, Polatin P, Mayer T. The dominant role of psychosocial risk factors in the development of chronic low back pain
disability. Spine 1996;20:2702–9.
. George SZ, Beneciuk JM. Psychological predictors of recovery from low back pain
: a prospective study. BMC Musculoskelet Disord 2015;16:49.
. George SZ, Beneciuk JM, Lentz TA, Wu SS, Dai Y, Bialosky JE, Zeppieri G Jr. Optimal screening for prediction of referral and outcome (OSPRO) for musculoskeletal pain
conditions: results from the validation cohort. J Orthop Sports Phys Ther 2018;48:460–75.
. Gray H, Adefolarin AT, Howe TE. A systematic review of instruments for the assessment of work-related psychosocial factors (Blue Flags) in individuals with non-specific low back pain
. Man Ther 2011;16:531–43.
. Grotle M, Brox JI, Glomsrod B, Lonn JH, Vollestad NK. Prognostic factors in first-time care seekers due to acute low back pain
. Eur J Pain
. Grotle M, Brox JI, Veierød MB, Glomsrød B, Lønn JH, Vøllestad NK. Clinical course and prognostic factors in acute low back pain
. Patients consulting primary care for the first time. Spine 2005;30:976–82.
. Grotle M, Vollestad NK, Brox JI. Screening for yellow flags
in first-time acute low back pain
: reliability and validity of a Norwegian version of the acute low back pain
screening Questionnaire. Clin J Pain
. Gureje O, Von Korff M, Simon GE, Gater R. Persistent pain
and well-being: a World Health Organization study in primary care. JAMA 1998;280:147–51.
. Hahn S, Williamson PR, Hutton JL. Investigation of within-study selective reporting in clinical research: follow-up of applications submitted to a local research ethics committee. J Eval Clin Pract 2002;8:353–9.
. Hayden JA, Cote P, Bombardier C. Evaluation of the quality of prognosis studies in systematic reviews. Ann Intern Med 2006;44:427–37.
. Hayden JA, Van Der Windt DA, Cartwright JL, Cote P, Bombardier C. Assessing bias in studies of prognostic factors. Ann Intern Med 2013;158:280–6.
. Heneweer H, Aufdemkampe G, van Tulder MW, Kiers H, Stappaerts KH, Vanhees L. Psychosocial variables in patients with (sub)acute low back pain
: an inception cohort in primary care physical therapy in the Netherlands. Spine 2007;32:586–92.
. Henschke N, Maher CG, Refshauge KM, Herbert RD, Cumming RG, Bleasel J, York J, Das A, McAuely JH. Prognosis in patients with recent onset low back pain
in Australian primary care: inception cohort study. BMJ 2008;337:1–7.
. Higgins JPT, Altman DG, Sterne JAC, editors. Chapter 8: assessing risk of bias
in included studies. In: Higgins JPT, Green S, editors. Cochrane handbook for systematic reviews of interventions version 5.1.0 [updated March 2011]. The Cochrane Collaboration, 2011. Available at: http://www.cochrane-handbook.org.Accessed
January 26, 2017.
. Hill JC, Dunn KM, Lewis M, Mullis R, Main CJ, Foster NE, Hay EM. A primary care back pain
screening tool: identifying patient subgroups for initial treatment. Arthritis Rheum 2008;59:632–41.
. Hockings RL, McAuley JH, Maher CG. A systematic review of the predictive ability of the Orebro Musculoskeletal Pain
Questionnaire. Spine 2008;33:E494–500.
. Hurley DA, Dusoir TE, McDonough SM, Moore AP, Linton SJ, Baxter GD. Biopsychosocial screening questionnaire for patients with low back pain
: preliminary report of utility in physiotherapy practice in Northern Ireland. Clin J Pain
. Hurley DA, Dusoir TE, McDonough SM, Moore AP, Baxter GD. How effective is the acute low back pain
screening questionnaire for predicting 1-year follow-up in patients with low back pain
? Clin J Pain
. Iles RA, Davidson M, Taylor NF. Psychosocial predictors of failure to return to work in non-chronic non-specific low back pain
: a systematic review. J Occup Environ Med 2008;65:507–17.
. Janssen KJM, Kalkman CJ, Grobbee D, Bonsel GJ, Moons KGM, Vergouwe Y. The risk of severe postoperative pain
: modification and validation of a clinical prediction rule. Anesth Analg 2008;107:1330–9.
. Justice AC, Covinsky KE, Berlin JA. Assessing the generalizability of prognostic information. Ann Intern Med 1999;130:515–24.
. Kalkman CJ, Visser K, Moen J, Bonsel GJ, Grobbee DE, Moons KG. Preoperative prediction of severe postoperative pain
. Karran EL, McAuley JH, Traeger AC, Hillier SL, Grabherr L, Russek LN, Moseley GL. Can screening instruments accurately determine poor outcome risk in adults with recent onset low back pain
? A systematic review and meta-analysis. BMC Med 2017;15:13.
. Karran EL, Traeger AC, McAuley JH, Hillier SL, Yau Y, Moseley GL. The value of prognostic screening for patients with low back pain
in secondary care. J Pain
. Kendall NAS, Linton SJ, Main CJ. Guide to assessing psychosocial yellow flags
in acute low back pain
: Risk factors for long term disability and work loss. Wellington: Accident Rehabilitation and Compensation Insurance Corporation of New Zealand and the National Health Committee, 1997.
. Khan KS, Kunz R, Kleijnen J, Antes G. Five steps to conducting a systematic review. J R Soc Med 2003;96:118–21.
. Kongsted A, Andersen CH, Hansen MM, Hestbaek L. Prediction of outcome in patients with low back pain
—a prospective cohort study comparing clinicians' predictions with those of the Start Back Tool. Man Ther 2016;21:120–7.
. Lang K, Alexander IM, Simon J, Sussman M, Lin I, Menzin J, Friedman M, Dutwin D, Bushmakin AG, Thrift-Perry M, Altomare C, Hsu MA. The impact of multimorbidity on quality of life among midlife women: findings from a U.S. nationally representative survey. J Womens Health 2015;24:374–83.
. Law RKY, Lee EWC, Law SW, Chan BKB, Chen PP, Szeto GPY. The predictive validity of OMPQ on the rehabilitation outcomes for patients with acute and subacute non-specific LBP in a Chinese population. J Occup Rehabil 2013;23:361–70.
. Lee YH, Bang H, Kim DJ. How to establish clinical prediction models. Endocrinol Metab 2016;31:38–44.
. Leeuw M, Goossens MEJB, Linton SJ, Crombez G, Boersma K, Vlaeyen JW. The fear-avoidance model of musculoskeletal pain
: current state of scientific evidence. J Behav Med 2007;30:77–94.
. Lentz TA, Beneciuk JM, Bialosky JE, Zeppieri G Jr, Dai Y, Wu SS, George SZ. Development of a yellow flag assessment tool for orthopaedic physical therapists: results from the optimal screening for prediction of referral and outcome (OSPRO). J Orthop Sports Phys Ther 2016;5:327–43.
. Leysen M, Nijs J, Meeus M, Wilgen CP, Struyf F, Vermandel A, Kuppens K, Roussel N. Clinimetric properties of illness perception questionnaire revised (IPQ-R) and brief illness perception questionnaire (Brief IPQ) in patients with musculoskeletal disorders: a systematic review. Man Ther 2014;20:10–17.
. Linton SJ. A review of psychological risk factors in back and neck pain. Spine (Phila Pa 1976) 2000;25:1148–56.
. Linton SJ, Boersma K. Early identification of patients at risk of developing a persistent back problem: the predictive validity of the Orebro Musculoskeletal Pain Questionnaire. Clin J Pain
. Linton SJ, Hallden K. Can we screen for problematic back pain? A screening questionnaire for predicting outcome in acute and subacute back pain. Clin J Pain
. Linton SJ, Nicolas M, MacDonald S. Development of a short form of the Örebro musculoskeletal pain screening Questionnaire. Spine 2011;36:1891–5.
. Maher CG, Grotle M. Evaluation of the predictive validity of the Orebro Musculoskeletal Pain Screening Questionnaire. Clin J Pain
. Main C, Wood P, Hollis S, Spanswick CC, Waddell G. The Distress and Risk Assessment Method. A simple patient classification to identify distress and evaluate the risk of poor outcome. Spine 1992;17:42–52.
. Margison DA, French DJ. Predicting treatment failure in the subacute injury phase using the Orebro Musculoskeletal Pain Questionnaire: an observational prospective study in a workers' compensation system. J Occup Environ Med 2007;49:59–67.
. Marhold C, Linton SJ, Melin L. A cognitive–behavioral return-to-work program: effects on pain patients with a history of long-term versus short-term sick leave. PAIN
. Melloh M, Elfering A, Egli Presland C, Roeder C, Barz T, Rolli Salathé C, Tamcan O, Mueller U, Theis JC. Identification of prognostic factors for chronicity in patients with low back pain: a review of screening instruments. Int Orthop 2009;33:301–13.
. Miller RP, Kori SH, Todd DD. The Tampa Scale: a measure of kinisophobia. Clin J Pain
. Moerman N, van Dam FS, Muller MJ, Oosting H. The Amsterdam preoperative anxiety and information scale (APAIS). Anesth Analg 1996;82:445–51.
. Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 2009;6:e1000097.
. Montori VM, Permanyer-Miralda G, Ferreira-González I, Busse JW, Pacheco-Huergo V, Bryant D, Alonso J, Akl EA, Domingo-Salvany A, Mills E, Wu P, Schünemann HJ, Jaeschke R, Guyatt GH. Validity of composite end points in clinical trials. BMJ 2005;330:594–6.
. Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, Vickers AJ, Ransohoff DF, Collins GS. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med 2015;162:W1–W73.
. Moons KG, de Groot JA, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, Reitsma JB, Collins GS. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med 2014;11:e1001744.
. Moons KGM, Kengne AP, Grobbee DE, Royston P, Vergouwe Y, Altman DG, Woodward M. Risk prediction models: II. External validation, model updating, and impact assessment. Heart 2012;98:691–8.
. Moons KG, Wolff RF, Riley RD, Whiting PF, Westwood M, Collins GS, Reitsma JB, Kleijnen J, Mallett S. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med 2019;170:W1–W33.
. Morsø L, Kent P, Albert HB, Hill JC, Kongsted A, Manniche C. The predictive and external validity of the STarT Back Tool in Danish primary care. Eur Spine J 2013;22:1859–67.
. Morsø L, Kent P, Manniche C, Albert HB. The predictive ability of the STarT Back Screening Tool in a Danish secondary care setting. Eur Spine J 2014;23:120–8.
. Neubauer E, Junge A, Pirron P, Seemann H, Schiltenwolf M. HKF-R 10—screening for predicting chronicity in acute low back pain (LBP): a prospective clinical trial. Eur J Pain
. Newell D, Field J, Pollard D. Using the STarT back tool: does timing of stratification matter? Man Ther 2015;20:533–9.
. Nonclercq O, Berquin A. Predicting chronicity in acute back pain: validation of a French translation of the Orebro Musculoskeletal Pain Screening Questionnaire. Ann Phys Rehabil Med 2012;55:263–78.
. Pajouheshnia R, Damen JAAG, Groenwold R, Moons KM, Peelen LM. Treatment use in prognostic model research: a systematic review of cardiovascular prognostic studies. Diagn Progn Res 2017;1:15.
. Pajouheshnia R, Peelen LM, Moons K, Reitsma JB, Groenwold R. Accounting for treatment use when validating a prognostic model: a simulation study. BMC Med Res Methodol 2017;17:103.
. Peat G, Riley RD, Croft P, Morley KI, Kyzas PA, Moons KG, Perel P, Steyerberg EW, Schroter S, Altman DG, Hemingway H; PROGRESS Group. Improving the transparency of prognosis research: the role of reporting, data sharing, registration, and protocols. PLoS Med 2014;11:e1001671.
. Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol 1996;49:1373–9.
. Pengel LHM, Herbert RD, Maher CG, Refshauge KM. Acute low back pain: systematic review of its prognosis. BMJ 2003;327:323.
. Phillips CJ. The cost and burden of chronic pain. Rev Pain
. Pincus T. A systematic review of psychological factors as predictors of chronicity/disability in prospective cohorts of low back pain. Spine 2002;27:E109–20.
. Porter ME, Larsson S, Lee TH. Standardizing patient outcomes measurement. N Engl J Med 2016;374:504–6.
. Ramond A, Bouton C, Richard I, Roquelaure Y, Baufreton C, Legrand E, Huez JF. Psychosocial risk factors for chronic low back pain in primary care—a systematic review. J Fam Pract 2011;28:12–21.
. Reitsma JB, Rutjes AWS, Whiting P, Vlassov VV, Leeflang MMG, Deeks JJ. Chapter 9: assessing methodological quality. In JJ Deeks, PM Bossuyt, C Gatsonis, editors. Cochrane handbook of systematic reviews of diagnostic test accuracy, version 1.0.0. The Cochrane Collaboration, 2009. Available at: http://srdta.cochrane.org. Accessed January 26, 2017.
. Riewe E, Neubauer E, Pfeifer AC, Schiltenwolf M. Predicting persistent back symptoms by psychosocial risk factors: validity criteria for the ÖMPSQ and the HKF-r 10 in Germany. PLoS One 2016;11:e0158850.
. Rodeghero JR, Cook CE, Cleland JA, Mintken PE. Risk stratification of patients with low back pain seen in physical therapy practice. Man Ther 2015;20:855–60.
. Royston P, Moons KGM, Altman DG, Vergouwe Y. Prognosis and prognostic research: developing a prognostic model. BMJ 2009;338:1373–7.
. Saastamoinen P, Leino-Arjas P, Laaksonen M, Lahelma E. Socio-economic differences in the prevalence of acute, chronic and disabling chronic pain among ageing employees. PAIN
. Sackett DL, Straus SE, Richardson WS, Rosenberg W, Haynes RB. Evidence-based medicine: how to practice and teach EBM. Edinburgh: Churchill Livingstone, 2000.
. Sandborgh M, Lindberg P, Denison E. Pain belief screening instrument: development and preliminary validation of a screening instrument for disabling persistent pain. J Rehabil Med 2007;39:461–6.
. Sandborgh M, Lindberg P, Denison E. The Pain Belief Screening Instrument (PBSI): predictive validity for disability status in persistent musculoskeletal pain. Disabil Rehabil 2008;30:1123–30.
. Sattelmayer M, Lorenz T, Röder C, Hilfiker R. Predictive value of the Acute Low Back Pain Screening Questionnaire and the Örebro Musculoskeletal Pain Screening Questionnaire for persisting problems. Eur Spine J 2011;21(suppl 6):S773–84.
. Schultz IZ, Crook J, Berkowitz J, Milner R, Meloche GR. Predicting return to work after low back injury using the psychosocial risk for occupational disability instrument: a validation study. J Occup Rehabil 2005;15:365–76.
. Scott W, McCracken L. Psychological assessment to identify patients at risk of postsurgical pain: the need for theory and pragmatism. Br J Anaesth 2016;117:546–8.
. Shahidi B, Curran-Everett D, Maluf KS. Psychosocial, physical, and neurophysiological risk factors for chronic neck pain: a prospective inception cohort study. J Pain
. Shaw WS, Chin EH, Nelson CC, Reme SE, Woiszwillo MJ, Verma SK. What circumstances prompt a workplace discussion in medical evaluations for back pain? J Occup Rehabil 2013;23:125–34.
. Shaw WT, Pransky GS, Patterson WB, Winters T. Early disability risk factors for low back pain assessed at outpatient occupational health clinics. Spine 2005;30:572–80.
. Shaw WS, Reme SE, Pransky G, Woiszwillo MJ, Steenstra IA, Linton SJ. The pain recovery inventory of concerns and expectations: a psychosocial screening instrument to identify intervention needs among patients at elevated risk of back disability. J Occup Environ Med 2013;55:885–94.
. Sobol-Kwapinska M, Bąbel P, Plotek W, Stelcer B. Psychological correlates of acute postsurgical pain: a systematic review and meta-analysis. Eur J Pain
. Steenstra I, Verbeek J, Heymans M, Bongers P. Prognostic factors for duration of sick leave in patients sick listed with acute low back pain: a systematic review of the literature. Occup Environ Med 2005;62:851–60.
. Steyerberg EW. Clinical prediction models: A practical approach to development, validation, and updating. New York: Springer, 2009.
. Steyerberg EW, Moons KG, van der Windt DA, Hayden JA, Perel P, Schroter S, Riley RD, Hemingway H, Altman DG; PROGRESS Group. Prognosis Research Strategy (PROGRESS) 3: prognostic model research. PLoS Med 2013;10:e1001381.
. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J 2014;35:1925–31.
. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, Pencina MJ, Kattan MW. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010;21:128–38.
. Streibelt M, Bethge M. Prospective cohort analysis of the predictive validity of a screening instrument for severe restrictions of work ability in patients with musculoskeletal disorders. Am J Phys Med Rehabil 2015;94:617–26.
. Sullivan M, Bishop SR, Pivik J. The pain catastrophizing scale: development and validation. Psychol Assess 1995;7:524–32.
. Toth C, Lander J, Wiebe S. The prevalence and impact of chronic pain with neuropathic pain symptoms in the general population. Pain
. Traeger AC, Henschke N, Hübscher M, Williams CM, Kamper SJ, Maher CG, Moseley GL, McAuley JH. Estimating the risk of chronic pain: development and validation of a prognostic model (PICKUP) for patients with acute low back pain. PLoS Med 2016;13:e1002019.
. Truchon M, Schmouth ME, Cote D, Fillion L, Rossignol M, Durand MJ. Absenteeism screening questionnaire (ASQ): a new tool for predicting long-term absenteeism among workers with low back pain. J Occup Rehabil 2012;22:27–50.
. Tu YK, Gilthorpe MS. Revisiting the relation between change and initial value: a review and evaluation. Stat Med 2007;26:443–57.
. Tucker CA, Cieza A, Riley AW, Stucki G, Lai JS, Bedirhan Ustun T, Kostanjsek N, Riley W, Cella D, Forrest CB. Concept analysis of the patient reported outcomes measurement information system (PROMIS®) and the International Classification of Functioning, Disability and Health (ICF). Qual Life Res 2014;23:1677.
. Tucker CA, Escorpizo R, Cieza A, Lai JS, Stucki G, Ustun TB, Kostanjsek N, Cella D, Forrest CB. Mapping the content of the patient-reported outcomes measurement information system (PROMIS®) using the International Classification of Functioning, Health and Disability. Qual Life Res 2014;23:2431–8.
. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making 2006;26:565–74.
. Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ 2016;352:i6.
. Vlaeyen JWS, Linton SJ. Fear-avoidance and its consequences in chronic musculoskeletal pain: a state of the art. PAIN
. Vlaeyen JWS, Morley S, Crombez G. The experimental analysis of the interruptive, interfering, and identity-distorting effects of chronic pain. Behav Res Ther 2016;86:23–34.
. Vos CJ, Verhagen AP, Koes BW. The ability of the acute low back pain screening Questionnaire to predict sick leave in patients with acute neck pain. J Manipulative Physiol Ther 2009;32:178–83.
. Walton DM, Krebs D, Moulden D, Wade P, Levesque L, Elliott J, MacDermid JC. The traumatic injuries distress scale: a new tool that quantifies distress and has predictive validity with patient-reported outcomes. J Orthop Sports Phys Ther 2016;46:920–8.
. Wingbermuhle RW, van Trijffel E, Nelissen PM, Koes B, Verhagen AP. Few promising multivariable prognostic models exist for recovery of people with non-specific neck pain in musculoskeletal primary care: a systematic review. J Physiother 2018;64:16–23.
. Wolff RF, Moons KG, Riley RD, Whiting PF, Westwood M, Collins GS, Reitsma JB, Kleijnen J, Mallett S; PROBAST Group. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med 2019;170:51–8.
. Wolff R, Whiting P, Mallet S, Riley R, Westwood M, Kleijnen K, Mallet S. PROBAST—a risk-of-bias tool for prediction-modelling studies. Abstracts of the global evidence summit, Cape Town, South Africa. Cochrane Database Syst Rev 2017;9(suppl 1).
. World Health Organization. International classification of functioning, disability and health (ICF). Geneva: World Health Organization; 2001.