The wave of value-based health care has brought new emphasis on the importance of patient perspectives in care, particularly that treatments provide meaningful changes or improvements as seen from the patient’s point of view. The National Institutes of Health has sponsored the Patient-Reported Outcomes Measurement Information System (PROMIS) to promote the large-scale development and standardization of patient-reported outcome (PRO) instruments that capture the patient perspective. The PROMIS Physical Function (PF) and Pain Interference (PI) instruments are increasingly used in orthopaedic practice as a result of their relevance and ease of use [7, 12]. As new testing instruments are incorporated into spine care, it is important to understand the value of the newer instruments compared with the previous gold standard measures.
The value of new instruments versus time-tested measures depends on whether they offer psychometric improvements, ease of use, and reduced respondent burden. New instruments must also be clinically relevant, which means that they must detect changes in the patient’s condition. If a measure detects change over time (responsiveness), it is then possible to determine which score values reflect meaningful change, which is a clinically important indicator. The minimum clinically important difference (MCID) is a measure of the smallest amount of change (as measured by the instrument) that is meaningful from the patient’s perspective or that results from clinical intervention.
Meaningful change is a critical benchmark of treatment effectiveness and investigative interpretations of treatment outcomes. There are two primary methods for determining an instrument’s MCID. Distribution-based methods use statistical calculations based on the variability in scores across the population to calculate the smallest amount of change that can be considered true change or beyond random fluctuation. A second approach is to anchor the change in patient scores to some other measure of the patient’s change in condition, referred to as an anchor-based method. An anchor might be the patient’s response to a query such as: “Compared with your first visit, how would you describe your physical function now?” The anchor-based method calculates what is beyond chance or random fluctuations by comparing the patient change in scores with the anchor from that same patient, whereas distribution methods focus on how much scores vary between patients [34, 48].
Determining the MCID of an instrument requires repeated measures over time because there must be documented change from some initial point before assessing the importance of the change. It takes time to collect this information, particularly as the new PROMIS instruments have been introduced into clinical care. MCID value determination for the PROMIS PF has only recently begun in a few specific patient populations, including pediatrics, cancer, and multiple sclerosis, but no previous determinations of MCID values for the PROMIS PF have been published in spinal populations. The PROMIS PI has been evaluated in a low back pain population and yielded an MCID of 3.5 to 5.5 points.
It is useful to consider meaningful scores of newer instruments alongside instruments in longer term use. The Oswestry Disability Index (ODI) and the Neck Disability Index (NDI) are two long-used instruments in orthopaedic spine clinics [19, 33]. The ODI has shown good-to-fair psychometric properties when validated both with classical test theory [10, 32, 42] and with the modern item response theory approach. Studies on the MCID values for the ODI have yielded values ranging from a 6- to 10-point change [13, 17, 32]. The NDI is the most widely used PRO instrument for neck disorders and has shown reasonable test-retest reliability, but its overall psychometric properties are questionable [18, 31, 50]. The MCID for the NDI ranges from 5 to 10 points but has generally been regarded as inconsistent across studies. Research shows that the PROMIS PF and PI have moderate-to-strong correlations with the NDI and ODI in a spinal population. Additionally, the PROMIS instruments have been shown to outperform the ODI in spine patients.
The purpose of the present study was to establish a comprehensive repository of MCID values calculated using both distribution-based and anchor-based methods for four outcomes instruments in spine care. We asked: (1) What are the MCIDs of the PROMIS PF among spine patients? (2) What are the MCIDs of the PROMIS PI among spine patients? (3) What are the MCIDs of the NDI among spine patients? (4) What are the MCIDs of the ODI among spine patients?
Patients and Methods
Study Design and Setting
We conducted a prospective study of previously tested diagnostic measures on 1945 consecutive patients with a reference standard applied. Before their visit with a spine specialist at a university orthopaedic clinic, patients completed demographic, PF, and PI questionnaires on handheld tablets as part of the standard and routine clinic care protocol. All patients aged 18 years and older seen consecutively at the clinic by spine specialists between October 2013 and January 2017 who had at least one followup visit with a questionnaire were included in the analysis. No other exclusions were applied. Clinic-wide, approximately 1% to 2% of patients refuse to complete survey questions as part of their standard clinic care. MCID determination is a guide for meaningful score interpretation; thus, it is helpful to determine MCIDs on as generalizable a patient population as possible. The MCID determination is not meant to be treatment-specific, and evidence does not suggest that MCID scores are dependent on the length of time since treatment or the severity of the patient’s condition [23, 35, 45]. Thus, patients with a wide range of spine conditions and multiple treatment types including vertebral process/body fractures and removal procedures on the musculoskeletal system were included in the patient sample.
The institutional review board of the University of Utah approved the study protocol.
Variables, Outcome Measures, Data Sources, and Bias
Patients were administered the PROMIS PF version 1.2, the PROMIS PI version 1.1, the NDI, and the ODI. The PROMIS PF version 1.2 draws from a 121-item test bank containing lower extremity and upper extremity items. The PROMIS PI version 1.1 has a 40-item bank. A computerized adaptive test (CAT) was used to administer the PROMIS PF and PI. The CAT is administered through a web-based portal known as the Assessment Center, which was established by PROMIS developers; patients access it remotely with handheld tablets. By selecting each question from the item bank based on the patient’s prior answers, CAT administration minimizes patient burden. For example, patients would not be asked, “Are you able to push open a heavy door?” if they had previously answered “unable to do” to a task requiring less strength or stamina. Lower scores on the PROMIS PF indicate lower patient functioning, whereas lower scores on the PROMIS PI indicate that patients are experiencing less pain interference. Both instrument scores are standardized T-scores with a mean of 50 and SD of 10 and are calibrated in the general population.
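To make the T-score metric concrete, the following minimal Python sketch converts between a standardized (z) score and a T-score with a mean of 50 and SD of 10. The function names are hypothetical and chosen only for illustration; PROMIS scoring itself is produced by the CAT software and is not reimplemented here.

```python
def z_to_t(z: float) -> float:
    """Convert a z-score to a T-score (mean 50, SD 10)."""
    return 50.0 + 10.0 * z


def t_to_z(t: float) -> float:
    """Convert a T-score (mean 50, SD 10) back to a z-score."""
    return (t - 50.0) / 10.0
```

Under this metric, a PROMIS PF score of 35 sits 1.5 SDs below the general population mean, so a change of 5 points corresponds to half of a population SD.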
Both the NDI and ODI contain 10 items that are related to the neck and low back and were administered depending on the location of the patient’s chief complaint. Patients were administered all items on the NDI and ODI. Scores on the NDI and ODI range from 0 to 100 with lower scores representing higher functioning levels and minimal disability. All four instruments were administered at baseline (ie, either within 7 days before the clinic visit for a new spinal condition or on the day of the first clinic visit) and at each followup, regardless of timing.
Anchor questions included in the MCID estimation for PF were based on the following question: “Compared with your FIRST EVALUATION at the xxx, how would you describe your physical function now? (much worse, worse, slightly worse, no change, slightly improved, improved, much improved).” For pain interference, the anchor question was based on the following question: “Compared with your FIRST EVALUATION at the xxx, how would you describe your episodes of PAIN now?” A meaningful level of change is inferred from the patient’s perceived change, with the self-report serving as the anchor.
The MCID determinations were based on groupings of patient followup testing categorized into four different and overlapping periods: (1) 3-month followup (ie, 80-100 days after the initial assessment); (2) > 3-month followup (ie, 90 days or more after the initial assessment); (3) 6-month followup (ie, 170-190 days after the initial assessment); and (4) > 6-month followup (ie, 180 days or more after the initial assessment) as common followup points in orthopaedic practices [6, 9, 16, 21, 25, 28-30, 36, 43, 47]. Each point may have included different patients from the sample and data from the same patient may have been included in the analysis multiple times based on when patients were seen again in the clinic after their baseline appointments.
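As a minimal sketch of the grouping described above, the following Python function assigns a followup visit to every period whose window contains it, which shows why one visit can count toward more than one period. The function name `followup_periods`, the period labels, and the treatment of the day bounds as inclusive are assumptions made for illustration; the study does not describe its implementation.

```python
from typing import List


def followup_periods(days_since_baseline: int) -> List[str]:
    """Return every (overlapping) followup period that a visit falls into."""
    periods = []
    if 80 <= days_since_baseline <= 100:
        periods.append("3-month")
    if days_since_baseline >= 90:
        periods.append(">3-month")
    if 170 <= days_since_baseline <= 190:
        periods.append("6-month")
    if days_since_baseline >= 180:
        periods.append(">6-month")
    return periods
```

For example, a visit 185 days after baseline falls into the > 3-month, 6-month, and > 6-month periods, whereas a visit at 30 days falls into none of them.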
We used descriptive statistics to examine sample characteristics and demographics. We measured PF and PI at four different followup periods and individually compared them with the baseline scores of the PROMIS PF, PROMIS PI, NDI, and ODI. Mean change was calculated as the absolute value difference between the baseline score and the 3-month followup score, the > 3-month followup score, the 6-month followup score, and the > 6-month followup scores for all four PRO instruments.
Statistical analyses involving anchor-based methods included mean change scores and receiver operating characteristic (ROC) curves and values. Following methods commonly used to distinguish change from no change, the changed group in the anchor-based analysis included only patients who reported change (much worse, worse, improved, much improved) [11, 38]. By combining improved and deteriorated conditions, we drew a distinction between those with stable versus changing conditions using the symmetry of absolute value change scores. The ROC area under the curve was used to identify the discriminative ability of a scale or instrument. The ROC cutoff was the score that maximized Youden’s J, calculated as (sensitivity + specificity) – 1. Specificity was the number of patients correctly identified as having no meaningful change divided by all patients with no meaningful change. Sensitivity was the number of patients correctly identified as changed divided by all patients with meaningful change.
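The two anchor-based estimates can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure run in SPSS or R for the study: the names `split_by_anchor` and `youden_cutoff` are hypothetical, only “no change” responses are treated as stable (the “slightly” responses are left out of both groups), candidate cutoffs are the observed change scores, and the example records are invented.

```python
from statistics import mean
from typing import List, Optional, Sequence, Tuple

# Anchor responses counted as meaningful change, as listed in the text.
CHANGED = {"much worse", "worse", "improved", "much improved"}
# Assumption: only "no change" is treated as stable.
STABLE = {"no change"}

Record = Tuple[float, float, str]  # (baseline score, followup score, anchor)


def split_by_anchor(records: Sequence[Record]) -> Tuple[List[float], List[float]]:
    """Return absolute change scores for the changed and stable groups."""
    changed = [abs(f - b) for b, f, a in records if a in CHANGED]
    stable = [abs(f - b) for b, f, a in records if a in STABLE]
    return changed, stable


def youden_cutoff(
    changed: Sequence[float], stable: Sequence[float]
) -> Tuple[Optional[float], float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is classified as changed when the absolute change is >= cutoff.
    Ties keep the smallest cutoff (an arbitrary choice for this sketch).
    """
    best_cut, best_j = None, -1.0
    for cut in sorted(set(changed) | set(stable)):
        sensitivity = sum(c >= cut for c in changed) / len(changed)
        specificity = sum(s < cut for s in stable) / len(stable)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical T-score records, invented purely for illustration.
example_records = [
    (40, 48, "improved"),
    (45, 38, "worse"),
    (50, 60, "much improved"),
    (42, 45, "improved"),
    (50, 51, "no change"),
    (47, 45, "no change"),
    (55, 59, "no change"),
    (44, 46, "slightly improved"),  # excluded from both groups
]

changed_scores, stable_scores = split_by_anchor(example_records)
mean_change_mcid = mean(changed_scores)  # mean change in the changed group
roc_mcid, j_value = youden_cutoff(changed_scores, stable_scores)
```

Because improvement and deterioration are pooled through the absolute value, the sketch separates changed from stable patients rather than improved from worsened ones, which mirrors the symmetry argument in the text.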
In addition, we used three different distribution-based methods based on the patient’s followup score to determine MCID. The first two methods calculated 1/2 SD and 1/3 SD of scores. The third distribution-based method included calculating the minimum detectable change (MDC) at 90%, 95%, and 99%, where the 90% values are less precise and yield a smaller MDC and the 99% values have the greatest precision and a higher MDC. MDC is a calculation of change that falls outside the measurement error of an instrument, distinct from the broad construct of MCID as meaningful change, which can be calculated by multiple methods. Estimation of MDCs was based on the standard error of measurement (SEM), where $\mathrm{MDC}_{90\%} = 1.65 \times \sqrt{2} \times \mathrm{SEM}$, $\mathrm{MDC}_{95\%} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$, and $\mathrm{MDC}_{99\%} = 2.56 \times \sqrt{2} \times \mathrm{SEM}$. The SEM was calculated as $\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - r}$, where $r$ is the reliability represented by Cronbach α.
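The distribution-based estimates can be sketched in Python as follows, assuming the standard forms MDC = z × √2 × SEM and SEM = SD × √(1 − r), with the multipliers 1.65, 1.96, and 2.56 taken from the text. The function names and the example SD and reliability values are hypothetical; the study’s own calculations were run in SPSS and R.

```python
from math import sqrt
from statistics import stdev
from typing import Dict, Sequence

# Multipliers for the 90%, 95%, and 99% levels, as given in the text.
Z_MULTIPLIERS = {"90": 1.65, "95": 1.96, "99": 2.56}


def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - r)."""
    return sd * sqrt(1.0 - reliability)


def mdc(sd: float, reliability: float, level: str = "95") -> float:
    """Minimum detectable change: z * sqrt(2) * SEM."""
    return Z_MULTIPLIERS[level] * sqrt(2.0) * sem(sd, reliability)


def distribution_mcids(scores: Sequence[float], reliability: float) -> Dict[str, float]:
    """Return 1/2 SD, 1/3 SD, and MDC estimates from a set of scores."""
    sd = stdev(scores)  # sample SD of the followup scores
    estimates = {"half_sd": sd / 2.0, "third_sd": sd / 3.0}
    for level in Z_MULTIPLIERS:
        estimates["mdc" + level] = mdc(sd, reliability, level)
    return estimates
```

With a hypothetical SD of 10 and reliability of 0.75, the SEM is 5 points, so the MDC values (roughly 12 to 18 points) are several times larger than the 1/2 SD and 1/3 SD estimates, which is the same ordering of methods reported in the results.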
We used SPSS 24.0 for Windows (IBM Corp, Armonk, NY, USA) and R 3.30 (R Development Core Team, R Foundation for Statistical Computing, Vienna, Austria) to run all analyses.
Demographics and Description of the Study Population
Between October 2013 and January 2017, 1945 patients aged 18 years and older visited the clinic for spine conditions. The mean age of participants was 58 years (SD = 15; range, 18-95 years). The sample was 51% (988 of 1945) men, 90% (1754 of 1945) self-identified as white, and 5% (94 of 1945) as Hispanic (Table 1). Patients were treated with a range of procedures, including musculoskeletal system removal procedures, vertebral joint aspirations, epidural injections at vertebral levels, vertebral fractures, anesthetics, spinal radiology, electromyography, and other miscellaneous procedures. More patients reported meaningful change than no change at every time point except those taking the NDI at 6-month followup (Table 2).
Results
What Are the MCIDs of the PROMIS PF Among Spine Patients?
The anchor-based MCID of mean change scores on the PROMIS PF ranged from 7 (SD = 5) to 8 (SD = 8) across the time points for those experiencing meaningful change. Using the ROC cutoff, the MCID for the PROMIS PF ranged from 4 to 10 points across the four followup periods (see Table 2 for anchor-based MCID values). SD distribution-based MCID calculations for the PROMIS PF yielded a range of 3 points (1/3 SD at 3-month followup) to 5 points (1/2 SD at 6-month followup). The MDC ranged from 12 points (MDC@90% at 3-month followup) to 23 points (MDC@99% at 6-month followup) (see Table 3 for distribution-based MCID values).
What Are the MCIDs of the PROMIS PI Among Spine Patients?
For the PROMIS PI, the anchor-based MCID of mean change ranged from 8 (SD = 8) to 9 points (SD = 7) in the changed group across the time points. The ROC MCIDs for the PROMIS PI ranged from 1 to 7 across followup periods. The distribution-based SD MCIDs for the PROMIS PI ranged from 3 points (1/3 SD at 6 months followup) to 4 points (1/2 SD at > 3 months followup). The MDC scores ranged from 14 (MDC@90% at > 6 months followup) to 24 (MDC@99% at 6 months).
What Are the MCIDs of the NDI Among Spine Patients?
For the NDI, the MCID of mean change ranged from 13 (SD = 11) to 18 (SD = 14). The ROC MCIDs for the NDI ranged from 6 to 22 across time points. The SD MCIDs ranged from 6 points (1/3 SD at 3 months followup) to 10 points (1/2 SD at > 3 months followup). MDC values ranged from 23 (MDC@90% at > 3 months) to 43 (MDC@99% at 3 months) for the NDI.
What Are the MCIDs of the ODI Among Spine Patients?
For the ODI, MCID of mean change ranged from 17 (SD = 13) to 19 (SD = 14) points for the changed group. The ROC MCIDs ranged from 18 to 29 for the ODI. The SD MCIDs ranged from 7 points (1/3 SD at 3 months followup) to 10 points (1/2 SD at 6 months followup). The MDC values ranged from 30 (MDC@90% at 3 months) to 51 (MDC@99% at 6 months followup) (see Fig. 1 for a summary of all anchor-based and distribution-based MCID values for each measure).
Discussion
Patient-reported outcomes score interpretation is important, particularly because PROs are linked to healthcare expenditures and patient satisfaction. Understanding not only the treatment-related changes in PRO scores, but how the score changes specifically relate to the patients’ experiences is vital. The MCID provides this context for the clinical value of the score change noted after treatment and for a given PRO measure provides the minimum score change where a patient has noted a substantial clinical change. Values for MCIDs can vary, depending on how they were calculated, and should be thought of as a range rather than a static number for a given measure. Generally, for low-risk treatment effects, a lower MCID value within the given range may be adequate, whereas the study of high-risk effects may warrant a more stringent MCID value.
The current study identifies the range of MCIDs, using a variety of calculation methods, for four PROs: the PROMIS PF and PI, the NDI, and the ODI, all now commonly used in the care of patients with spinal disorders. Clinical research on the MCID of PROMIS PF and PI has only recently begun [2, 46, 49] and to our knowledge has not been evaluated in a spine population. Prior research has determined the MCIDs of the ODI and NDI in a spine population [13, 17, 31, 32], but these have not been evaluated with comprehensive analysis methods and have not been presented in relation to MCID values of PROMIS instruments. By providing a wide range of methods and cutoff points for determining the MCID of these four instruments, this study offers clinicians a range of interpretive options. A range of MCIDs is important from a methodological standpoint as well as for clinical use, because different clinical presentations warrant either more or less stringent change criteria.
This study of a spine patient population determined 28 MCID values for each instrument. Currently there is no agreed-on method for determining MCID despite more than three decades of investigation [8, 14]. This comprehensive analytic approach allows for true triangulation of results rather than producing a single unstable MCID value, as is often seen in other research that uses just one approach to estimate MCID values. MCID values ranged from 2 to 23 points for the PROMIS PF, 3 to 24 points for the PROMIS PI, 6 to 43 points for the NDI, and 7 to 47 points for the ODI across the four time points. The 6-point smallest end of the MCID range for the NDI and ODI is consistent with prior research that demonstrated MCID values ranging from 5 to 10 points [13, 17, 31, 32]. The large scores at the upper end of the range likely reflect the stricter cutoff points applied with the MDC@95-99%, a level of precision not analyzed in the studies cited. However, even the less strict cutoff of MDC@90% produced MCID values for the ODI over 23 points, much higher than 8- to 11-point MCID values identified in prior research. Similarly, the smallest MCID values determined here for the PROMIS PI (6 points) are similar to the 3.5- to 5.5-point published MCID values in patients with low back pain, yet the more precise MCID values for the PROMIS PI, which ranged up to 24 points, are much higher, reflecting the precision of the method.
There are several potential reasons for the larger MCID values identified in this study. The specific methods for calculating the MDC value (one of the MCID determination methods) can vary with the SEM taken either from the entire patient population or only the unchanged group. Additionally, the measurement properties of instruments can be influenced by several sources of variability such as sample characteristics and methods of administration. For the ODI, MDC@90% values ranging from 8 to 11 points were identified in a spinal surgery referral patient population. Other published research did not use longitudinal test-retest data, but rather used the reliability coefficients from prior studies to calculate the MDC estimates, producing a narrow range of scores. It is also important to note that the greatest differences in MCID values were the result of the different methods applied. In our analysis, the distribution-based SD method yielded the smallest values for each instrument followed by both anchor-based methods (mean change and ROC) with MDCs yielding the largest MCID values across the board.
Oftentimes, MCID is reported as if there is no error and no variation in measurement precision, which is incorrect. MCID perceived as a single value is a position that cannot be supported by the current evidence and lack of measurement consensus. The specific anchor question applied or the level of precision used had effects on the estimated MCIDs in these analyses. Within any one method, the values obtained at the different time points were quite similar. Thus, it is important to reduce methodological bias by using comprehensive multiple methods as was done in the current study. Future research should continue to investigate the application of MCIDs determined by both anchor-based and distribution-based methods to evaluate the suitability of the ranges to clinical treatments and outcomes as well as establishing best practices for determining MCID values.
The PROMIS instruments also showed a wide range of MCID values depending on the method applied. For the PROMIS PF, the range varied almost sevenfold, from < 3 points to > 20 points. Similarly, the PROMIS PI MCID values in this population had a wide range from < 1 point to > 23 points. This finding is consistent with prior research, which shows a lack of agreement between MCID values calculated with differing methods. This wide range of MCID values supports calls to standardize MCID analysis and determine best practices and agreement in MCID methods. We believe anchor-based methods are most useful for practical applications. Methodologically, the distribution-based MCIDs are useful in revealing the precision necessary to make sense of outcomes and are particularly relevant in research and development.
Our study had several limitations. We did not have a sufficiently large sample size to allow stratifying by condition. Data collection for MCID analysis requires repeated measures from the same subjects, and time is needed to develop a pool of sufficient size to look at outcomes within specific diagnostic groupings. Such a stratified analysis, along with a study at 1-year followup, will be conducted as the data become available. In addition, factors related to the nature of medical care would suggest that patients seeking followup may have needs and conditions that differ in severity or type from the full population of orthopaedic patients. There is some research that indicates that condition impacts the MCID of instruments with as much as a 10-point MCID score difference in patients undergoing hand surgery taking the Quick Disabilities of the Arm, Shoulder and Hand. However, other research suggests MCID determination is not overly dependent on disease severity [35, 45]. Whether or not the clinic-wide sample characteristics impacted MCID findings will be evaluated in planned future studies.
The analysis of change was based on the response to an anchor question, a global rating of change, as the foundation for grouping participants into change and no change categories. This type of question relies on retrospective recall and can be subject to recall bias. Recall bias may be more of an issue as followup periods extend out longer, and the MCIDs presented here should be interpreted relative to those sample characteristics. Although this is a limitation of anchor-based approaches, distribution-based methods are limited by the variance of the data in determining MCID values, potentially making the interpretation of scores difficult. Thus, including both anchor- and distribution-based methods provides a comprehensive picture of the potential MCID range. A final limitation is related to the patient characteristics in the sample, which may not be representative of the US population demographics and should not be generalized beyond the sample characteristics. However, the sensitivity of these measures in detecting meaningful change drawn from this spine orthopaedic sample, including the range of MCID values presented, is likely characteristic of similar heterogeneity in other orthopaedic clinic patient populations and can be generalizable to similar practices.
This research establishes the MCIDs of the PROMIS PF, PROMIS PI, ODI, and NDI in a population of patients with spinal disorders and compares those values using multiple anchor- and distribution-based methods. The MCID values varied widely for all four instruments, depending on the methods applied. We recommend that the most appropriate MCID range be selected based on patient treatment and outcome goals. The low end of the range may be relevant for screening purposes or for low-risk determinations of effect such as referral to physical therapy. However, the median of the MCID range may be a more relevant cutoff for determining higher risk effects such as recommendation for a return to strenuous labor after back surgery or making conclusions about effects of a new treatment being researched. The median MCID values of 8 for the PROMIS PF and PI, 18 for the NDI, and 24 for the ODI may have the necessary precision for research and high-risk decision-making, whereas the smaller SD and anchor-based MCID values may be useful in practical applications advising patients on expected change levels. These results stress the importance of applying value judgment and considering outcome goals in using MCIDs to guide treatment decisions.
References
1. Amtmann D, Kim J, Chung H, Askew RL, Park R, Cook KF. Minimally important differences for Patient Reported Outcomes Measurement Information System pain interference for individuals with back pain. J Pain Res. 2016;9:251–255.
2. Askew RL, Kim J, Chung H, Cook KF, Johnson KL, Amtmann D. Development of a crosswalk for pain interference measured by the BPI and PROMIS pain interference short form. Qual Life Res. 2013;22:2769–2776.
3. Beaton DE, van Eerd D, Smith P, van der Velde G, Cullen K, Kennedy CA, Hogg-Johnson S. Minimal change is sensitive, less specific to recovery: a diagnostic testing approach to interpretability. J Clin Epidemiol. 2011;64:487–496.
4. Brodke DS, Goz V, Lawrence BD, Spiker WR, Neese A, Hung M. Oswestry Disability Index: a psychometric analysis with 1,610 patients. Spine J. 2017;17:321–327.
5. Brodke DS, Goz V, Voss MW, Lawrence BD, Spiker WR, Man H. PROMIS® PF CAT outperforms the ODI and SF-36 Physical Function domain in spine patients. Spine (Phila Pa 1976). 2016;42:921–929.
6. Carmont MR, Silbernagel KG, Nilsson-Helander K, Mei-Dan O, Karlsson J, Maffulli N. Cross cultural adaptation of the Achilles tendon Total Rupture Score with reliability, validity and responsiveness evaluation. Knee Surg Sports Traumatol Arthrosc. 2013;21:1356–1360.
7. Cella D, Riley W, Stone A, Rothrock N, Reeve B, Yount S, Amtmann D, Bode R, Buysse D, Choi S. The Patient-Reported Outcomes Measurement Information System (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005–2008. J Clin Epidemiol. 2010;63:1179–1194.
8. Cook CE. Clinimetrics Corner: The minimal clinically important change score (MCID): a necessary pretense. J Man Manip Ther. 2008;16:E82–83.
9. Cornell CN, Levine D, O'Doherty J, Lyden J. Unipolar versus bipolar hemiarthroplasty for the treatment of femoral neck fractures in the elderly. Clin Orthop Relat Res. 1998;348:67–71.
10. Davidson M, Keating JL. A comparison of five low back disability questionnaires: reliability and responsiveness. Phys Ther. 2002;82:8–24.
11. Franchignoni F, Vercelli S, Giordano A, Sartorio F, Bravini E, Ferriero G. Minimal clinically important difference of the Disabilities of the Arm, Shoulder and Hand outcome measure (DASH) and its shortened version (QuickDASH). J Orthop Sports Phys Ther. 2014;44:30–39.
12. Fries JF, Witter J, Rose M, Cella D, Khanna D, Morgan-DeWitt E. Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function. J Rheumatol. 2014;41:153–158.
13. Fritz JM, Irrgang JJ. A comparison of a modified Oswestry Low Back Pain Disability Questionnaire and the Quebec Back Pain Disability Scale. Phys Ther. 2001;81:776–788.
14. Gatchel RJ, Lurie JD, Mayer TG. Minimal clinically important difference. Spine (Phila Pa 1976). 2010;35:1739–1743.
15. Gershon RC, Rothrock N, Hanrahan R, Bass M, Cella D. The use of PROMIS and Assessment Center to deliver patient-reported outcome measures in clinical research. J Appl Meas. 2010;11:304–314.
16. Gregory J, Harwood D, Gochanour E, Sherman S, Romeo A. Clinical outcomes of revision biceps tenodesis. Int J Shoulder Surg. 2012;6:45–50.
17. Hagg O, Fritzell P, Nordwall A. The clinical importance of changes in outcome scores after treatment for chronic low back pain. Eur Spine J. 2003;12:12–20.
18. Hung M, Cheng C, Hon SD, Franklin JD, Lawrence BD, Neese A, Grover CB, Brodke DS. Challenging the norm: further psychometric investigation of the neck disability index. Spine J. 2015;15:2440–2445.
19. Hung M, Hon SD, Franklin JD, Kendall RW, Lawrence BD, Neese A, Cheng C, Brodke DS. Psychometric properties of the PROMIS physical function item bank in patients with spinal disorders. Spine (Phila Pa 1976). 2014;39:158–163.
20. Hung M, Zhang W, Chen W, Bounsanga J, Cheng C, Franklin JD, Crum AB, Voss MW, Hon SD. Patient-reported outcomes and total health care expenditure in prediction of patient satisfaction: results from a national study. JMIR Public Health Surveill. 2015;1:e13.
21. Ibrahim T, Beiri A, Azzabi M, Best AJ, Taylor GJ, Menon DK. Reliability and validity of the subjective component of the American Orthopaedic Foot and Ankle Society clinical rating scales. J Foot Ankle Surg. 2007;46:65–74.
22. Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10:407–415.
23. Jones PW, Beeh KM, Chapman KR, Decramer M, Mahler DA, Wedzicha JA. Minimal clinically important differences in pharmacological trials. Am J Respir Crit Care Med. 2014;189:250–255.
24. Juniper EF, Guyatt GH, Feeny DH, Ferrie PJ, Griffith LE, Townsend M. Measuring quality of life in children with asthma. Qual Life Res. 1996;5:35–46.
25. Kotsis SV, Chung KC. Responsiveness of the Michigan Hand Outcomes Questionnaire and the Disabilities of the Arm, Shoulder and Hand questionnaire in carpal tunnel surgery. J Hand Surg Am. 2005;30:81–86.
26. Kottner J, Audige L, Brorson S, Donner A, Gajewski BJ, Hrobjartsson A, Roberts C, Shoukri M, Streiner DL. Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed. J Clin Epidemiol. 2011;64:96–106.
27. Kovacs FM, Abraira V, Royuela A, Corcoll J, Alegre L, Tomás M, Mir MA, Cano A, Muriel A, Zamora J, del Real MTG, Gestoso M, Mufraggi N; The Spanish Back Pain Research Network. Minimum detectable and minimal clinically important changes for pain in patients with nonspecific neck pain. BMC Musculoskelet Disord. 2008;9:43.
28. Landauer F, Wimmer C, Behensky H. Estimating the final outcome of brace treatment for idiopathic thoracic scoliosis at 6-month follow-up. Pediatr Rehabil. 2003;6:201–207.
29. Little DG, MacDonald D. The use of the percentage change in Oswestry Disability Index score as an outcome measure in lumbar spinal surgery. Spine (Phila Pa 1976). 1994;19:2139–2143.
30. MacDermid JC, Richards RS, Donner A, Bellamy N, Roth JH. Responsiveness of the Short Form-36, Disability of the Arm, Shoulder, and Hand questionnaire, patient-rated wrist evaluation, and physical impairment measurements in evaluating recovery after a distal radius fracture. J Hand Surg Am. 2000;25:330–340.
31. MacDermid JC, Walton DM, Avery S, Blanchard A, Etruw E, McAlpine C, Goldsmith CH. Measurement properties of the Neck Disability Index: a systematic review. J Orthop Sports Phys Ther. 2009;39:400–417.
32. Mannion AF, Junge A, Grob D, Dvorak J, Fairbank JC. Development of a German version of the Oswestry Disability Index. Part 2: sensitivity to change after spinal surgery. Eur Spine J. 2006;15:66–73.
33. McCormick JD, Werner BC, Shimer AL. Patient-reported outcome measures in spine surgery. J Am Acad Orthop Surg. 2013;21:99–107.
34. McGlothlin AE, Lewis RJ. Minimal clinically important difference: defining what really matters to patients. JAMA. 2014;312:1342–1343.
35. Ostelo RW, Deyo RA, Stratford P, Waddell G, Croft P, Von Korff M, Bouter LM, de Vet HC. Interpreting change scores for pain and functional status in low back pain: towards international consensus regarding minimal important change. Spine (Phila Pa 1976). 2008;33:90–94.
36. Paatelma M, Kilpikoski S, Simonen R, Heinonen A, Alen M, Videman T. Orthopaedic manual therapy, McKenzie method or advice only for low back pain in working adults: a randomized controlled trial with one year follow-up. J Rehabil Med. 2008;40:858–863.
37. Papuga MO, Mesfin A, Molinari R, Rubery PT. Correlation of PROMIS Physical Function and Pain CAT instruments with Oswestry Disability Index and Neck Disability Index in spine patients. Spine (Phila Pa 1976). 2016;41:1153–1159.
38. Polson K, Reid D, McNair PJ, Larmer P. Responsiveness, minimal importance difference and minimal detectable change scores of the shortened Disability Arm Shoulder Hand (QuickDASH) questionnaire. Man Ther. 2010;15:404–407.
39. Pool JJ, Ostelo RW, Hoving JL, Bouter LM, de Vet HC. Minimal clinically important change of the Neck Disability Index and the Numerical Rating Scale for patients with neck pain. Spine (Phila Pa 1976). 2007;32:3047–3051.
40. Reuben DB, Tinetti ME. Goal-oriented patient care–an alternative health outcomes paradigm. N Engl J Med. 2012;366:777–779.
41. Revicki D, Hays RD, Cella D, Sloan J. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J Clin Epidemiol. 2008;61:102–109.
42. Roland M, Fairbank J. The Roland–Morris disability questionnaire and the Oswestry disability questionnaire. Spine (Phila Pa 1976). 2000;25:3115–3124.
43. Segal NA, Glass NA, Teran-Yengle P, Singh B, Wallace RB, Yack HJ. Intensive gait training for older adults with symptomatic knee osteoarthritis. Am J Phys Med Rehabil. 2015;94:848–858.
44. Smith-Forbes EV, Howell DM, Willoughby J, Pitts DG, Uhl TL. Specificity of the minimal clinically important difference of the quick Disabilities of the Arm Shoulder and Hand (QDASH) for distal upper extremity conditions. J Hand Ther. 2016;29:81–88; quiz 88.
45. Terwee CB, Roorda LD, Dekker J, Bierma-Zeinstra SM, Peat G, Jordan KP, Croft P, de Vet HC. Mind the MIC: large variation among populations and methods. J Clin Epidemiol. 2010;63:524–534.
46. Thissen D, Liu Y, Magnus B, Quinn H, Gipson DS, Dampier C, Huang IC, Hinds PS, Selewski DT, Reeve BB, Gross HE, DeWalt DA. Estimating minimally important difference (MID) in PROMIS pediatric measures using the scale-judgment method. Qual Life Res. 2016;25:13–23.
47. Uchiyama S, Imaeda T, Toh S, Kusunose K, Sawaizumi T, Wada T, Okinaga S, Nishida J, Omokawa S. Comparison of responsiveness of the Japanese Society for Surgery of the Hand version of the carpal tunnel syndrome instrument to surgical treatment with DASH, SF-36, and physical findings. J Orthop Sci. 2007;12:249–253.
48. Wright A, Hannon J, Hegedus EJ, Kavchak AE. Clinimetrics corner: a closer look at the minimal clinically important difference (MCID). J Man Manip Ther. 2012;20:160–166.
49. Yost KJ, Eton DT, Garcia SF, Cella D. Minimally important differences were estimated for six Patient-Reported Outcomes Measurement Information System-Cancer scales in advanced-stage cancer patients. J Clin Epidemiol. 2011;64:507–516.
50. Young IA, Cleland JA, Michener LA, Brown C. Reliability, construct validity, and responsiveness of the Neck Disability Index, patient-specific functional scale, and numeric pain rating scale in patients with cervical radiculopathy. Am J Phys Med Rehabil. 2010;89:831–839.