Consider the following typical clinical scenario:
J.H. is a 72-year-old man admitted to inpatient rehabilitation with a recent right middle cerebral artery stroke. As part of his initial evaluation, his physical therapist performed assessments of balance and gait speed. After 2 weeks in a rehabilitation setting, J.H. demonstrated an improvement in his balance scores and gait speed.
How does the physical therapist use this information? What clinical decisions will be influenced by the amount of observed change? How much change must occur for the patient to be considered “improved”—improved enough to be ready for discharge; improved enough to live independently?
Physical therapists use a variety of tools to make these kinds of decisions. In every case, some quantitative or qualitative value will be used to reflect the degree of change. The interpretation of that change can be influenced by which measurement tool has been selected. Balance, for example, can be assessed by using the Berg Balance Scale,1 the Timed Up and Go,2 or the Dynamic Gait Index.3 These give very different scores and reflect different constructs of balance. One or another balance assessment may be more appropriate for an individual based on his or her particular set of balance impairments. Therefore, the outcome of the patient's performance may be interpreted differently for each test.
To use clinical tools to detect treatment effects, we must be confident in the measure's responsiveness, so that the score is interpretable.4 Responsiveness is an important characteristic of validity that indicates how well an instrument measures change.5–8 A responsive instrument will detect small but relevant change, and the scores derived from the instrument should have meaning for clinicians, patients, and other interested parties. Traditionally, significant change and effectiveness of treatment have been measured by group comparisons by using the gold standard for the study of efficacy, the randomized controlled trial. Using statistical comparisons such as t tests or analysis of variance, these studies allow us to determine whether 1 group has responded differently from another. Statistical significance, however, can be influenced by many factors other than actual treatment, such as sample size and group variability. When comparing very large groups, even small changes over the course of treatment may be statistically significant, but that does not mean that these changes are clinically meaningful.9
Group comparisons are useful for generalizing change. Clinicians, however, are faced with the problem of properly interpreting the meaning of change scores as they apply to individual patients. Fortunately, the literature on the measurement characteristics of clinical examination measures and the interpretation of change scores continues to grow. Research reports on the psychometric properties of clinical instruments now often include measures of the tool's ability to detect change. In other words, how good is a clinical measure at detecting change when it actually occurs, and does this change reflect change that is meaningful to all parties involved?
The purpose of this article is to describe several measurement characteristics that will help therapists interpret change scores derived from their clinical assessments. We will discuss how physical therapists can interpret frequently reported measures of responsiveness and apply these values to individual patients. By better understanding the meaning of change scores, therapists will be able to communicate more effectively about treatment effects with patients, caregivers, other health care providers, and insurers. This is an essential element of translating evidence to practice. This article will also discuss some common practical issues encountered in the application of these principles, as well as suggested directions for future research.
Understanding Measurement Error
We take many types of measurements in clinical practice; these measurements are generally taken for 1 of 3 reasons. The first is descriptive, to be able to describe a patient's status or condition. The second is discrimination, or the ability to distinguish differences among individuals. The third, perhaps most relevant to this discussion, is to assess change. We must be able to determine whether a patient is getting better or declining, making progress or plateauing.
Measurement theory states that all measurements are composed of some degree of error, either systematic or random.10 Systematic error is constant in amount and direction, and therefore, correction factors can be applied. For example, a scale may be incorrectly calibrated, consistently recording 5 pounds higher than the actual weight. This is an issue of validity, as it is consistent but inaccurate. Random error, however, is a matter of reliability, as it is not consistent and may be positive or negative, large or small.
There are 3 potential sources of random error (Table 1).10 For example, the instrument itself may be unreliable, such as mechanical faults in a force platform or imprecision in observational categories of a balance assessment. Error may be a function of the examiner, related to skill level or familiarity with the instrument. In many instances, random error is simply a function of inconsistent performance. For instance, a patient may repeat 3 trials of the Timed Up and Go test2 and may vary in speed by a few seconds each time. Such error is considered random in that it is not a true reflection of a difference in the patient's mobility or balance but just the chance variation from trial to trial.
For purposes of measuring change, therefore, we must be able to distinguish variations among trials due to random error from variations due to real change or progress. A reliable measure is one that has minimal error, and therefore, we typically try to calibrate and check our equipment and train raters. But few measurements, if any, are totally reliable, even if errors are minimal. By knowing the degree of reliability associated with a measurement, we can better assess how useful the measures of change are.
One statistic that has been used to represent this degree of error is the standard error of measurement (SEM). This is an estimate of the expected variation in a set of stable scores, where we can assume that no real change has taken place. The SEM is calculated by:
$$\mathrm{SEM} = s\sqrt{1 - r_{xx}}$$
where s is the standard deviation in a stable set of scores, and rxx represents a reliability index, typically test-retest. The smaller the standard deviation and the higher the reliability, the smaller is the SEM. A small SEM means that error is low, and observed differences in scores are more likely to be true change.10
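The SEM calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and the example values (a standard deviation of 4.0 points and a test-retest reliability of 0.91) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Return SEM = s * sqrt(1 - r_xx).

    sd          -- standard deviation of a stable set of scores (s)
    reliability -- reliability coefficient, typically test-retest (r_xx)
    """
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical values, for illustration only.
sem_example = standard_error_of_measurement(sd=4.0, reliability=0.91)
print(round(sem_example, 2))  # 1.2 points of expected error
```

Raising the reliability or lowering the standard deviation in the call shrinks the result, which mirrors the relationship stated in the text.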
Minimal Detectable Change
We can conceptualize change along a continuum, from the slightest to the largest amount of difference in scores from one trial to another (Figure 1). The minimal detectable change (MDC) is the smallest amount of change that can be considered above measurement error of the instrument.10,11 Since there will always be some degree of random error in a measurement, we would expect to see some variation even in a stable group of patients over repeated measures. The value expressed as the MDC is therefore the minimum amount of change that must be observed before one can be confident that real change has occurred and that the observed change score is not simply related to measurement error.
The most commonly used approach to calculate MDC is based on the SEM.10,11
$$\mathrm{MDC} = z \times \mathrm{SEM} \times \sqrt{2}$$
Because the SEM is an estimate based on a given set of data, we determine the MDC based on the upper limit of a confidence interval. The value of z represents the degree of confidence for the estimate of MDC, typically 90% (z = 1.645) or 95% (z = 1.96). The √2 is applied in the formula as a correction factor to account for the uncertainty introduced by taking measurement at different points of time.11
To illustrate this application, Fulk and Echternach12 studied changes in gait speed poststroke. On the basis of a test-retest reliability of 0.862 and an SD of 0.345 for baseline scores, the SEM for this measure is 0.128. Therefore, they determined the MDC at the 90% confidence level:
$$\mathrm{MDC}_{90} = 1.645 \times 0.128 \times \sqrt{2} = 0.30\ \text{m/s}$$
Therefore, any changes in gait speed beyond 0.30 m/s could be considered real change. Changes less than this value would be considered a function of measurement error. Using a 95% confidence level, the MDC95 would be 0.35 m/s. We have a greater level of confidence in estimating the MDC95, but we do so at the loss of precision. To have greater confidence that real change has occurred, we will need to see a larger difference in scores.
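The gait speed figures above can be checked with a short Python sketch. The function and variable names are hypothetical. The only inputs are the values quoted in the text (SD 0.345, reliability 0.862, z = 1.645 or 1.96). One assumption is worth flagging: the SEM is rounded to three decimals (0.128) before the MDC is computed, because that reproduces the reported values; carrying full precision gives a marginally larger result at the 95% level.

```python
import math

# z values for the two confidence levels named in the text.
Z_SCORES = {90: 1.645, 95: 1.96}


def minimal_detectable_change(sem: float, confidence: int = 95) -> float:
    """Return MDC = z * SEM * sqrt(2) for a 90% or 95% confidence level."""
    if confidence not in Z_SCORES:
        raise ValueError("confidence must be 90 or 95")
    return Z_SCORES[confidence] * sem * math.sqrt(2.0)


# Values quoted in the text for gait speed after stroke.
sd_baseline = 0.345
test_retest_r = 0.862

# Assumption: round the SEM to three decimals, as it is reported in the text.
sem = round(sd_baseline * math.sqrt(1.0 - test_retest_r), 3)

mdc90 = minimal_detectable_change(sem, confidence=90)
mdc95 = minimal_detectable_change(sem, confidence=95)

print(sem, round(mdc90, 2), round(mdc95, 2))
```

The higher confidence level always yields the larger threshold, which is the trade-off between confidence and precision described above.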
Minimal Clinically Important Difference
Beyond the concept of minimal change, the measure of meaningful improvement is the minimal clinically important difference (MCID) (Figure 1), originally defined by Jaeschke et al13 as “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient's management.”(p408) Others have simply defined it as “the smallest difference in a score that is considered worthwhile or important.”9 (p419)
Two approaches have been used to calculate the MCID. Distribution-based approaches use group measures, such as group means and standard deviations, to estimate change. Statistical differences and effect size (ES) estimates are typically used for this approach. The most commonly used estimates are the ES index and the standardized response mean (SRM), which are calculated as follows:
$$\mathrm{ES} = \frac{M_2 - M_1}{s_{\text{baseline}}}$$
$$\mathrm{SRM} = \frac{M_2 - M_1}{s_{\text{difference}}}$$
Both the ES and the SRM use the difference between means of trial 1 (M 1) and trial 2 (M 2) as the numerator. The ES index uses the standard deviation of the baseline (pretest) scores as the denominator. The SRM uses the standard deviation of the difference scores. Both estimates provide information on the magnitude of change in standardized units relative either to baseline variability or to variability in change scores.10
Glenny et al14 provided an example of the use of ES and SRM in their examination of the responsiveness of the functional independence measure (FIM) in patients in a geriatric unit of 2 rehabilitation hospitals. They reported an ES index of 1.68 and an SRM of 1.31 in the FIM motor scores. Effect size values have been interpreted as small (0.2), medium (0.5), and large (0.8).15 Therefore, these estimates of change with the FIM were considered extremely large, indicating that the tool was able to detect clinically relevant improvement in functional ability in older individuals who were participating in inpatient rehabilitation.14
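As a rough illustration of how the two distribution-based indices differ, the following Python sketch computes both from paired admission and discharge scores. The function names and the five pairs of scores are hypothetical, and the sketch assumes the sample standard deviation (n − 1 denominator); a published study may have used a different convention.

```python
from statistics import mean, stdev


def effect_size_index(pre, post):
    """ES = (M2 - M1) / SD of the baseline (pretest) scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain paired scores")
    return (mean(post) - mean(pre)) / stdev(pre)


def standardized_response_mean(pre, post):
    """SRM = (M2 - M1) / SD of the individual difference scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain paired scores")
    differences = [after - before for before, after in zip(pre, post)]
    # The mean of the differences equals M2 - M1 for paired data.
    return mean(differences) / stdev(differences)


# Hypothetical paired scores for five patients, for illustration only.
admission = [10, 12, 14, 16, 18]
discharge = [13, 14, 18, 19, 21]

es = effect_size_index(admission, discharge)
srm = standardized_response_mean(admission, discharge)
print(round(es, 2), round(srm, 2))
```

In this made-up data set every patient improves by a similar amount, so the difference scores vary little and the SRM is much larger than the ES index. The two indices share a numerator and diverge only through the denominator.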
While distribution-based methods are useful from a clinical research perspective, they do not provide estimates related to individual performance and therefore are less useful for decision making for individual patients. A second, preferred approach to calculating the MCID is the anchor-based approach,6 which uses some external criterion as the reference point for calculating how much change needs to occur before that change would be considered important. The MCID values obtained by using anchor-based and distribution-based approaches may differ.16
An anchor is typically determined by directly asking interested parties (eg, the patient or caregivers) to rate the change that has taken place as a result of treatment.6,17,18 A Global Rating of Change (GROC) scale is often used for this purpose, relating change to perception of how much “better” or “worse” the patient's condition is (Figure 2).13,19
The MCID is a variable concept depending on the anchor used, particularly when there is no gold standard available. Anchors may also reflect various aspects of perceived change or accomplishment of important clinical outcomes. For example, a change in disability level, as measured on the Modified Rankin Scale (MRS),20 has been used as the criterion anchor for determining change on the FIM21,22 and for important change in gait speed after stroke.23 Wallace et al22 used the MRS as the anchor for change on the FIM, because a change in disability as measured by the MRS would presumably be closely related to change in assistance needed for functional activities measured by the FIM. Likewise, Tilson et al23 concluded that change in disability on the MRS is related, at least to some degree, to change in gait speed. Perception of degree of recovery after stroke has been used as the anchor for important change after constraint-induced movement therapy.24 Riddle et al25 have used the achievement of physical therapy treatment goals as an anchor for determining the MCID on the Roland-Morris Back Questionnaire. These anchor-based methods can be interpreted relative to a clearly understood clinical observation in individuals.7,18,26
Using different anchors as criteria may yield somewhat different results. For example, as previously mentioned, Wallace et al22 reported an MCID of 11 points for the motor subscore of the FIM by using a change in 1 level on the MRS20 as the anchor. By comparison, Beninato et al19 used physicians' opinions of change as the anchor and found the MCID to be 17 points. The difference in the MCID between these 2 studies may be due to the different anchors used. This highlights the value in using various anchors to obtain different perspectives on MCID scores.
Ideally, the anchor should be closely related to the construct being examined.27 For example, Fulk et al28 used a 15-point Global Rating Scale as the anchor to measure important change in gait speed and arrived at an MCID estimate of 0.175 m/s, with sensitivity (SN) of 0.81 and specificity (SP) of 0.71. The SN and SP allow us to better understand MCID as a cutoff to demarcate important change. They indicate how well a cutoff score (eg, a 0.175 m/s change in gait speed in the case of Fulk et al28) classifies people: SN is the proportion of people who achieved important change according to the anchor and who also met the cutoff score (true positives), and SP is the proportion of people who did not achieve important change according to the anchor and who also fell below the cutoff score (true negatives). Fulk et al28 asked patients to rate the change in their walking ability (using the Global Rating Scale) since starting outpatient physical therapy. In contrast, Tilson et al23 used a change of 1 level on the MRS as the anchor for determining important change in gait speed after stroke. The MRS is not specific to walking ability, but disability level is presumably related to walking ability to some extent. This less-specific anchor yielded a similar MCID estimate of 0.16 m/s but with lower SN and SP (0.74 and 0.57, respectively). This loss in accuracy may reflect the weaker association between the anchor and the construct being measured.23
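To make the role of SN and SP concrete, here is a small Python sketch that scores a candidate cutoff against an anchor. Everything in it is hypothetical: the function name, the eight change scores, and the anchor ratings are invented for illustration and are not data from any of the cited studies. Only the 0.175 m/s cutoff is taken from the text.

```python
def sensitivity_specificity(changes, improved_by_anchor, cutoff):
    """Return (SN, SP) for classifying change scores with a cutoff.

    changes            -- observed change score for each patient
    improved_by_anchor -- True if the anchor (eg, a global rating of
                          change) says the patient importantly improved
    cutoff             -- candidate MCID; a change >= cutoff is "positive"
    """
    tp = fn = tn = fp = 0
    for change, improved in zip(changes, improved_by_anchor):
        meets_cutoff = change >= cutoff
        if improved and meets_cutoff:
            tp += 1  # improved and correctly flagged
        elif improved:
            fn += 1  # improved but missed by the cutoff
        elif meets_cutoff:
            fp += 1  # not improved but flagged anyway
        else:
            tn += 1  # not improved and correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical gait speed changes (m/s) and anchor ratings.
gait_changes = [0.05, 0.10, 0.20, 0.15, 0.18, 0.25, 0.30, 0.22]
anchor_says_improved = [False, False, False, True, True, True, True, True]

sn, sp = sensitivity_specificity(gait_changes, anchor_says_improved, cutoff=0.175)
print(round(sn, 2), round(sp, 2))
```

Moving the cutoff up or down trades SN against SP, which is why an MCID reported with both values is easier to interpret than a bare threshold.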
Various perspectives are needed to determine how much change is important and which factors should be used as an anchor. Many researchers have asked both the patient and the clinician for their perspectives on change.18,28,29 Recently, Wyrwich et al30 have suggested triangulating estimates from patient and clinician surveys, as well as expert consensus panels, to arrive at deeper understanding of the meaning of change to different groups. Patients and clinicians do not always agree on the rating of important change. For example, while Stratford et al18 found good agreement between the estimates of change by therapists and patients with back pain, there has not been good agreement between physician and patient estimates in other areas of medical care such as oncology,31 mental health,32 pulmonology,29 cardiology,30 and rheumatology.33 As we strive to deliver patient-centered care, we need to deepen our appreciation of the differences in perspective on change among patients, caregivers, and clinicians.
Consider the patient with stroke from our case example. If the Berg Balance Scale1 was being used, the MCID could be determined on the basis of clinician opinion—but would that agree with J.H.'s opinion of important change in his balance? Perhaps, from J.H.'s perspective, important change will not be reached until his balance is good enough to transfer or walk independently, whereas a therapist may see smaller incremental changes as relevant on the basis of past experience with such patients. Certainly, it is understandable that one's perspective on importance can vary on the basis of age. Someone in young adulthood would probably not value clinical changes from the same perspective as someone much older. By and large, we do not know the answer to these questions, and further research is clearly needed so that we can improve our patient-centered care approach to physical therapy based on patient values.
Typically, a score of 3 to 5 on a 15-point GROC scale is used to indicate a minimally important change.13,19 However, the GROC scales need further validation to better understand what kind of functional change is correlated with these subjective ratings and just how important is “important.” Remarkably, the Likert GROC scales used to survey patients and others on their opinions of change do not include the word “important” as any of the options (Figure 2). Traditionally, some point on the scale is accepted as the point beyond which important change has taken place even though importance associated with that cutoff score is assumed. Wolfe et al34 have argued that change needs to be interpreted relative to “desirable” versus “most desirable” clinical states and that treatment effects may not be clearly understood by considering only minimal change. They proposed the use of the “really important difference” as an indicator of true functional change. From a clinical perspective, it is essential to appreciate the relevance of the anchor on its interpretation. There is no direct evidence that a particular cutoff score on the Likert scale is actually associated with important change. It is necessary to investigate and validate other scales that include specific language, indicating what important change is and what it is not.
Interpretation of Reported Values for MDC and MCID
A few considerations should be addressed when applying reported values of MDC and MCID to individual patients.35 First, it is important to understand that any single reported value for MDC or MCID is simply an estimate based on a particular sample. By applying a confidence interval, we can obtain a better estimate of how close a score derived from a sample of patients is to the true value of the MCID for the population. A narrower confidence interval suggests that the reported value is a more precise estimate of the true value than one associated with a wide interval.
No estimate of important change can be completely accurate. There will always be some degree of error in the calculation related to variability among patient groups. Therefore, for any reported MCID value, some patients who score greater than the identified change score will not perceive that important change has taken place. The opposite is also true; some patients, who score very little change on the clinical measure, may perceive that important change has taken place. We can account for these misclassifications by considering the SN and SP associated with the MCID value.35
If SN and SP are high, we can be more confident that if our patient achieves the change score associated with the MCID, then he has most likely achieved important change. The related measure of likelihood ratios can also be used. Some researchers19,25,28 have calculated the positive likelihood ratio (LR+) and negative likelihood ratio (LR−) associated with MCID scores. A larger LR+ and a smaller LR− indicate a greater likelihood that important change has been achieved when a score equal to or greater than the MCID is obtained on the clinical measure. For example, Beninato et al19 showed that for patients poststroke who had a change in total FIM score of at least 22 points after rehabilitation, the LR+ = 3.3 and the LR− = 0.31. The LR+ indicates how much more likely a positive test (at least a 22-point change on the FIM) is in patients who have the condition (achieving MCID) than in those who do not. The LR+ is a ratio calculated as:
$$\mathrm{LR}^{+} = \frac{\mathrm{SN}}{1 - \mathrm{SP}}$$
A LR+ greater than 10.0 and a LR− less than 0.20 are considered strong.10 Therefore, this study showed moderate likelihood ratios, indicating that the MCID provided only a fair estimate of the true minimal important change. Taken together, these statistical tools help us understand the accuracy of the MCID as we apply it to our individual patients.
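Likelihood ratios follow directly from SN and SP, so they can be derived whenever a study reports those two values. The Python sketch below uses the standard definitions, LR+ = SN / (1 − SP) and LR− = (1 − SN) / SP. The function name is hypothetical. As an example it plugs in the SN and SP quoted earlier for the 0.175 m/s gait speed cutoff (0.81 and 0.71); the resulting ratios are computed here for illustration and are not values reported by the cited study.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) from sensitivity and specificity.

    LR+ = SN / (1 - SP)
    LR- = (1 - SN) / SP
    """
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must be strictly between 0 and 1")
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# SN and SP quoted in the text for the 0.175 m/s gait speed cutoff.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.81, specificity=0.71)
print(round(lr_pos, 2), round(lr_neg, 2))
```

Judged against the benchmarks given above (LR+ above 10.0 and LR− below 0.20 as strong), ratios of this size would fall short of strong, in line with the caution the text recommends when applying a single MCID value to an individual patient.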
The patient example given at the start of this article offers an opportunity to better understand how the MDC and MCID can be applied and interpreted.
J.H. is a 72-year-old man admitted to inpatient rehabilitation with a recent right middle cerebral artery stroke. As part of standard initial assessments, the physical therapist administered the Berg Balance Scale; he achieved a score of 35 of 56. At discharge after 2 weeks of inpatient rehabilitation, he had improved 10 points to a score of 45 of 56. His gait speed, as measured by the 10-m walk test, was initially 0.21 m/s. At discharge, gait speed was 0.56 m/s, an improvement of 0.35 m/s. His total FIM score on admission was 68 of 126 and improved to 92 of 126 at discharge, a change of 24 points. After discharge to home from inpatient rehabilitation, J.H. continued therapy as an outpatient. The outpatient therapist used the Activities-Specific Balance Confidence (ABC) scale36 and 10-m walking speed as outcome measures. From initial examination to discharge, the patient improved 16 points on the ABC scale from 51 of 100 to 67 of 100, and 0.16 m/s from 0.58 to 0.74 m/s on walking speed.
Consider first how the available literature could aid in the interpretation of the change scores observed during inpatient rehabilitation. On the FIM, J.H. improved by 24 points, which is greater than the reported value of 22 points for the MCID,19 so it is likely that there has been a meaningful improvement as a result of rehabilitation. The total FIM tells us about his overall function, but other measures provide more specific information. How should we interpret the change scores on the Berg Balance Scale? J.H. improved on the Berg by 10 points during inpatient rehabilitation. Stevenson37 has shown that the MDC for the Berg is 7 points for people with stroke after inpatient rehabilitation. Using this value, we are confident that the observed change in J.H. is not due to measurement error; it reflects a real improvement in his balance. With regard to changes in gait speed, the observed improvement of 0.35 m/s exceeds the MDC for gait speed of 0.30 m/s, reported by Fulk and Echternach,12 after inpatient stroke rehabilitation. We do not know, however, if these improvements in balance and gait speed reflect important functional change. Unfortunately, the literature does not provide a value for MCID for these measures in the setting of inpatient rehabilitation, so we are unable to determine with any certainty that his improved balance or gait speed represents important change.
Next, consider the changes observed during outpatient rehabilitation. In terms of the change in gait speed, J.H. improved by 0.16 m/s, but can we make a claim about real and important change in this circumstance? Hill et al38 found an MDC of 0.08 to 0.16 m/s for 22 patients in a rehabilitation program at an average of 11.1 weeks after a stroke. Fulk et al28 found an MCID of 0.175 m/s in an outpatient setting. On the basis of these numbers, we can assume that J.H. has improved at least beyond the degree of measurement error, but we cannot assume that this is an important difference. From a clinical decision-making standpoint, we can see that the interventions aimed at improving his gait speed are working but that there is still some way to go. Perhaps these findings would cause us to consider increasing the frequency of his therapy or the intensity of training to improve his gait speed up to the threshold for important change. These findings could be used to justify such decisions.
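The reasoning applied to J.H. can be summarized as a simple comparison of each observed change against whatever thresholds the literature supplies. The Python sketch below is a hypothetical helper, not a validated decision rule. It assumes that a change equal to or greater than a threshold counts as meeting it, and it returns None when no threshold is available for that measure and setting. For outpatient gait speed it uses the upper end of the MDC range quoted above (0.16 m/s) as a conservative choice.

```python
from typing import Dict, Optional


def interpret_change(change: float,
                     mdc: Optional[float] = None,
                     mcid: Optional[float] = None) -> Dict[str, Optional[bool]]:
    """Compare an observed change score with published thresholds.

    Returns a dict with two entries:
      exceeds_mdc -- True/False, or None if no MDC is available
      meets_mcid  -- True/False, or None if no MCID is available
    Assumption: change >= threshold counts as meeting the threshold.
    """
    return {
        "exceeds_mdc": None if mdc is None else change >= mdc,
        "meets_mcid": None if mcid is None else change >= mcid,
    }


# Change scores and thresholds as quoted in the case example.
jh_results = {
    "Berg Balance Scale, inpatient": interpret_change(10, mdc=7),
    "Gait speed, inpatient": interpret_change(0.35, mdc=0.30),
    "Total FIM, inpatient": interpret_change(24, mcid=22),
    "Gait speed, outpatient": interpret_change(0.16, mdc=0.16, mcid=0.175),
}

for measure, result in jh_results.items():
    print(measure, result)
```

A result of None is informative in its own right: it marks a measure for which the change cannot yet be labeled real or important because the relevant threshold has not been published for that population.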
It is clear that there are differences in the values for MDC and MCID, depending on the setting and sample population used to determine the score. This may reflect variation based on initial level of impairment. For example, for the Berg Balance Scale, Donoghue et al39 and Stevenson37 both demonstrated that the MDC for individuals who were independent in ambulation was smaller than for those who were dependent. Similarly, for the FIM after stroke rehabilitation, Beninato et al19 demonstrated that for patients with stroke admitted at lower functional levels, larger changes on the FIM were required before change was considered to have reached MCID compared with patients who were initially less impaired. More highly functioning patients may require less change, in part because of ceiling effects associated with the outcome measure.18,19 In the case of gait speed, changes that were calculated on the basis of a more dependent sample of inpatients with stroke may need to be interpreted differently from those calculated for independent, community-dwelling persons.
For the ABC scale, we know that the MDC ranges from 6% to 15% in individuals with Parkinson disease,40 lower limb amputation,41 and older adults living in a personal care home.42 But the MDC has not been calculated for people with stroke. Therefore, while it appears that J.H. has made important improvements in his balance confidence with his 16-point change on the ABC, we cannot state this with certainty. It should not be assumed that the MDC or MCID is the same across different diagnostic categories or at different levels of function within the same diagnostic group. For example, at the 95% confidence level, the MDC for the Berg Balance Scale has been reported as 7 points in people with stroke undergoing rehabilitation37 compared with 5 points in people with Parkinson disease living in the community40 and 8 points for older adults in residential care.43 People with different diagnoses may be more variable as a group due to factors like medication response, clusters of impairments associated with a diagnosis, or the relative stability or instability of the condition. When we are unable to find the MDC or MCID value for a given clinical tool and a specific patient population, we may use values obtained on different groups as a guide. We must use caution, however, when making such applications.
Improvement Versus Decline
Sometimes the literature does not report a value for improvement on a certain clinical measure, but it does report a value for important decline. We cannot assume that the change score related to improvement is the same as that related to decline. For example, Cella et al,44 when studying the quality of life in patients with cancer, found that change scores associated with worsening quality of life were considerably larger than change scores associated with important improvement. They suggested that this may be because patients minimize personal assessments of decline.44 Is this always the case? It appears not to be true in the area of gait speed. Recently, Perera et al45 studied important decline in gait speed in a group of community-based older adults with subacute stroke. They reported small but meaningful decline in gait speed as 0.05 m/s and substantial decline as 0.10 m/s. This value for substantial decline in gait speed is near but often less than the value that others found for minimal improvement in gait speed after hip fracture (0.10 m/s)46 and after stroke (0.16 m/s23 and 0.175 m/s28). It appears that in the case of gait speed, small declines may induce a larger perceived negative effect on function compared with the same magnitude of improvement. If the purpose is to track expected decline due to aging or through the course of a degenerative disease, then stability or small declines in outcome measures may be important. This approach of tracking disease course using patient-centered questionnaires has been used with the Parkinson's Disease Questionnaire by Peto et al.47
Directions for Future Research
While there is an increased focus on reporting the MDC and MCID for clinical measures, there remains a need to determine these values for many instruments commonly used in the clinic across different diagnostic categories and according to initial level of impairment. These studies will require relatively simple data collection and are a very accessible opportunity for practicing clinicians to participate in research. Data collection for the determination of anchor-based MCID simply involves surveying patients at the end of their episode of care. We would encourage physical therapists in practice to consider partnering with researchers to continue work in this important area. Collaborations such as these can greatly expand the body of literature in this area.
We can also consider how MCID relates to change across the enablement/disablement spectrum. For example, does important change measured in the activity domain of the International Classification of Functioning, Disability and Health48 relate to change in the participation domain? Likewise, how do important changes in impairment in the body/structure domain relate to changes in activity? For example, in stroke, one could explore how often a change on the Fugl-Meyer49 is associated with achieving important change in activity, as measured on the Barthel Index.50 Researchers are just starting to investigate these relationships.
Reporting patient progress relative to MDC and MCID values will enhance the interpretability and meaningfulness of change scores derived from outcome measures. These values can be used to assess progress of individual patients and to illustrate to patients, caregivers, and third-party payers that real (MDC) and important (MCID) change has taken place as a result of treatment. The interpretation of commonly used clinical measures based on the MDC and MCID can inform our clinical decision making and should be a guide for planning patient management. While literature reporting the MDC and MCID for many outcome measures is available, we encourage collaboration between clinicians and researchers to continue the work in this area and to include reports on larger cohorts, so that patients can be stratified by initial level of impairment and across different diagnostic categories. Further research needs to address the issues of how we are surveying the opinions of important change and identifying the components of important change across the enablement/disablement spectrum.
References
1. Berg KO, Wood-Dauphinee S, Williams JI, Gayton D. Measuring balance in the elderly: preliminary development of an instrument. Physiother Can. 1989;41:304–311.
2. Podsiadlo D, Richardson S. The Timed “Up & Go”: a test of basic functional mobility for frail elderly persons. J Am Geriatr Soc. 1991;39:142–148.
3. Wrisley DM, Walker ML, Echternach JL, Strasnick B. Reliability of the dynamic gait index in people with vestibular disorders. Arch Phys Med Rehabil. 2003;84:1528–1533.
4. Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR. Clinical Significance Consensus Meeting Groups. Methods to explain the clinical significance of health status measures. Mayo Clin Proc. 2002;77:371–383.
5. Beaton DE. Understanding the relevance of measured change through studies of responsiveness. Spine. 2000;25:3192–3199.
6. Beaton DE, Bombardier C, Katz JN, et al. Looking for important change/differences in studies of responsiveness. OMERACT MCID working group. Outcome measures in rheumatology: minimal clinically important difference. J Rheumatol. 2001;28:400–405.
7. Stratford PW, Binkley JM, Riddle DL. Health status measures: strategies and analytic methods for assessing change scores. Phys Ther. 1996;76:1109–1123.
8. Liang MH, Lew RA, Stucki G, Fortin PR, Daltroy L. Measuring clinically important changes with patient-oriented questionnaires. Med Care. 2002;40(4)(suppl):45–51.
9. Hays RD, Woolley JM. The concept of clinically meaningful difference in health-related quality-of-life research. How meaningful is it? Pharmacoeconomics. 2000;18:419–423.
10. Portney LG, Watkins MP. Foundations of Clinical Research. Applications to Practice. 3rd ed. Upper Saddle River, NJ: Pearson Education Inc; 2009.
11. Haley SM, Fragala-Pinkham MA. Interpreting change scores of tests and measures used in physical therapy. Phys Ther. 2006;86:735–743.
12. Fulk GD, Echternach JL. Test-retest reliability and minimal detectable change of gait speed in individuals undergoing rehabilitation after stroke. J Neurol Phys Ther. 2008;32:8–13.
13. Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10:407–415.
14. Glenny C, Stolee P, Husted J, Thompson M, Berg K. Comparison of the responsiveness of the FIM and the interRAI post acute care assessment instrument in rehabilitation of older adults. Arch Phys Med Rehabil. 2010;91:1038–1043.
15. Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol. 2000;53:459–468.
16. de Vet HC, Terwee CB, Ostelo RW, Beckerman H, Knol DL, Bouter LM. Minimal changes in health status questionnaires: distinction between minimally detectable change and minimally important change. Health Qual Life Outcomes. 2006;4:54.
17. Deyo RA, Centor RM. Assessing the responsiveness of functional scales to clinical change: an analogy to diagnostic test performance. J Chronic Dis. 1986;39:897–906.
18. Stratford PW, Binkley JM, Riddle DL, Guyatt GH. Sensitivity to change of the Roland-Morris back pain questionnaire: Part 1 [see comment]. Phys Ther. 1998;78:1186–1196.
19. Beninato M, Gill-Body KM, Salles S, Stark PC, Black-Schaffer RM, Stein J. Determination of the minimal clinically important difference in the FIM instrument in patients with stroke. Arch Phys Med Rehabil. 2006;87:32–39.
20. van Swieten JC, Koudstaal PJ, Visser MC, Schouten HJ, van Gijn J. Interobserver agreement for the assessment of handicap in stroke patients. Stroke. 1988;19:604–607.
21. Granger CV, Hamilton BB, Keith RA, Zielezny M, Sherwin FS. Advances in functional assessment for medical rehabilitation. Top Geriatr Rehabil. 1986;1:59–74.
22. Wallace D, Duncan PW, Lai SM. Comparison of the responsiveness of the Barthel Index and the motor component of the functional independence measure in stroke: the impact of using different methods for measuring responsiveness. J Clin Epidemiol. 2002;55:922–928.
23. Tilson JK, Sullivan KJ, Cen SY, et al.; for Locomotor Experience Applied Post Stroke (LEAPS) Investigative Team. Meaningful gait speed improvement during the first 60 days poststroke: minimal clinically important difference. Phys Ther. 2010;90:196–208.
24. Fritz SL, George SZ, Wolf SL, Light KE. Participant perception of recovery as criterion to establish importance of improvement for constraint-induced movement therapy outcome measures: a preliminary study. Phys Ther. 2007;87:170–178.
25. Riddle DL, Stratford PW, Binkley JM. Sensitivity to change of the Roland-Morris Back Pain questionnaire: Part 2 [see comment]. Phys Ther. 1998;78:1197–1207.
26. Turner D, Schunemann HJ, Griffith LE, et al. The minimal detectable change cannot reliably replace the minimal important difference. J Clin Epidemiol. 2010;63:28–36.
27. Revicki D, Hays RD, Cella D, Sloan J. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J Clin Epidemiol. 2008;61:102–109.
28. Fulk GD, Ludwig M, Dunning K, Golden S, Boyne P, West T. Estimating important change in gait speed in people with stroke undergoing out-patient rehabilitation. J Neurol Phys Ther. 2011;35:82–89.
29. Wyrwich KW, Metz SM, Kroenke K, Tierney WM, Babu AN, Wolinsky FD. Measuring patient and clinician perspectives to evaluate change in health-related quality of life among patients with chronic obstructive pulmonary disease. J Gen Intern Med. 2007;22:161–170.
30. Wyrwich KW, Metz SM, Kroenke K, Tierney WM, Babu AN, Wolinsky FD. Triangulating patient and clinician perspectives on clinically important differences in health-related quality of life among patients with heart disease. Health Serv Res. 2007;42:2257–2274.
31. Slevin ML, Plant H, Lynch D, Drinkwater J, Gregory WM. Who should measure quality of life, the doctor or the patient? Br J Cancer. 1988;57:109–112.
32. Kwoh CK, O'Connor GT, Regan-Smith MG, et al. Concordance between clinician and patient assessment of physical and mental health status. J Rheumatol. 1992;19:1031–1037.
33. Kwoh CK, Ibrahim SA. Rheumatology patient and physician concordance with respect to important health and symptom status outcomes. Arthritis Rheum. 2001;45:372–377.
34. Wolfe F, Michaud K, Strand V. Expanding the definition of clinical differences: from minimally clinically important differences to really important differences. Analyses in 8931 patients with rheumatoid arthritis. J Rheumatol. 2005;32:583–589.
35. de Vet HC, Terluin B, Knol DL, et al. Three ways to quantify uncertainty in individually applied “minimally important change” values. J Clin Epidemiol. 2010;63:37–45.
36. Powell LE, Myers AM. The Activities-Specific Balance Confidence (ABC) scale. J Gerontol A Biol Sci Med Sci. 1995;50A:M28–M34.
37. Stevenson TJ. Detecting change in patients with stroke using the Berg balance scale. Aust J Physiother. 2001;47:29–38.
38. Hill KD, Goldie PA, Baker PA, Greenwood KM. Retest reliability of the temporal and distance characteristics of hemiplegic gait using a footswitch system. Arch Phys Med Rehabil. 1994;75:577–583.
39. Donoghue D; Physiotherapy Research and Older People (PROP) group; Stokes EK. How much change is true change? The minimum detectable change of the Berg balance scale in elderly people. J Rehabil Med. 2009;41:343–346.
40. Steffen T, Seney M. Test-retest reliability and minimal detectable change on balance and ambulation tests, the 36-item short-form health survey, and the unified Parkinson disease rating scale in people with Parkinsonism. Phys Ther. 2008;88:733–746.
41. Miller WC, Deathe AB, Speechley M. Psychometric properties of the Activities-Specific Balance Confidence scale among individuals with a lower-limb amputation. Arch Phys Med Rehabil. 2003;84:656–661.
42. Holbein-Jenny MA, Billek-Sawhney B, Beckman E, Smith T. Balance in personal care home residents: a comparison of the Berg Balance Scale, the Multi-Directional Reach test, and the Activities-Specific Balance Confidence scale. J Geriatr Phys Ther. 2005;28:48–53.
43. Conradsson M, Lundin-Olsson L, Lindelof N, et al. Berg Balance Scale: Intrarater test-retest reliability among older people dependent in activities of daily living and living in residential care facilities. Phys Ther. 2007;87:1155–1163.
44. Cella D, Hahn EA, Dineen K. Meaningful change in cancer-specific quality of life scores: Differences between improvement and worsening. Qual Life Res. 2002;11:207–221.
45. Perera S, Mody SH, Woodman RC, Studenski SA. Meaningful change and responsiveness in common physical performance measures in older adults. J Am Geriatr Soc. 2006;54:743–749.
46. Palombaro KM, Craik RL, Mangione KK, Tomlinson JD. Determining meaningful changes in gait speed after hip fracture. Phys Ther. 2006;86:809–816.
47. Peto V, Jenkinson C, Fitzpatrick R. Determining minimally important differences for the PDQ-39 Parkinson's Disease Questionnaire. Age Ageing. 2001;30:299–302.
48. World Health Organization. International Classification of Functioning, Disability and Health: ICF. Geneva, Switzerland: World Health Organization; 2001.
49. Fugl-Meyer AR, Jaasko L, Leyman I, Olsson S, Steglind S. The post-stroke hemiplegic patient. 1. A method for evaluation of physical performance. Scand J Rehabil Med. 1975;7:13–31.
50. Mahoney FI, Barthel DW. Functional evaluation: the Barthel Index. Md State Med J. 1965;14:61–65.