One of the primary focuses of physical therapy in long-term care is to improve the functional mobility of a resident. Many residents are admitted at a very low functional level, unable to complete the most basic tasks such as bed mobility and transfers. These generalized deficits make it difficult to measure functional mobility objectively and to identify improvement with treatment, because many of the current functional scales are not designed for use with residents of long-term care.
Functional scales that are typically used for residents of long-term care can be categorized by method of administration as a self-report instrument completed by the patient or as a performance-based measure requiring observation and rating of movement by a physical therapist or other health care professional. For example, self-report questionnaires that address mobility include the California Functional Evaluation instrument,1 the Movement Ability Measure,2,3 the Health Assessment Questionnaire,4,5 and the Functional Status Questionnaire.6 Although these questionnaires are easy to administer and appear to adequately address mobility, they are inherently subject to response bias. In addition, self-report questionnaires can be problematic in patient populations with a high incidence of cognitive impairment, as is commonly found in long-term care facilities. Sinoff and Ore7 report that self-report questionnaires are problematic when used with persons older than 75 years. They found inconsistency between self-report and actual performance of questionnaire tasks, suggesting that older adults may not accurately perceive their physical function. Brach et al8 suggest that instruments based on performance are more likely to identify deficits in physical function than questionnaires that are based on self-report.
Performance-based scales can be subclassified into those that test mobility skills alone (eg, Rivermead Mobility Index9 and Clinical Outcome Variables Scale10) or mobility and activities of daily living (eg, Edmonton Functional Assessment Tool,11,12,13 Barthel Index,14 Katz Index of Independence in Activities of Daily Living,15 and the Functional Independence Measure16) or are disease-specific in looking at functional mobility (eg, Motor Assessment Scale for persons with stroke17 and the Parkinson Activity Scale18). Other scales test specific aspects of mobility, such as balance and gait (eg, Berg Balance Scale19 and Performance Oriented Mobility Assessment20). Many of these performance-based scales include items that are unrelated to mobility (eg, continence or communication), items that would be inappropriate for the majority of long-term care facility residents (eg, running), or items specific to a disease process (eg, hand movements or gait akinesia). Because of this, many are not appropriate for the general long-term care facility population. In their place, physical therapists may use subjective ratings to evaluate the resident's functional ability. Although these subjective ratings are usually performance-based, they are not standardized and may reflect unwanted bias or excessive error in the rating.
Nitz and Hourigan21 and Barker et al22 reported on a scale, the Physical Mobility Scale (PMS), that was developed by physical therapists and seems to be an appropriate tool to evaluate the functional mobility of aging adults in long-term care. Nitz and Hourigan21 found the PMS to have good reliability in participants ranging in age from 35 to 90 years. Interrater reliability using intraclass correlation coefficients (ICC) for individual items ranged from 0.68 to 0.94 and was not affected by the physical therapists' level of experience. Intrarater reliability was also established with an ICC level of more than 0.9. The PMS demonstrated concurrent validity (Spearman's rank order agreement = 0.69 to 0.90) with the performance scoring outcomes of the Clinical Outcome Variables Scale and the Rivermead Mobility Index. Barker et al22 also reported good interrater reliability (κ > .60 for most items) and evidence to support construct validity.
While the PMS seems to have good reliability and good support for validity for use with adults, the responsiveness of this performance-based scale has been reported in only 1 study.22 Responsiveness allows the clinician to make decisions about a change in a patient's outcome as detected by the scale. In addition, it allows for inference about the effectiveness of treatment, economic appraisals, and other program evaluations.23 Two types of responsiveness have been commonly used in the physical therapy literature, minimal detectable change (MDC) and minimal clinically important difference (MCID). Barker et al22 determined the MDC of the PMS to be 4.39 at the 90% confidence level. To our knowledge, no study has reported the MCID of the PMS.
The MDC is the minimal amount of change required to be considered a statistically significant change. The MDC allows inference about how much change has actually occurred beyond error of measurement of the scale. Although the MDC is an indication of statistically significant change, this change may not be clinically meaningful. Therefore, it is also important to establish the MCID. In contrast to the MDC, which is statistically determined, MCID is based on subjective ratings of change by the patient, caregiver, or health care provider. The purpose of the present study is to determine the responsiveness of the PMS based on the MDC and the MCID.
For our study, 70 participants (mean age = 81.4 years [SD = 6.3]; 12 women and 58 men) were recruited from a state veterans nursing facility. The most common diagnoses included hypertension (64.3%), dementia (42.9%), chronic obstructive pulmonary disease (28.6%), diabetes mellitus (27.1%), coronary artery disease (25.7%), and cerebrovascular accident (22.9%). Initial recruitment consisted of a verbal invitation to participate from the lead author to residents. Inclusion criteria were (1) ability to follow verbal instructions and (2) no medical contraindications to performing basic mobility tasks. Those who did not meet the inclusion criteria were excluded from the study. All participants provided informed consent under the University of Nevada, Las Vegas institutional review board approval prior to participation in the study.
Procedure and Data Collection
To determine the responsiveness of the PMS, participants were assessed by the same physical therapist on 2 separate occasions. The PMS includes measures of 9 basic movements, including supine to side-lying, supine to sitting, sitting balance, sitting to and from standing, standing balance, transferring, and ambulating (Appendix).21 Each of the 9 measures is scored on a scale of 0 to 5, with 0 being dependent and 5 independent. Total scores range from 0 to 45, with 45 indicating independent mobility functioning and 0 indicating very low mobility functioning.
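The scoring rule described above (9 items, each rated 0 to 5, summed to a total of 0 to 45) can be illustrated with a minimal sketch. The function name `pms_total` and the constants are hypothetical names chosen for illustration; they are not part of the published scale, and the item-level descriptors in the Appendix are not reproduced here.

```python
ITEM_COUNT = 9        # number of PMS items, as stated in the text
MAX_ITEM_SCORE = 5    # each item is rated 0 (dependent) to 5 (independent)


def pms_total(item_scores):
    """Return the PMS total (0-45) for nine integer item scores of 0-5."""
    if len(item_scores) != ITEM_COUNT:
        raise ValueError(f"expected {ITEM_COUNT} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not isinstance(score, int) or not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError(f"item score out of range: {score!r}")
    return sum(item_scores)


# Hypothetical resident, for illustration only.
print(pms_total([3, 2, 4, 1, 0, 5, 2, 3, 1]))
```

Rejecting out-of-range or missing items, rather than silently summing them, keeps a total comparable between the two assessment occasions.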
The original PMS does not have formal instructions on how to implement the test or definitions of the items. Neither article on the PMS provided specific instructions regarding single limb balance time or wheelchair mobility distance.21,22 In this study, some clarifications were made to ensure consistency and instructions were added to the scoring sheet. The clarifications that were made are italicized in the Appendix. The first 5 items and the item on transfers are well described in the scoring sheet and are self-explanatory. The standing balance item was clarified to state that the single limb balance must be maintained for 10 seconds to receive a score of 5. This follows the same guidelines as the Berg Balance Scale,19 in which the participant must maintain a single leg stand for 10 seconds to receive full marks for that item. Springer et al determined normative values of the single leg stand by decade as follows: 60- to 69-year-old participants could perform a single leg stand for a mean of 26.9 seconds, 70- to 79-year-old participants for 15.0 seconds, and 80- to 99-year-old participants for 6.2 seconds.24 Because the population with which we are concerned is in these age ranges and is not considered healthy, the 10-second cutoff seemed reasonable. We also clarified the "wheelchair mobile" score to mean able to move 50 ft in a wheelchair without assistance, because that is a reasonable distance to reach most immediate areas in a nursing facility (eg, room to dining room).
Calculation of Minimal Detectable Change
For calculation of the MDC, 70 participants were tested twice within the same week. The MDC is determined by performing a test and a retest within a relatively short time frame so that the condition being investigated is unlikely to have changed.25 To determine the MDC, one must first assign a reliability change index value. The reliability change index expresses the confidence level at which a change could be considered significant. For instance, if one were to use a 95% confidence level, then change above the resulting threshold would be considered (with 95% confidence) greater than measurement error and, therefore, likely a true change.23 Once the reliability is determined, the Standard Error of Measurement (SEM) is found by the following equation23,25,26:
$$\mathrm{SEM} = SD \times \sqrt{1 - r_{xx}}$$
where $SD$ = standard deviation of the scores and $r_{xx}$ = test-retest reliability.
The MDC at a 95% confidence level (MDC95) for the individual is found by multiplying the SEM by 1.96 (representing 95% of the area under the curve of a normal distribution) and 1.41 (the square root of 2, to control for possible error associated with calculating the coefficient from 2 data sets [ie, test and retest])23:
$$\mathrm{MDC}_{95,\text{individual}} = \mathrm{SEM} \times 1.96 \times \sqrt{2}$$
Although the MDC95 for the individual is typically used as a statistical cutoff for change in individual patients or participants, the group MDC95 is typically used by researchers or clinicians to determine whether a statistically significant change has occurred in the mean of a group of patients or participants. To determine the MDC95 for a group, the MDC95 for the individual is divided by the square root of the number in the group:
$$\mathrm{MDC}_{95,\text{group}} = \frac{\mathrm{MDC}_{95,\text{individual}}}{\sqrt{n}}$$
where $n$ = number of participants in the group.
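The three steps described in this section (SEM from the test-retest reliability, individual MDC95 from the SEM, group MDC95 from the individual value) can be sketched as follows. The function names and the input values (a standard deviation of 8.0, a reliability of 0.95, and a group of 25) are hypothetical and chosen only to show the arithmetic; they are not data from this study.

```python
import math


def sem(sd, r_xx):
    """Standard error of measurement: SD * sqrt(1 - test-retest reliability)."""
    if not 0.0 <= r_xx <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - r_xx)


def mdc95_individual(sem_value):
    """Individual MDC95: SEM * 1.96 * sqrt(2)."""
    return sem_value * 1.96 * math.sqrt(2)


def mdc95_group(mdc_individual, n):
    """Group MDC95: individual MDC95 divided by sqrt(group size)."""
    if n < 1:
        raise ValueError("group size must be at least 1")
    return mdc_individual / math.sqrt(n)


# Hypothetical inputs, for illustration only.
example_sem = sem(sd=8.0, r_xx=0.95)
example_individual = mdc95_individual(example_sem)
example_group = mdc95_group(example_individual, n=25)
print(f"SEM = {example_sem:.2f}")
print(f"MDC95 (individual) = {example_individual:.2f}")
print(f"MDC95 (group, n = 25) = {example_group:.2f}")
```

Because the group value shrinks with the square root of the group size, the same scale yields a much smaller threshold for a group mean than for a single patient.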
Calculation of Minimal Clinically Important Difference
For calculation of the MCID, 60 of the 70 original participants were assessed twice approximately 3 months apart. Ten of the original participants were lost to follow-up because of declining to participate (n = 4), medical contraindications (n = 3), discharge from the facility (n = 2), and death (n = 1). During the 3-month period between tests, resident activities were variable on the basis of individual preferences and treatments; thus, no limitations were put on activity levels.
Minimal clinically important difference, which by definition is the smallest difference in a score of a measurement tool that the patient, caregiver, or health care provider perceives as beneficial, can be calculated from data of participants who are rated on a Likert scale as having minimally improved or minimally worsened. Prior to the final assessment, the therapist provided a rating of the patient's change (or lack thereof) in functional mobility since the initial assessment. The therapist's assessment was standardized by using the Clinical Global Impression-Global Improvement (CGI-I) scale,26 a typical 7-point Likert scale. The anchors for the CGI-I were 7 (very much worsened), 6 (much worsened), 5 (minimally worsened), 4 (no change), 3 (minimally improved), 2 (much improved), or 1 (very much improved). This scale has been used to determine MCID in previous studies.27,28,29,30 To determine whether there was a difference in the pre- and post-test scores for participants rated into each of the CGI-I anchors, paired-samples t tests were used.
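As an illustration of the anchor-based approach described above, the sketch below groups participants by CGI-I rating and computes the mean pre-to-post change and the paired-samples t statistic within each group. It is a minimal sketch, not the analysis code used in the study: the helper name `paired_t` and all of the records are hypothetical, and the p value (which requires the t distribution) is left to a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (mean difference, t statistic, degrees of freedom) for paired scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least 2 matched pre/post scores")
    diffs = [after - before for before, after in zip(pre, post)]
    sd_diff = stdev(diffs)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    mean_diff = mean(diffs)
    return mean_diff, mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical records: (CGI-I rating, PMS total at baseline, PMS total at follow-up).
records = [
    (3, 20, 24), (3, 15, 18), (3, 30, 36), (3, 25, 28),  # minimally improved
    (5, 22, 19), (5, 18, 16), (5, 35, 30),               # minimally worsened
]

# Collect pre and post scores separately for each CGI-I anchor.
groups = {}
for rating, pre_score, post_score in records:
    pre_list, post_list = groups.setdefault(rating, ([], []))
    pre_list.append(pre_score)
    post_list.append(post_score)

for rating in sorted(groups):
    pre_list, post_list = groups[rating]
    mean_diff, t_stat, df = paired_t(pre_list, post_list)
    print(f"CGI-I {rating}: mean change {mean_diff:+.2f}, t({df}) = {t_stat:.2f}")
```

The mean change within the "minimally improved" and "minimally worsened" groups is the quantity usually taken as the anchor-based MCID estimate.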
Intrarater reliability for the pre- and post-PMS scores for all 70 participants was excellent (ICC [3,1] = 0.98). Using this reliability statistic, the MDC95 was calculated for the individual and group levels. At the individual level, the MDC95 was 3.98 points. At the group level, the MDC95 for the 70 participants was 0.48 points.
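The reported values can be cross-checked with a few lines of arithmetic. The sketch below starts from the reported individual MDC95 (3.98), group size (70), and ICC (0.98). The SEM and standard deviation it prints are back-calculated from those figures under the standard formulas and are not values reported in the study.

```python
import math

REPORTED_MDC95_INDIVIDUAL = 3.98  # reported in the text
REPORTED_ICC = 0.98               # reported in the text
GROUP_SIZE = 70                   # reported in the text

# Group MDC95 = individual MDC95 / sqrt(n).
group_mdc95 = REPORTED_MDC95_INDIVIDUAL / math.sqrt(GROUP_SIZE)

# Back-calculated quantities (assumptions, not reported values).
implied_sem = REPORTED_MDC95_INDIVIDUAL / (1.96 * math.sqrt(2))
implied_sd = implied_sem / math.sqrt(1.0 - REPORTED_ICC)

print(f"group MDC95 = {group_mdc95:.2f}")
print(f"implied SEM = {implied_sem:.2f}")
print(f"implied SD  = {implied_sd:.1f}")
```

The group value reproduces the reported 0.48 points, which suggests the individual and group figures are internally consistent.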
Results of the MCID are found in Table 1. While all of the differences in means for the pre- and post-PMS values trended in the correct direction based on their CGI-I anchor (Figure 1), low power (most likely because of the small sample size) rendered some of these differences nonsignificant. Because there were too few participants, and therefore too little power, in the minimally improved and minimally worsened categories, these categories were combined with the much-improved and much-worsened categories, respectively (Table 2). Based on the combined categories, the MCID for improvement was 5 scale points, rounded up from 4.68 (95% confidence interval = 2.66–8.09), and for worsening, it was 4 scale points, rounded up from 3.82 (95% confidence interval = 0.68–6.95).
A tool that is able to accurately measure mobility change in long-term care facility residents would be an asset for physical therapists responsible for monitoring resident function. It allows therapists to make inferences about the resident's progress with treatment and helps guide clinical decision making about whether the implemented treatment has been successful. In addition, scales like the PMS can help determine when a long-term care facility resident may be in need of physical therapy. Many residents of long-term care facilities do not have regular physical therapy but do have regular, often biannual, evaluations to determine whether physical therapy or other treatments are needed or appropriate. A scale with scientifically validated responsiveness properties could be a valuable tool for a therapist doing these evaluations because it allows them to make sound evidence-based decisions on when a patient has worsened or improved.
Results from the present study suggest that the PMS is reliable and offers good value in determining change over time in aging adult residents living in a long-term care facility. A 4-point change in the PMS score (rounded from 3.98) was determined to be the MDC at the 95% confidence level for an individual. The MDC at the individual level is the typical threshold used by clinicians in determining whether an individual patient has improved or worsened over time. Therefore, if a patient improves or worsens by 4 PMS scale points, health care providers can be confident, under these statistical parameters, that there has been true change in mobility status.
A change of at least 0.5 points was determined to be the MDC at the 95% confidence level at the group level. Therefore, if a group of patients has realized a mean change of 0.5 PMS scale points, then researchers or health care providers can confidently conclude that this group of patients has had a statistically significant change in their mobility status. Although the MDC at the group level is not typically used by health care providers, it can be used to determine whether the mean of a group of participants with similar diagnoses has changed significantly from a previous measurement. In the case of the PMS, it could be used to determine whether all patients with a similar profile (eg, dementia) at a long-term care facility were experiencing a significant change in their functional mobility from 1 year to the next by comparing the mean difference over the 2 years.
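The group-level comparison described above can be sketched as a small check of a group's mean change against the group MDC95 for that group's size. This is an illustration under stated assumptions: the function name is hypothetical, the change scores are invented, and the sketch assumes that the individual MDC95 of 3.98 from this sample carries over to the group being examined, which may not hold in other populations.

```python
import math

MDC95_INDIVIDUAL = 3.98  # individual-level value reported in this study


def group_change_exceeds_mdc(changes, mdc_individual=MDC95_INDIVIDUAL):
    """Compare the absolute mean change of a group with its group-level MDC95.

    Returns (exceeds, mean_change, threshold), where
    threshold = mdc_individual / sqrt(number of change scores).
    """
    n = len(changes)
    if n == 0:
        raise ValueError("at least one change score is required")
    threshold = mdc_individual / math.sqrt(n)
    mean_change = sum(changes) / n
    return abs(mean_change) >= threshold, mean_change, threshold


# Hypothetical year-to-year PMS changes for 16 residents with a similar profile.
changes = [-2, -1, 0, -3, -1, -2, 0, -1, -4, -1, 0, -2, -1, -3, -1, -2]
exceeds, mean_change, threshold = group_change_exceeds_mdc(changes)
print(f"mean change = {mean_change:+.2f}, threshold = {threshold:.2f}, exceeds = {exceeds}")
```

Note that the threshold depends on the group size, so the 0.5-point figure applies to a group of about 70 and would be larger for a smaller group.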
Although only 3 of the 6 CGI-I anchors had results that were significant in the MCID portion of the study, these results suggest that the PMS is able to detect a meaningful change with very little score change (Table 1). An increase of 5 scale points was enough to show an improvement rated “improved” and a decrease of 4 points was determined to be “worsened.” On the basis of these data, in combination with the MDC95 value, it is reasonable to assume that a change of 5 scale points is both meaningful from a clinical perspective and statistically significant from a measurement error perspective. Therefore, the authors recommend that the conservatively estimated 5 scale point change, incorporating the 4 scale points from the MDC95 and the 5 scale points from the MCID, in either direction on the scale is important in determining change between the ranges of 5 and 40 scale points on the scale. These results are consistent with those of Barker et al,22 who found an MDC90 of 4.39. Because the scoring system on the PMS incrementally increases or decreases by whole numbers, this 4.39 would be appropriately rounded up to 5 scale points.
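The recommended 5-point rule can be expressed as a simple classification of two PMS totals. This is a hedged sketch rather than a validated decision tool: the function name and labels are hypothetical, and the treatment of the 5-to-40 range is an interpretation (a baseline below 5 cannot fall by 5 points, and a baseline above 40 cannot rise by 5 points), so the sketch only flags baselines outside that range instead of excluding them.

```python
CHANGE_THRESHOLD = 5          # recommended change in either direction
RANGE_LOW, RANGE_HIGH = 5, 40  # score range named in the recommendation
MAX_TOTAL = 45


def classify_change(pre_total, post_total):
    """Return (label, change, baseline_in_range) for two PMS totals."""
    for total in (pre_total, post_total):
        if not 0 <= total <= MAX_TOTAL:
            raise ValueError(f"PMS total out of range: {total}")
    change = post_total - pre_total
    if change >= CHANGE_THRESHOLD:
        label = "improved"
    elif change <= -CHANGE_THRESHOLD:
        label = "worsened"
    else:
        label = "no important change"
    baseline_in_range = RANGE_LOW <= pre_total <= RANGE_HIGH
    return label, change, baseline_in_range


# Hypothetical residents, for illustration only.
for pre, post in [(20, 25), (30, 26), (43, 38)]:
    print(pre, post, classify_change(pre, post))
```

A single threshold in both directions is the conservative choice the text recommends; a rule using 4 points for worsening would follow the MCID estimate more closely but sits at the edge of measurement error.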
Because the PMS is performance-based, it affords a closer approximation to the actual functional mobility of patients than a self-report measure that is influenced by responder bias. The performance-based aspect of the PMS is not affected by limitations associated with cognitive dysfunction common in nursing facilities. The strength of a performance-based tool like the PMS is limited only by rater error and inherent variability of the subject and the tasks that the participant is performing. Because the intrarater reliability of the PMS in our study was high (ICC = 0.98), the amount of rater error was relatively small. While the PMS seems to have excellent reliability, based on our results and those of Nitz and Hourigan,21 more evidence is needed to support its validity. Therefore, future studies should move beyond reliability to aspects that would support its validity in this and other populations.
A challenge in determining MCID is that it has been shown to vary across patients and patient groups and, therefore, has limited generalizability to other populations.31 This is partially because patients are prone to bias and influenced by memory, emotional status, and cognitive ability. With a performance-based, therapist-rated tool with clinical-based anchors, the results are less likely to be affected by patient subjectivity and bias and would, therefore, be more accurate and generalizable; however, there may also be some bias by the rater. Ferreira and Herbert32 discuss another possible weakness when attempting to interpret the MCID of an intervention; that is, the focus should be on whether the patient feels that the effect of or difference made by the intervention is sufficient to outweigh the costs, inconvenience, and harms of the intervention itself. Even though the MCID is typically based on the patient's perception of change, we feel that the therapist rating used in the present study was appropriate because of the high incidence of dementia (present in 42.9% of participants) as well as the absence of an intervention.
One limitation of this study was that the analyses of clinically important change were underpowered. The most likely contributor to low power was the small number of participants in the “minimally improved” (4 participants) or “minimally worsened” (6 participants) categories (Table 1). Repeating the study with a larger sample would likely yield a more precise estimate of the MCID. Because of the low power in these 2 categories, it was decided to combine them with the “much improved” and the “much worsened” categories. Although this is not ideal for analysis of MCID, it does provide meaningful information for clinicians in determining when patients have had a “clinically important difference.”
Another weakness was the female-to-male ratio of participants. Because this study was performed on residents living in a state veterans home, there were far more men than women. This may not be consistent with other long-term care facilities, in which the majority of residents are female.
The PMS demonstrated excellent reliability and had an MDC of 4 scale points at the individual level for patients residing in a long-term care facility. The MDC of the PMS at the group level was determined to be 0.5 scale points. An increase of 5 scale points in score was considered “improved” clinically, whereas a decrease of 4 points in score could be considered “worsened.”
1. Fung S, Byl N, Melnick M, et al. Functional outcomes: the development of a new instrument to monitor the effectiveness of physical therapy. Eur J Phys Med Rehabil. 1997; 7:31–41.
2. Allen DD. Responsiveness of the movement ability measure: a self-report instrument proposed for assessing the effectiveness of physical therapy intervention. Phys Ther. 2007; 87:917–924; discussion 925-934.
3. Allen DD. Validity and reliability of the movement ability measure: a self-report instrument proposed for assessing movement across diagnoses and ability levels. Phys Ther. 2007; 87:899–916; discussion 925-934.
4. Bruce B, Fries JF. The Health Assessment Questionnaire (HAQ). Clin Exp Rheumatol. 2005; 23:S14–S18.
5. Fries JF, Spitz PW, Young DY. The dimensions of health outcomes: the Health Assessment Questionnaire, disability and pain scales. J Rheumatol. 1982; 9:789–793.
6. Jette AM, Davies AR, Cleary PD, et al. The Functional Status Questionnaire: reliability and validity when used in primary care. J Gen Intern Med. 1986; 1:143–149.
7. Sinoff G, Ore L. The Barthel activities of daily living index: self-reporting versus actual performance in the old-old (> or = 75 years). J Am Geriatr Soc. 1997; 45:832–836.
8. Brach JS, VanSwearingen JM, Newman AB, Kriska AM. Identifying early decline of physical function in community-dwelling older women: performance-based and self-report measures. Phys Ther. 2002; 82:320–328.
9. Collen FM, Wade DT, Robb GF, Bradshaw CM. The Rivermead Mobility Index: a further development of the Rivermead Motor Assessment. Int Disabil Stud. 1991; 13:50–54.
10. Barker RN, Amsters DI, Kendall MD, Pershouse KJ, Haines TP. Reliability of the clinical outcome variables scale when administered via telephone to assess mobility in people with spinal cord injury. Arch Phys Med Rehabil. 2007; 88:632–637.
11. Kaasa T, Loomis J, Gillis K, Bruera E, Hanson J. The Edmonton Functional Assessment Tool: preliminary development and evaluation for use in palliative care. J Pain Symptom Manage. 1997; 13:10–19.
12. Kaasa T, Wessel J. The Edmonton Functional Assessment Tool: further development and validation for use in palliative care. J Palliat Care. 2001; 17:5–11.
13. Kaasa T, Wessel J, Darrah J, Bruera E. Inter-rater reliability of formally trained and self-trained raters using the Edmonton Functional Assessment Tool. Palliat Med. 2000; 14:509–517.
14. Mahoney FI, Barthel DW. Functional evaluation: the Barthel index. Md State Med J. 1965; 14:61–65.
15. Katz S, Downs TD, Cash HR, Grotz RC. Progress in development of the index of ADL. The Gerontologist. 1970; 10:20–30.
16. Linacre JM, Heinemann AW, Wright BD, Granger CV, Hamilton BB. The structure and stability of the Functional Independence Measure. Arch Phys Med Rehabil. 1994; 75:127–132.
17. Carr JH, Shepherd RB, Nordholm L, Lynne D. Investigation of a new motor assessment scale for stroke patients. Phys Ther. 1985; 65:175–180.
18. Nieuwboer A, De Weerdt W, Dom R, Bogaerts K, Nuyens G. Development of an activity scale for individuals with advanced Parkinson disease: reliability and “on-off” variability. Phys Ther. 2000; 80:1087–1096.
19. Berg K, Wood-Dauphinee S, Williams JI, Gayton D. Measuring balance in the elderly: preliminary development of an instrument. Physiother Can. 1989; 41:304–311.
20. Tinetti ME. Performance-oriented assessment of mobility problems in elderly patients. J Am Geriatr Soc. 1986; 34:119–126.
21. Nitz J, Hourigan SR. Measuring mobility in frail older people: reliability and validity of the Physical Mobility Scale. Aust J Ageing. 2006; 25:31–35.
22. Barker AL, Nitz JC, Low Choy NL, Haines TP. Clinimetric evaluation of the Physical Mobility Scale supports clinicians and researchers in residential aged care. Arch Phys Med Rehabil. 2008; 89:2140–2145.
23. Beaton DE, Bombardier C, Katz JN, Wright JG. A taxonomy for responsiveness. J Clin Epidemiol. 2001; 54:1204–1217.
24. Springer BA, Marin R, Cyhan T, Roberts H, Gill NW. Normative values for the unipedal stance test with eyes open and closed. J Geriatr Phys Ther. 2007; 30:8–15.
25. Faber MJ, Bosscher RJ, van Wieringen PC. Clinimetric properties of the performance-oriented mobility assessment. Phys Ther. 2006; 86:944–954.
26. Schrag A, Sampaio C, Counsell N, Poewe W. Minimal clinically important change on the unified Parkinson's disease rating scale. Mov Disord. 2006; 21:1200–1207.
27. Lepola U, Wade A, Andersen HF. Do equivalent doses of escitalopram and citalopram have similar efficacy? A pooled analysis of two positive placebo-controlled studies in major depressive disorder. Int Clin Psychopharmacol. 2004; 19:149–155.
28. McRae AL, Brady KT, Mellman TA, et al. Comparison of nefazodone and sertraline for the treatment of posttraumatic stress disorder. Depress Anxiety. 2004; 19:190–196.
29. Lacasse Y, Wong E, Guyatt GH, King D, Cook DJ, Goldstein RS. Meta-analysis of respiratory rehabilitation in chronic obstructive pulmonary disease. Lancet. 1996; 348:1115–1119.
30. Peto V, Jenkinson C, Fitzpatrick R. Determining minimally important differences for the PDQ-39 Parkinson's Disease Questionnaire. Age Ageing. 2001; 30:299–302.
31. Haley SM, Fragala-Pinkham MA. Interpreting change scores of tests and measures used in physical therapy. Phys Ther. 2006; 86:735–743.
32. Ferreira ML, Herbert RD. What does “clinically important” really mean? Aust J Physiother. 2008; 54:229–230.