Reliability, Validity, and Minimal Detectable Change of Balance Evaluation Systems Test and Its Short Versions in Older Cancer Survivors: A Pilot Study : Journal of Geriatric Physical Therapy

Research Reports

Reliability, Validity, and Minimal Detectable Change of Balance Evaluation Systems Test and Its Short Versions in Older Cancer Survivors

A Pilot Study

Huang, Min H. PT, PhD, NCS; Miller, Kara PT, DPT; Smith, Kristin PT, DPT; Fredrickson, Kayle PT, DPT; Shilling, Tracy BS

Journal of Geriatric Physical Therapy 39(2):p 58-63, April/June 2016. | DOI: 10.1519/JPT.0000000000000047

Introduction


The term cancer survivor refers to a person “from the time of diagnosis and for the balance of life.”1 About 77% of new cancers are diagnosed in persons aged 55 years and older.2 The 5-year survival rate is currently estimated at 68%.2 By the year 2022, the number of cancer survivors will likely reach 18 million.3 Recently, rehabilitation to address the functional problems of this rapidly growing population has gained more attention.4

Cancer and its treatment can cause adverse effects on body systems underlying postural control. Fatigue,5,6 pain,5,6 muscle weakness,5–7 vestibular and visual deficits,8 cognitive change,6,9 and chemotherapy-induced peripheral neuropathy7,10 are common sequelae associated with cancer. Problems in balance and walking often develop during or immediately after cancer treatments.8,11,12 Long-term deficits in balance that persist years after the cancer diagnosis have also been reported.13 In older cancer survivors, balance impairments have been linked to the increased risks of falls,14 distress in managing daily activities,15 and poor quality of life.15,16 In older adults, the odds ratio for falls in those with a history of cancer was estimated to be 1.16 (95% confidence interval [CI] = 1.02-1.33).17 The assessment of balance can assist in the identification of older cancer survivors in need of interventions. However, no study has examined the psychometric properties of clinical balance assessment tools in older cancer survivors.

Balance control is a complex process. Various biomechanical and neurophysiologic mechanisms contribute to different aspects of balance control.18 As such, balance problems can vary depending on the underlying pathology. For example, a patient with peripheral vestibular disorder may have impaired sensory orientation but demonstrate normal reactive postural responses to unexpected perturbations.19 The Balance Evaluation Systems Test (BESTest) is the first balance assessment tool that was designed to locate the impairments responsible for balance problems.19 The BESTest categorizes test items into different domains of balance control, including biomechanical constraints, stability limits, anticipatory postural adjustments, reactive postural responses, sensory orientation, and stability in gait.19 Therefore, the assessment results can direct the interventions to focus on the identified deficits.19 Because the side effects of cancer and its treatment are diverse,5,6 it is imperative to use a tool that can detect the specific impairments within various domains of balance control. Two short versions of the BESTest, the Mini-Balance Evaluation Systems Test (Mini-BESTest)20 and the Brief Balance Evaluation Systems Test (Brief-BESTest),21 have also been developed. The psychometric properties of these tests have been established in patients with various balance problems22 and neurologic diagnoses.19–21,23,24

The purpose of this study was to examine the reliability, validity, and minimal detectable change (MDC) of the BESTest, Mini-BESTest, and Brief-BESTest in community-living older adults with a history of breast or prostate cancer, the most prevalent cancer diagnoses in women and men, respectively.4


Methods

This study used a cross-sectional design and was conducted in a research laboratory. Twenty breast cancer and 8 prostate cancer survivors with a mean age of 68.4 years (SD = 8.13) living in the community volunteered to participate. The participants were recruited from local cancer centers, support groups, and health fair meetings through flyers, advertisement, and word of mouth. The inclusion criteria were age 55 years or older, a medically confirmed new breast or prostate cancer diagnosis (no recurrence), completion of the primary cancer treatment (chemotherapy, surgery, and radiation) at least 3 months earlier, and the ability to walk 50 feet with or without an assistive device. The exclusion criteria were cancer recurrence or metastases, a history of neurologic conditions or more than 1 cancer diagnosis, acute illness, and unstable medical conditions. In addition, impaired cognition and low contrast vision worse than 20/60 on the Snellen chart were chosen as exclusion criteria because both are related to postural control.8,25,26 The University of Michigan-Flint Institutional Review Board approved all study procedures. Participants gave their written consent.

Sample Size

The sample size was estimated before the recruitment of participants on the basis of a power of 0.8 and an alpha level of 0.05 (2-tailed). For the reliability analysis, the sample size was estimated using the following parameters27: 2 trials of testing by the same rater for the test-retest reliability, 2 observations of each participant by 2 different raters for the interrater reliability, a minimally acceptable ICC value of 0.70, and an expected ICC value of 0.90. An ICC value of 0.70 has been suggested as the minimally acceptable value for reliability analysis.28 The expected ICC value of 0.90 represents good reliability in clinical outcome measures.29 At least 19 participants would be required for the analysis of test-retest and interrater reliability. For concurrent validity, a large effect size (ρ = 0.5) was chosen to estimate the sample size using G*Power 3.1.30 At least 26 participants would be required for the analysis of concurrent validity. Thirty participants were recruited to increase the power and 28 completed the study. Data from 2 participants who did not complete the study were excluded from the analysis. For the analysis of interrater reliability, 21 participants were randomly selected to be evaluated by 2 raters.

Procedures

All investigators received training and followed the standardized protocols for testing. An investigator first screened potential participants for their eligibility to enroll in the study based on the inclusion and exclusion criteria. Mini-Cog31 was used to identify and exclude people with dementia. Mini-Cog includes a 3-item recall and a clock drawing test.31 Its sensitivity (76%) and specificity (89%) in detecting people with dementia have been examined in older adults living in the community.31,32 The accuracy of Mini-Cog in detecting dementia was not influenced by the level of education.31 Low contrast vision was examined at the 10% contrast sensitivity level and a viewing distance of 40 cm (10% SLOAN Low Contrast Vision Chart, Precision Vision). People with low contrast vision worse than 20/60 on the Snellen chart were excluded.

The same investigator also collected the demographics, health information, and Functional Comorbidity Index33 through the interview and the review of medical documents provided by the participants. Functional Comorbidity Index is a self-report measure that assesses the impact of 18 diseases and conditions on a person's physical function.33 A score of 18 corresponds to the highest number of comorbid conditions, whereas a score of 0 represents no comorbid condition.33 The participants' plantar tactile sensation was examined at the great toe, 3rd and 5th metatarsals of each foot using a 5.07/10g Semmes-Weinstein Monofilament.34 Usual gait speed was determined using the 4-meter walk test.35

After the interview and tests for sensation and gait speed, the other investigators administered the BESTest to minimize bias. The participants rested as needed throughout the session. To determine the test-retest reliability, participants were tested twice about 1-2 weeks apart by the same rater. During the retest, the rater was blinded to the participants' previous scores. The live (not observed from a video recording) interrater reliability was determined using 2 raters. One rater had more than 20 years of clinical experience and was a neurologic clinical specialist certified by the American Board of Physical Therapy Specialties. The other rater was a student in the doctor of physical therapy program. A primary rater administered the test. The 2 raters independently and concurrently scored the performance of the participants. Each rater recorded the ratings separately on the scoring sheet. No discussion among the raters was allowed throughout the testing. The scores of the BESTest items were recorded on a prestructured worksheet that allowed the conversion of the BESTest item scores into the Mini-BESTest item scores.20 At the end of the test session, a rater calculated the scores of the Mini-BESTest. The scores of the Brief-BESTest were obtained by extracting the relevant item scores from the BESTest.21 Following the BESTest during the first session, participants completed the Activities-specific Balance Confidence (ABC) Scale.36

Outcome Measures

Balance Evaluation Systems Test

The BESTest contains 36 items to assess impairments in 6 categories: (1) biomechanical constraints (eg, ankle strength), (2) stability limits (eg, functional reach), (3) anticipatory postural adjustments (eg, stand on one leg), (4) reactive postural responses (eg, lateral compensatory stepping), (5) sensory orientation (eg, eyes open, stand on foam), and (6) stability in gait (eg, Timed “Up & Go”).19 Each item was scored on a 0- to 3-point scale, with higher score indicating better balance. The scores from all items were summed to obtain the BESTest total score. The BESTest total score as a percentage of 108 maximum points possible was the percent total score.

Mini-Balance Evaluation Systems Test

The Mini-Balance Evaluation Systems Test (Mini-BESTest) includes 14 items of dynamic balance tasks from the BESTest.20 Each item was scored from 0 to 2 points. The maximum point possible was 28, with higher scores signifying better balance. The sum of the scores of all items was the Mini-BESTest total score. The total score as a percentage of 28 points possible was the percent total score.

Brief Balance Evaluation Systems Test

The Brief-BESTest has 6 items from the BESTest, including hip/trunk lateral strength, functional reach forward, stand on one leg (right/left), lateral compensatory stepping (right/left), stance with eyes closed on foam, and Timed “Up & Go” Test.21 Each item was rated on a 0- to 3-point scale using the original scoring methods of the BESTest. Because the 2 bilateral items were scored separately for the right and left sides, 8 scores were recorded and the maximum points possible were 24. The sum of all scores was the total score. The percent total score was the total score as a percentage of 24 maximum points possible.
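The percent total scores described for the 3 tests all follow the same arithmetic: the total score divided by the maximum points possible (108, 28, or 24). The sketch below illustrates that calculation only. The function name, the dictionary, and the example totals are illustrative assumptions and are not taken from the study's worksheet or data.

```python
# Maximum total points for each test, as stated in the text.
MAX_POINTS = {
    "BESTest": 108,
    "Mini-BESTest": 28,
    "Brief-BESTest": 24,
}


def percent_total_score(test_name: str, total_score: float) -> float:
    """Return a total score as a percentage of the maximum points possible."""
    max_points = MAX_POINTS[test_name]
    if not 0 <= total_score <= max_points:
        raise ValueError(f"{test_name} total must be between 0 and {max_points}")
    return 100.0 * total_score / max_points


if __name__ == "__main__":
    # Hypothetical totals chosen for illustration, not participant data.
    for name, total in [("BESTest", 81), ("Mini-BESTest", 21), ("Brief-BESTest", 18)]:
        print(f"{name}: {total} points = {percent_total_score(name, total):.1f}%")
```

Expressing each total as a percentage puts the 3 tests on a common 0 to 100 scale, which makes scores easier to compare across versions with different maximum points.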

Activities-specific Balance Confidence Scale

The ABC Scale is a reliable and valid measure of balance confidence commonly used in community-dwelling older adults.36 Participants rated their confidence in balance during 16 activities on a scale from 0% (no confidence) to 100% (complete confidence). The average score of all items was the score of the ABC Scale.36

Statistical Analysis

Statistical analysis was performed using IBM SPSS Statistics Version 21 (IBM Corporation, Armonk, NY). The distributions of data were assessed using the Shapiro-Wilk test. Descriptive statistics of the demographics and health information were calculated.

Reliability

Relative reliability, including test-retest and interrater reliability of the BESTest, Mini-BESTest, and Brief-BESTest, was analyzed using the intraclass correlation coefficient, ICC(2,1), with absolute agreement, single measure, and 95% CI.29 For clinical outcome measures, an ICC value greater than 0.75 indicates good reliability and an ICC value greater than 0.90 is recommended.29 To assess the trial-to-trial consistency of scores for an individual, the absolute reliability of the BESTest, Mini-BESTest, and Brief-BESTest was assessed by the standard error of measurement (SEM) using the equation29: SEM = SD × √(1 − ICC), where SD is the pooled standard deviation of the first and second measurements and ICC is the correlation coefficient from the test-retest and interrater reliability analysis. The SEM reflects the stability of the response when an outcome measure is repeated. The more reliable the measurement, the smaller the SEM.29
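As a rough illustration of the 2 quantities named above, the following sketch computes ICC(2,1) from the mean squares of a two-way layout (participants by raters or occasions) and then applies the SEM equation. The study itself used SPSS; this pure-Python version, its function names, and the example scores are assumptions made for illustration, and it does not compute the 95% CI.

```python
import math


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per participant, each holding one score
    per rater (or per test occasion).
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of raters or occasions
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 participants and >= 2 equal-length columns")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for the two-way ANOVA without replication.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def standard_error_of_measurement(sd, icc):
    """SEM = SD x sqrt(1 - ICC), the equation given in the text."""
    if icc > 1:
        raise ValueError("ICC cannot exceed 1")
    return sd * math.sqrt(1.0 - icc)


if __name__ == "__main__":
    # Hypothetical test and retest totals for 5 people (not study data).
    test_retest = [[88, 90], [95, 94], [79, 82], [101, 100], [84, 85]]
    print(f"ICC(2,1) = {icc_2_1(test_retest):.3f}")
    # Hypothetical pooled SD of 3.0 points and ICC of 0.91.
    print(f"SEM = {standard_error_of_measurement(3.0, 0.91):.2f} points")
```

Because the absolute-agreement form includes the between-column mean square in the denominator, a systematic shift between raters or sessions lowers the coefficient, which a consistency-only ICC would not show.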

Minimal Detectable Change

To facilitate the interpretation of changes in test scores, the MDC at the 95% confidence level (MDC95) was calculated using the following equation29: MDC95 = SEM × 1.96 × √2, where SEM was calculated on the basis of the standard deviation and the ICC from the test-retest reliability analysis. The MDC is the minimum amount of change in the score of an outcome measure that is unlikely to be due to measurement error and within-subject variability.29 A change in test scores that exceeds the MDC95 reflects a true change in a person's status with 95% confidence.
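The MDC95 equation can be sketched in a few lines. The function names below are illustrative assumptions. The worked value uses the SEM of 1.26 points that this article cites from earlier work on the Mini-BESTest, which reproduces the cited MDC95 of about 3.5 points.

```python
import math


def mdc95(sem):
    """MDC95 = SEM x 1.96 x sqrt(2), the equation given in the text."""
    return sem * 1.96 * math.sqrt(2.0)


def exceeds_mdc(score_change, mdc):
    """True when the absolute change in score is larger than the MDC."""
    return abs(score_change) > mdc


if __name__ == "__main__":
    # SEM of 1.26 points is the previously published Mini-BESTest value
    # cited in this article, not a result of the present study.
    threshold = mdc95(1.26)
    print(f"MDC95 = {threshold:.1f} points")  # prints "MDC95 = 3.5 points"
    # A hypothetical 4-point improvement would exceed that threshold.
    print(exceeds_mdc(4, threshold))          # prints "True"
```

The factor √2 reflects that a change score combines the measurement error of 2 test occasions, and 1.96 is the z value for a two-sided 95% level.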

Concurrent Validity

The Spearman correlation was used to examine concurrent validity of the BESTest, Mini-BESTest, and Brief-BESTest with the ABC Scale because the scores of the ABC Scale were not normally distributed. The strength of correlation was determined using the guideline: little-none (ρ < 0.25), fair (ρ = 0.25-0.50), moderate-good (ρ = 0.50-0.75), and good-excellent (ρ > 0.75).29
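For readers unfamiliar with the Spearman correlation, the sketch below shows one common way to compute it: convert each variable to ranks (averaging tied ranks) and take the Pearson correlation of the ranks. This is an illustrative pure-Python version with hypothetical data and function names. It is not the SPSS procedure used in the study and it does not compute a P value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical balance totals and ABC Scale percentages (not study data).
    balance = [70, 82, 88, 91, 97]
    abc = [65, 60, 85, 80, 95]
    print(f"rho = {spearman_rho(balance, abc):.2f}")  # prints "rho = 0.80"
```

Working on ranks makes the coefficient insensitive to the skewed distribution of ABC Scale scores, which is the reason given above for preferring it to a Pearson correlation.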

Results

Characteristics of Participants

Five individuals who responded to the recruitment effort were excluded from participation because they did not meet the eligibility criteria. Two individuals who were eligible to participate did not return for a retest of the BESTest. The remaining 28 participants completed all parts of the study and their data were analyzed [mean (SD): time since cancer diagnosis = 6.0 (3.45) years; body mass index = 29.1 (6.04) kg/m²; number of medications taken = 5.5 (4.26); Functional Comorbidity Index = 3.1 (1.67)]. Fifteen participants had received chemotherapy for treating cancer, and 6 participants had impaired plantar sensation on the basis of the results of the Semmes-Weinstein Monofilament Test. On average, the participants walked at 1.04 (SD = 0.210) m/s at a usual pace and scored 84% (SD = 12.1%) on the ABC Scale. The mean total scores of the BESTest, Mini-BESTest, and Brief-BESTest were 90.0 (SD = 7.40), 22.0 (SD = 2.99), and 16.8 (SD = 3.48) points, respectively.

Reliability and Minimal Detectable Change

As shown in Table 1, the BESTest, Mini-BESTest, and Brief-BESTest had high test-retest reliability (ICC = 0.90-0.94; P < .001) and interrater reliability (ICC = 0.86-0.96; P < .001). The SEM and MDC95 for all tests were small. Bland-Altman plots revealed no systematic bias for all tests (Figure 1).

Figure 1: Bland-Altman plots showing the levels of agreement for the interrater and test-retest data for the BESTest (A, D), Mini-BESTest (B, E), and Brief-BESTest (C, F).

Table 1: Test-retest and Interrater Reliability, SEM, and MDC95 of the BESTest, Mini-BESTest, and Brief-BESTest
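The Bland-Altman analysis reported above rests on the differences between paired measurements. The article does not spell out the calculation, so the sketch below assumes the conventional one: the mean difference (bias) and 95% limits of agreement at the bias ± 1.96 standard deviations of the differences. The function name and the example scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return the bias and 95% limits of agreement for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)      # systematic difference between the 2 measurements
    sd_diff = stdev(diffs)  # sample SD of the paired differences
    return {
        "bias": bias,
        "sd_diff": sd_diff,
        "lower_loa": bias - 1.96 * sd_diff,
        "upper_loa": bias + 1.96 * sd_diff,
    }


if __name__ == "__main__":
    # Hypothetical totals from 2 raters scoring the same 6 people.
    rater_a = [88, 95, 79, 101, 84, 92]
    rater_b = [87, 96, 80, 100, 85, 91]
    result = bland_altman(rater_a, rater_b)
    print(f"bias = {result['bias']:.2f}")
    print(f"limits of agreement = {result['lower_loa']:.2f} to {result['upper_loa']:.2f}")
```

A bias close to 0, with differences scattered evenly inside the limits across the range of mean scores, is what is usually read as the absence of systematic error.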

Concurrent Validity

The scores of the ABC Scale were significantly correlated with the scores of the BESTest (ρ = 0.73; P < .001), Mini-BESTest (ρ = 0.52; P < .01), and Brief-BESTest (ρ = 0.81; P < .001).

Discussion

Treatment outcomes must be evaluated using tests and measures with sound reliability, validity, and responsiveness to change. This study is the first to examine the psychometric properties of clinical balance assessment tools in older cancer survivors. The BESTest, Mini-BESTest, and Brief-BESTest showed good to excellent test-retest and interrater reliability without systematic errors and moderate to excellent concurrent validity with the ABC Scale (ρ = 0.52-0.81) for community-dwelling cancer survivors aged 55 years and older. Balance problems in cancer survivors are complex because adverse outcomes following the cancer diagnosis can vary depending on the underlying pathology and the cancer treatments.6,37 Current findings support the use of the BESTest and its short versions to identify specific impairments within various domains of balance control in older cancer survivors living in the community.

The high test-retest and interrater reliability of the BESTest, Mini-BESTest, and Brief-BESTest is in accordance with previous findings. Comparable levels of test-retest reliability (ICC = 0.88-0.91) and interrater reliability (ICC = 0.96) for the BESTest were reported in patients with Parkinson's disease.23,38 Excellent test-retest reliability (ICC = 0.92-0.97) and interrater reliability (ICC = 0.91-0.98) for the Mini-BESTest were also documented in patients with Parkinson's disease,38 stroke,24 and various balance disorders.22 For the Brief-BESTest, excellent interrater reliability (ICC = 0.99) was found in patients with and without a neurologic diagnosis.21 Current findings indicate that the BESTest, Mini-BESTest, and Brief-BESTest can be administered consistently in older cancer survivors. Previous studies have reported the SEM (1.26 points or 4.5% of total score) and MDC95 (3.5 points or 12.5% of total score) for the Mini-BESTest in patients with various medical diagnoses.22 This study also found small and comparable values of SEM and MDC95 for the Mini-BESTest. This study is the first to report the values of SEM and MDC95 for the BESTest and Brief-BESTest.

The moderate to excellent correlations (ρ = 0.52-0.81) between the ABC Scale and the BESTest, Mini-BESTest, and Brief-BESTest support the concurrent validity of these balance tests in older cancer survivors. A significant correlation between the ABC Scale and the BESTest was previously reported in healthy individuals and in persons with various medical diagnoses (ρ = 0.64).19 Therefore, the BESTest, Mini-BESTest, and Brief-BESTest measure significant components of balance control that are related to perceived balance confidence in daily activities in older cancer survivors.

This study recruited community-dwelling older breast or prostate cancer survivors and did not include other subgroups, for example, persons who were undergoing or had recently finished cancer treatment, had other cancer diagnoses, had cancer recurrence or metastasis, were at an advanced stage of cancer, or were frail or functionally dependent. The impact of various cancer diagnoses, treatments, and sequelae on balance was beyond the scope of this pilot study. Current results may be most applicable to breast and prostate cancer survivors living in the community. These limitations should be considered when applying the study results. In addition, the results of the Mini-BESTest and Brief-BESTest were derived from the scores of the BESTest. If the Mini-BESTest and Brief-BESTest are administered separately from the BESTest, their psychometric properties may differ from those reported in this study. Lastly, future studies should examine the concurrent validity of the BESTest and its short versions using performance-based balance tests, and identify their predictive values for fall risks in older cancer survivors.

Conclusion

The BESTest, Mini-BESTest, and Brief-BESTest are reliable and valid balance tests for community-dwelling older breast and prostate cancer survivors who have completed cancer treatment for at least 3 months. Clinicians can utilize the established MDC for the tests to assess the intervention outcomes. The assessment results of these balance tests can direct the treatment to target specific balance impairments in older cancer survivors.

Acknowledgments

The authors thank Dr Cindy Pfalzer for her advice on the project conceptualization and Dr Alex Borja and Dr Tracy Sweeney for their assistance with recruitment of participants and data collection.

References

1. History of NCCS. National Coalition for Cancer Survivorship website. Accessed September 22, 2014.
2. American Cancer Society. Cancer facts & figures 2014. Accessed September 22, 2014.
3. Parry C, Kent EE, Mariotto AB, Alfano CM, Rowland JH. Cancer survivors: a booming population. Cancer Epidemiol Biomarkers Prev. 2011;20(10):1996–2005.
4. Stubblefield MD, Schmitz KH, Ness KK. Physical functioning and rehabilitation for the cancer survivor. Semin Oncol. 2013;40(6):784–795.
5. Schmitz KH, Courneya KS, Matthews C, et al. American College of Sports Medicine roundtable on exercise guidelines for cancer survivors. Med Sci Sports Exerc. 2010;42(7):1409–1426.
6. Siegel R, DeSantis C, Virgo K, et al. Cancer treatment and survivorship statistics, 2012. CA Cancer J Clin. 2012;62(4):220–241.
7. Tofthagen C, Overcash J, Kip K. Falls in persons with chemotherapy-induced peripheral neuropathy. Support Care Cancer. 2012;20(3):583–589.
8. Winters-Stone KM, Torgrimson B, Horak F, et al. Identifying factors associated with falls in postmenopausal breast cancer survivors: a multi-disciplinary approach. Arch Phys Med Rehabil. 2011;92(4):646–652.
9. Kesler SR. Default mode network as a potential biomarker of chemotherapy-related brain injury. Neurobiol Aging. 2014;35(suppl 2):S11–S19.
10. Ward PR, Wong MD, Moore R, Naeim A. Fall-related injuries in elderly cancer patients treated with neurotoxic chemotherapy: a retrospective cohort study. J Geriatr Oncol. 2014;5(1):57–64.
11. Wampler MA, Topp KS, Miaskowski C, Byl NN, Rugo HS, Hamel K. Quantitative and clinical description of postural instability in women with breast cancer treated with taxane chemotherapy. Arch Phys Med Rehabil. 2007;88(8):1002–1008.
12. Cheville AL, Beck LA, Petersen TL, Marks RS, Gamble GL. The detection and treatment of cancer-related functional problems in an outpatient setting. Support Care Cancer. 2009;17(1):61–67.
13. Hile ES, Fitzgerald GK, Studenski SA. Persistent mobility disability after neurotoxic chemotherapy. Phys Ther. 2010;90(11):1649–1657.
14. Chen T-Y, Janke MC. Predictors of falls among community-dwelling older adults with cancer: results from the health and retirement study. Support Care Cancer. 2014;22(2):479–485.
15. Schlairet MC, Benton MJ. Quality of life and perceived educational needs among older cancer survivors. J Cancer Educ. 2012;27(1):21–26.
16. Huang MH, Lytle T, Miller KA, Smith K, Fredrickson K. History of falls, balance performance, and quality of life in older cancer survivors. Gait Posture. 2014;40(3):451–456.
17. Spoelstra S, Given B, von Eye A, Given C. Fall risk in community-dwelling elderly cancer survivors: a predictive model for gerontological nurses. J Gerontol Nurs. 2010;36(2):52–60.
18. Horak FB. Postural orientation and equilibrium: what do we need to know about neural control of balance to prevent falls? Age Ageing. 2006;35(suppl 2):ii7–ii11.
19. Horak FB, Wrisley DM, Frank J. The Balance Evaluation Systems Test (BESTest) to differentiate balance deficits. Phys Ther. 2009;89(5):484–498.
20. Franchignoni F, Horak F, Godi M, Nardone A, Giordano A. Using psychometric techniques to improve the Balance Evaluation Systems Test: the mini-BESTest. J Rehabil Med. 2010;42(4):323–331.
21. Padgett PK, Jacobs JV, Kasser SL. Is the BESTest at its best? A suggested brief version based on interrater reliability, validity, internal consistency, and theoretical construct. Phys Ther. 2012;92(9):1197–1207.
22. Godi M, Franchignoni F, Caligari M, Giordano A, Turcato AM, Nardone A. Comparison of reliability, validity, and responsiveness of the mini-BESTest and Berg Balance Scale in patients with balance disorders. Phys Ther. 2013;93(2):158–167.
23. Leddy AL, Crowner BE, Earhart GM. Functional gait assessment and Balance Evaluation System Test: reliability, validity, sensitivity, and specificity for identifying individuals with Parkinson disease who fall. Phys Ther. 2011;91(1):102–113.
24. Tsang CSL, Liao L-R, Chung RCK, Pang MYC. Psychometric properties of the Mini-Balance Evaluation Systems Test (Mini-BESTest) in community-dwelling individuals with chronic stroke. Phys Ther. 2013;93(8):1102–1115.
25. Lord SR, Menz HB. Visual contributions to postural stability in older adults. Gerontology. 2000;46(6):306–310.
26. Chen TY, Peronto CL, Edwards JD. Cognitive function as a prospective predictor of falls. J Gerontol B Psychol Sci Soc Sci. 2012;67(6):720–728.
27. Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies. Stat Med. 1998;17(1):101–110.
28. de Vet HCW, Terwee CB, Knol DL, Bouter LM. When to use agreement versus reliability measures. J Clin Epidemiol. 2006;59(10):1033–1039.
29. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, NJ: Prentice-Hall Inc; 2009.
30. Faul F, Erdfelder E, Buchner A, Lang AG. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behav Res Methods. 2009;41(4):1149–1160.
31. Borson S, Scanlan J, Brush M, Vitaliano P, Dokmak A. The mini-cog: a cognitive “vital signs” measure for dementia screening in multi-lingual elderly. Int J Geriatr Psychiatry. 2000;15(11):1021–1027.
32. Borson S, Scanlan JM, Chen P, Ganguli M. The Mini-Cog as a screen for dementia: validation in a population-based sample. J Am Geriatr Soc. 2003;51(10):1451–1454.
33. Groll DL, To T, Bombardier C, Wright JG. The development of a comorbidity index with physical function as the outcome. J Clin Epidemiol. 2005;58(6):595–602.
34. Bakker K, Apelqvist J, Schaper NC. Practical guidelines on the management and prevention of the diabetic foot 2011. Diabetes Metab Res Rev. 2012;28:225–231.
35. Peters DM, Fritz SL, Krotish DE. Assessing the reliability and validity of a shorter walk test compared with the 10-Meter Walk Test for measurements of gait speed in healthy, older adults. J Geriatr Phys Ther. 2013;36(1):24–30.
36. Powell LE, Myers AM. The Activities-specific Balance Confidence (ABC) Scale. J Gerontol A Biol Sci Med Sci. 1995;50A(1):M28–M34.
37. Schmitz KH, Courneya KS, Matthews C, et al. American College of Sports Medicine roundtable on exercise guidelines for cancer survivors. Med Sci Sports Exerc. 2010;42(7):1409–1426.
38. Leddy AL, Crowner BE, Earhart GM. Utility of the Mini-BESTest, BESTest, and BESTest sections for balance assessments in individuals with Parkinson disease. J Neurol Phys Ther. 2011;35(2):90–97.

Keywords: minimal detectable change; older cancer survivors; postural balance; reliability; validity

Copyright © 2016 the Section on Geriatrics of the American Physical Therapy Association