Volkman, Kathleen G. MS, PT; Stergiou, Nicholas PhD; Stuberg, Wayne PhD, PT, PCS; Blanke, Daniel PhD; Stoner, Julie PhD
INTRODUCTION AND PURPOSE
The Functional Reach Test (FRT) measures dynamic balance in a functional context of reaching forward.1 In the FRT, forward displacement occurs at a self-induced velocity while controlling the moving center of mass (COM). The FRT is a test of balance control; therefore, environmental, task and biomechanical constraints must be considered.2 For example, the coordination of muscle patterns during reaching is a consequence of the biomechanics of the reach task. Changes in task performance such as trunk rotation or protraction of the shoulder blades during reaching might affect test results. In addition, the FRT may be affected by variables such as the testing procedures and the characteristics of the subjects.
With regard to psychometric properties, the FRT has been found to be both valid and reliable in adults.1,3 In pediatric populations, the FRT has been proposed as a discriminative test and possibly as a diagnostic test to document feed-forward mechanisms of postural control.4 Although the literature shows evidence of validity of the FRT in children with typical development (TD) and children with balance impairment,5–7 its use as an evaluative tool has not been recommended in these populations because of historically poor to fair test–retest values.4
Fair test–retest reliability (intraclass correlation coefficient [ICC] = 0.75) has been established in children with TD using the traditional testing protocol.8 Published data by Volkman et al9 show test–retest reliability was improved in children with TD (ICC = 0.97–0.98) by changing the traditional method of measuring from the starting position of the hand to an alternate method of measuring from the tips of the great toes. Moreover, limits of agreement analysis showed that a change in the biomechanics of reaching with 2 arms forward, rather than 1 arm, combined with measuring reach from the toes was optimal among 4 methods tested. Therefore, improved reliability statistics were obtained by altering the biomechanics of reach and the method of measuring. Reaching with the 2-arm method moves the COM forward and decreases trunk rotation. Therefore, it was hypothesized that other task or subject variables could also affect FRT scores measured with alternate protocols and that these effects should be compared among the 4 test protocols.
Researchers have significantly correlated subject characteristics such as age and height with FRT scores in TD children. Donahoe et al8 found that 38% of the variance present in FRT scores was due to age but the addition of other anthropometric variables failed to explain more variance. Habib et al10 studied FRT in Pakistani children and found that age accounted for 17% of variance in scores and that height, weight, and base of support (BOS) also accounted for an additional 15% of variance. With regard to lower extremity strategy, Wernick-Robinson et al11 reported the strategies used by adults with and without vestibular deficits. The strategies were a “hip” strategy defined as at least 20° of hip flexion and 5° of ankle plantarflexion and “other” strategies including (1) an ankle plantarflexion strategy defined as greater than 5° ankle plantarflexion and less than 20° of hip flexion, (2) a trunk rotation strategy, and (3) a squat strategy using hip, knee, and ankle flexion. The FRT was studied using kinematic and kinetic analysis and did not follow the normal protocol of reaching along a measuring stick. The results showed no significant differences in scores between the hip strategy group and the other strategies group. There was also no significant difference in the moment arm and center of gravity displacement between the 2 groups. Although most subjects chose the hip strategy, no single characteristic of age, height, or diagnosis group described the use of other strategies. The authors concluded that the use of the hip strategy or trunk rotation strategy resulted in a low moment arm in the healthy elderly group during FRT. They suggested that this population may be less likely to approach their limits of stability than a younger group.
Given the variability of the strategies used in these studies of adults and children, the lower extremity strategy in the current study was designed to be a dichotomous variable chosen by the subjects but performed consistently from 1 test to the next for reliability purposes. The 2 strategies were (1) a “heels-down” strategy using hip flexion and heels down on the floor and (2) a “heels-up” strategy using hip flexion and heels lifted. It was proposed that the age of the children and the strategy chosen might affect the FRT scores differently depending on the protocol used. Therefore, using the traditional protocol and the alternate protocols previously established by Volkman et al,9 our first purpose was to study how FRT scores were affected by variables of method of reach (1-arm or 2-arm) and method of measurement (from finger-to-finger or from toes-to-finger). Our second purpose was to analyze the effects of subject characteristics (age, gender, height, size of BOS, and self-selected lower extremity strategy) on FRT scores when using the various protocols and to obtain normative data based on these characteristics. Our third purpose was to correlate the 1-arm style of reach with the 2-arm style as a type of criterion validity for the alternate protocol. It was hypothesized that 2-arm scores would be lower than 1-arm scores, that scores would increase with age and height, and that scores in 15- to 16-year-old adolescents who are nearing adult height would approximate published adult values. It was not clear how gender, lower extremity strategy, or BOS would affect scores of the alternate methods of the FRT. In addition, it was hypothesized that if the 1-arm and 2-arm FRT methods measured similar properties of balance, the scores would be well correlated.
Eighty children with TD (40 male, 40 female) were recruited by personal contact or a letter to the parents of prospective subjects in an urban and suburban metropolitan area. Based on the literature,8 subjects were divided into 3 age groups: 7 to 8 years, 11 to 12 years, and 15 to 16 years (Table 1) to capture data through a wide range of school-aged children and adolescents. This study was approved by the IRB committee at the University of Nebraska Medical Center and informed consent was obtained from parents and subjects before participation. A parent questionnaire requesting each child's demographic information, health conditions, and current medical treatment was used to screen the subjects. Exclusion criteria included recent history of orthopedic or neurological injury or disease, current school physical therapy treatment, and lack of active ankle range of motion in standing (subject could not lift the metatarsals off the floor during dorsiflexion in erect stance). Sixty-five of the 80 subjects were retested on the same day or within 2 weeks for reliability purposes and 4 subjects were retested within 8 weeks because of scheduling difficulties. A total of 69 subjects were retested.
The primary purpose of this study was to analyze the effects of specific variables on FRT scores using previously studied alternate protocols. A 2 × 2 factorial design was used to investigate the effect of the style of reach (1-arm or 2-arm) and the effect of measurement method (finger-to-finger or toes-to-finger). An interaction between the style of reach and measurement method was investigated. Reach measurements were made on each subject under each of the 4 possible combinations among the factor levels: 1-arm finger-to-finger (1AFF), 1-arm toes-to-finger (1ATF), 2-arm finger-to-finger (2AFF), and 2-arm toes-to-finger (2ATF). The subjects selected a heels-down or heels-up strategy and repeated it for all trials. The effects of subject-level characteristics on reach measurements, including age, height, gender, strategy, and BOS, as well as session-level characteristics including measurement order and session number, were also investigated.
Age data were analyzed by age group. Height and BOS data were further divided into quartiles for analysis to account for nonlinear associations with reach scores. Gender and lower extremity strategy were dichotomous variables. Internal validity was protected by holding constant the testing surface, the design of the apparatus, the choice of measurement tools, and the consistency of instructions provided to subjects. A learning effect was minimized by the provision of only 1 practice trial and 3 measurement trials for each style of reaching. An order effect was minimized by alternately reversing the test order in half of the subjects (1-arm test first or 2-arm test first).
The FRT protocol is clearly defined in the literature1 and consists of a subject reaching along a measuring stick with 1 arm extended with the hand in a fist, as far forward as possible without losing balance or taking a step. In this study, the positioning of the hand for measurement was changed from a fist to a pointed index finger to permit more consistent measurement of the end point of the reach. Furthermore, an alternate protocol for reaching with both arms was developed and it was compared with the traditional 1-arm FRT. The apparatus used for the testing can be seen in Figure 1. The apparatus was aligned to the vertical and horizontal planes using a box level.
Before the FRT measurements, the subject's height was measured to the nearest 0.10 cm using a metric measuring tape fixed to the wall. Then each subject stood with bare feet a comfortable width apart on a sheet of paper taped to the floor. The meter stick was located at shoulder level and attached to the frame with clips. The end was aligned with the tips of the great toes by using a carpenter's square, which is a linear measuring device with a 90° angle. First, the apparatus was aligned to the vertical arm of the square by laying 1 arm of the square on the apparatus leg and the other on the floor. After this alignment, the carpenter's square was then placed against the frame at the level of the meter stick. The horizontal distance from the frame to the end of the meter stick was measured. Then the same distance was used to measure the placement of the paper from the frame on the floor. Once the paper was secured, the tips of the great toes were brought to the edge of the paper. This was the 0 point for the toes-to-finger reach score. Because the paper was taped to the floor, the front edge remained aligned with the end of the measuring stick above. The feet were then traced with a pen to ensure the same foot placement for each trial and to allow measurements of the BOS area (length × width of stance). Only hard floor surfaces were used under the paper during testing. The dominant writing hand was chosen for the reaching arm and this arm was placed closest to the measurement device during the tests.
Verbal instruction and demonstration of positioning were provided before testing. Subjects were instructed to choose 1 of 2 strategies when performing all the reach tests: (1) bending at the hips with heels-down, or (2) bending at the hips with heels-up. The subjects selected a specific strategy and repeated it for all trials. One practice trial of each style of reach was allowed. During the testing, the subjects were told to stand straight and to raise the arm/arms so the investigator could obtain the starting position measurement. Then they were told to reach as far as possible without taking a step or falling. The subjects were given verbal encouragement and reassurance that the investigator would be close by for safety.
To obtain the 1-arm FRT score, the subject leaned forward to reach (Fig. 1) and held the reaching position for approximately 3 seconds while it was measured to the nearest 0.20 cm. To obtain the 2-arm FRT score, the starting and the reaching positions were measured in a similar manner to the 1-arm reach. The 2 arms were extended forward at 90° of shoulder flexion with the hands clasped and the index fingers extended together (Fig. 2). The reach measurement was obtained at the tip of the longer index finger. Following the 3 trials for each method, the trial with the greatest score was used as each subject's FRT score since it was noted from the data that the best scores were distributed among the 3 trials. In addition to the 1-arm and 2-arm reach, an alternate method of calculating the FRT score was explored by measuring from toes-to-finger. This score was calculated from the final reach measurement obtained in the finger-to-finger method without additional trials. The method was applied to the 1-arm and the 2-arm reach tests resulting in a total of 4 measurements in the analysis.
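The scoring rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trial` class, the function names, and the readings are hypothetical, and the sketch assumes that the zero end of the meter stick sits directly above the tips of the great toes, so that the fingertip reading at full reach is itself the toes-to-finger score.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    start_cm: float  # fingertip reading on the meter stick in erect stance
    end_cm: float    # fingertip reading at full forward reach


def finger_to_finger(trial: Trial) -> float:
    """Traditional score: displacement of the fingertip from its start position."""
    return trial.end_cm - trial.start_cm


def toes_to_finger(trial: Trial) -> float:
    """Alternate score: fingertip position relative to the toes.

    Assumes the stick's zero point is aligned with the tips of the great toes.
    """
    return trial.end_cm


def best_score(trials, scorer) -> float:
    """Return the greatest score among the measurement trials."""
    return max(scorer(t) for t in trials)


# Hypothetical readings for 3 measurement trials of one style of reach
trials = [Trial(48.0, 78.4), Trial(47.6, 80.2), Trial(48.2, 79.0)]
ff_score = best_score(trials, finger_to_finger)
tf_score = best_score(trials, toes_to_finger)
print(f"finger-to-finger: {ff_score:.1f} cm, toes-to-finger: {tf_score:.1f} cm")
```

Because both scores come from the same trials, the toes-to-finger value requires no additional reaching by the subject; it simply ignores the starting position of the hand.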
This manuscript summarizes secondary analyses of factors associated with FRT measurements. The target sample size of 80 provided 80% power to detect standardized effect sizes of at least 0.39, a “small to medium” effect12 when comparing mean measurements within a subject from the 4 methods (assuming a 2-sided 0.008 alpha level after a Bonferroni adjustment for pair-wise comparisons among the 4 reach methods). Statistical analyses were performed using SPSS 11.5 (SPSS, Chicago, Ill) and SAS 8.02 (SAS Institute, Cary, NC) software. Means and standard deviation (SD) values were reported for demographic variables. The correlation between pairs of subject characteristics including age, height, and BOS was quantified using a Pearson's correlation coefficient and mean height, age, and BOS were compared between gender and strategy (heels-up/heels-down) groups using an independent 2 sample t test. All statistical tests were evaluated at the 0.05 alpha level.
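The sample size statement can be checked approximately with a short calculation. The sketch below is not the authors' computation: it assumes a paired comparison treated with a normal approximation rather than a t distribution, and the function name `paired_power` is hypothetical. The adjusted alpha is taken as 0.05 divided by the 6 pair-wise comparisons among the 4 methods.

```python
from math import sqrt
from statistics import NormalDist


def paired_power(effect_size: float, n: int, alpha: float) -> float:
    """Approximate 2-sided power for a within-subject mean difference.

    Uses a normal approximation; effect_size is the standardized
    mean difference between 2 methods measured on the same subjects.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


n_methods = 4
n_pairs = n_methods * (n_methods - 1) // 2  # pair-wise comparisons among 4 methods
alpha_adj = 0.05 / n_pairs                  # Bonferroni-adjusted alpha, about 0.008
power = paired_power(effect_size=0.39, n=80, alpha=alpha_adj)
print(f"{n_pairs} comparisons, alpha = {alpha_adj:.4f}, power = {power:.2f}")
```

Under these assumptions the result is close to the 80% power reported for a standardized effect size of 0.39; an exact t-based calculation would be expected to give a slightly lower value.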
The first purpose of this article, to investigate the effect of style of reach and method of measurement on reach scores, was evaluated using a repeated measures ANOVA method to estimate the effect of the style of reach (1-arm or 2-arm), the effect of the measurement method (finger-to-finger or toes-to-finger), and their interaction by fitting a fixed style of reach factor with 2 levels, a fixed measurement method factor with 2 levels, and a fixed interaction term. The repeated measures ANOVA was used to account for the correlation among repeated observations taken on the same subject (reach measurements were made on each subject under the 4 different methods where the measurements were repeated a second time for most subjects). A compound symmetric correlation structure was assumed since most repeated observations were made in a single day. To interpret the significant interaction between the style of reach and the measurement method, a single method factor with 4 levels (1AFF, 2AFF, 1ATF, and 2ATF) was fit and post hoc multiple pair-wise comparisons among the mean reach scores for the 4 methods were made using Tukey's method.
The effect of measurement method on FRT scores was investigated first. Then, in situations where there was a significant interaction between subject characteristics and FRT method (indicating that the effect of reach method differed depending on particular subject characteristics), subgroup analyses were performed. Possible interactions between measurement method and subject characteristics were considered separately for the 5 variables of age, gender, height, strategy, and BOS. When a significant interaction was identified between subject characteristic and FRT method, the effect of the subject characteristic on the reach score was investigated separately within each of the 4 FRT method groups.
The second purpose was to study the effects of subject characteristics on FRT scores according to the 4 methods and to obtain normative data from this sample and categorize it by subject characteristic. Least squares mean and standard error values, which were estimated using the repeated measures ANOVA models to account for the correlation among repeated measures on the same subject, are presented according to height categories (Table 2). A repeated measures ANOVA method was used to compare mean reach scores for the 4 methods among groups defined by subject characteristics (age category, height quartile, gender, strategy, and BOS area quartile), between measurement order (1-arm first or 2-arm first), and between practice sessions (first or second measure). Height and BOS were categorized using quartiles of the observed distributions because height and BOS were nonlinearly related to functional reach measures. ANOVA models included main effects of subject and session characteristics, main effects of the reach method, and 2-way interactions between the subject or session factors and the reach method factor. Subject characteristics were first investigated separately because of the large number of characteristics investigated and the limited sample size. Using age as an example, an ANOVA model was fit that included age (3 levels), reach method (4 levels), and an age by reach method interaction. An interaction with reach method was observed for a number of subject characteristics, as described in the Results section, so the effect of the subject and session characteristics on the reach score was investigated separately within each of the 4 FRT method groups. Age, height, and BOS area were highly correlated. Therefore, to investigate the association between reach scores and subject characteristics, the size factor most highly associated with reach measurement, height, was investigated further. 
An ANOVA model that included a height factor (4 levels), strategy factor (2 levels), gender factor (2 levels), order factor (2 levels), and session number factor (2 levels) was fit separately for each reach method to investigate the association between reach scores and subject or test session characteristics.
The third purpose of correlating the 2AFF data with the 1AFF data was evaluated using a Pearson correlation coefficient to establish criterion validity of the 2-arm method using data only from the first set of measurements.
How FRT Scores Are Affected by Style of Reach and Method of Measurement
Mean reach scores were compared among the 4 FRT methods. Summarizing across height, gender and strategy categories, the least squares mean ± standard error values for the reach scores were 30.92 ± 0.80 cm for the 1AFF approach, 82.32 ± 1.38 cm for the 1ATF approach, 31.08 ± 0.79 cm for the 2AFF approach, and 76.02 ± 1.28 cm for the 2ATF approach. There was a significant interaction between style of reach (1-arm or 2-arm) and measurement method (finger-to-finger or toes-to-finger) (F = 78.08, df = 1, 79, p < 0.0001). Therefore, the measurement method was considered as a single treatment variable with 4 levels (1AFF, 1ATF, 2AFF, 2ATF). After adjusting for multiple comparisons, the 1ATF mean was significantly greater than the 2ATF mean (p < 0.001) by about 6 to 7 cm. A similar difference between the 1AFF mean and the 2AFF mean was not evident in the scores (Table 2). All other pair-wise differences were significant after adjustment for the pair-wise comparisons (p < 0.0001).
Mean age and mean height did not differ significantly between males and females (p = 0.5), but mean BOS area was significantly greater for males than females (p = 0.02). Height was significantly correlated with age (r = 0.92, p < 0.0001) and BOS area (r = 0.83, p < 0.0001), and age was significantly correlated with BOS area (r = 0.78, p < 0.0001). The age groups were found to be linearly distributed throughout the height quartiles. For example, 100% of the subjects in the shortest height quartile (<130.2 cm) were 7 to 8 years old and in the 130.2 to 148 cm quartile, 45% were 7 to 8 years old, and 55% were 11 to 12 years old. The use of the 2 strategies was roughly balanced among age groups. Mean age did not differ significantly between subjects who chose the heels-up strategy and those who chose the heels-down strategy (p > 0.9), nor did mean height (p > 0.9) or mean BOS area (p = 0.5).
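For readers who wish to place a child into the height categories used in this report, the grouping can be sketched as follows. The cut points are those reported for this sample (heights were recorded to the nearest 0.1 cm); the function name `height_group`, the label strings, and the ">168.5 cm" label for the tallest group are assumptions made for illustration, and other samples may call for different categories.

```python
# Height category labels; the upper bound of the tallest group is not
# stated in the text, so ">168.5 cm" is an inferred label.
HEIGHT_LABELS = ["<130.2 cm", "130.2-148.0 cm", "148.1-168.5 cm", ">168.5 cm"]


def height_group(height_cm: float) -> int:
    """Return the index (0 = shortest) of the height category for one subject."""
    h = round(height_cm, 1)  # heights were measured to the nearest 0.1 cm
    if h < 130.2:
        return 0
    if h <= 148.0:
        return 1
    if h <= 168.5:
        return 2
    return 3


for example_height in (125.0, 148.1, 172.3):  # hypothetical heights
    print(example_height, "->", HEIGHT_LABELS[height_group(example_height)])
```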
How FRT Scores Are Affected by Subject and Session Characteristics
Descriptively, mean scores according to method increased significantly with height except for the 2AFF method in the 2 tallest categories (Table 2). Based on tests of interactions, the effect of method differed significantly across age groups (F = 77.58, df = 6, 231, p < 0.0001), across height categories (F = 51.79, df = 9, 228, p < 0.0001), and across BOS area categories (F = 35.26, df = 9, 228, p < 0.0001), but did not differ according to gender (F = 0.79, df = 3, 234, p = 0.5), heels-up/heels-down strategy (F = 0.34, df = 3, 6, p = 0.8), measurement order (1-arm first or 2-arm first, F = 0.47, df = 3, 181, p = 0.7), or session (first or second session, F = 0.58, df = 3, 204, p = 0.6). Given the interaction between method and subject characteristics of age, height, and BOS, the effect of subject and session characteristics on reach scores was investigated separately for each method. Age, height, and BOS area were highly correlated. To address issues of collinearity, subsequent modeling focused only on height as a subject size factor, because height was most highly associated with reach scores. Akaike's information criterion from a repeated measures ANOVA model using measurements from all reach scores was 5473.8 for height compared with 5487.4 for age and 5518.2 for BOS, where lower numbers indicate better model fit. Age and BOS area were not considered further.
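The model comparison can be made explicit by expressing each Akaike's information criterion value as a difference from the best-fitting model. The sketch below uses only the 3 values reported above; the variable names are hypothetical, and no threshold for a meaningful difference is implied.

```python
# Akaike's information criterion for models using each size factor
aic = {"height": 5473.8, "age": 5487.4, "BOS area": 5518.2}

# Lower values indicate better fit, so the preferred factor has the minimum AIC
best = min(aic, key=aic.get)

# Difference of each model from the best-fitting one
delta = {name: round(value - aic[best], 1) for name, value in aic.items()}

for name, diff in sorted(delta.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {aic[name]}, difference from best = {diff}")
```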
Under the 1AFF method, mean reach differed significantly according to height (p < 0.0001), but not according to gender (p = 0.9) or strategy (p = 0.3). Mean reach scores increased across the height groups where all pair-wise comparisons were significant (p < 0.04 for each) except for the comparison between the 2 tallest groups which was marginal (p = 0.06). Mean scores did not differ significantly depending on order (p = 0.70), but were higher on average by 1.03 cm at the initial session compared with the second (p = 0.05).
Under the 1ATF method, mean reach differed significantly according to height (p < 0.0001), but not according to gender (p = 0.6) or strategy (p = 0.09). Mean reach scores increased across the height groups where all pair-wise comparisons were significant (p < 0.0001 for each). Mean scores did not differ significantly with respect to order (p = 0.9) or session (p = 0.1).
Under the 2AFF method, mean reach differed significantly according to height (p < 0.0001), but not according to gender (p = 0.9) or strategy (p = 0.3). Mean reach scores increased across the height groups where all pair-wise comparisons were significant (p < 0.003 for each) except for the comparison between the 2 tallest groups which was not significant (p = 0.9). Mean scores did not differ significantly with respect to order (p = 0.2) or session (p = 0.7).
Under the 2ATF method, mean reach differed significantly according to height (p < 0.0001), but not according to gender (p = 0.2) or strategy (p = 0.2). Mean reach scores increased across the height groups where all pair-wise comparisons were significant (p < 0.0001 for each). Mean scores were significantly higher by 0.78 cm when the 2-arm approach was the second approach in a session (p = 0.01) but did not differ with respect to session (p = 0.8).
How 1-Arm and 2-Arm FRT Scores are Correlated
There was a high correlation between the 1AFF and 2AFF reach scores (Pearson r = 0.84, p < 0.001).
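The reported coefficient can be converted to the proportion of variance shared by the 2 methods by squaring it. The short sketch below shows that arithmetic; the variable names are hypothetical.

```python
r = 0.84                          # Pearson correlation between 1AFF and 2AFF scores
r_squared = r ** 2                # coefficient of determination
shared_pct = round(100 * r_squared)
print(f"r = {r}, r squared = {r_squared:.2f}, shared variance = {shared_pct}%")
```

This is the source of the 71% figure for shared variance discussed later in the article.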
Our first purpose was to investigate the effect of style of reach and method of measurement on reach scores. The significant interaction between the style of reach and the measurement method confirmed the hypotheses that changing the biomechanics and measurement method would significantly affect FRT scores. The 2-arm method was proposed to increase the difficulty of the test while decreasing the variability of the trunk between tests, and thus improve reliability. Use of the 2-arm reach resulted in shorter FRT scores by 5 to 7 cm in the 2ATF as expected (Table 2). However, the finger-to-finger methods do not reflect this difference (p = 0.9). Volkman et al9 calculated the starting position of the hand in the finger-to-finger protocols and found a similar amount of difference between 1AFF and 2AFF scores indicating a backward shift of the COM which was not accounted for in the scores. It was, therefore, evident that a change in the biomechanics of reach might not be reflected by the FRT score using the traditional starting position of the hand. Using the toes-to-finger method eliminated the variable of sway which apparently occurred when moving the arms forward.
Therefore, given the improved reliability of toes-to-finger methods9 and the increased demand of more body mass forward, it is suggested that the 2ATF method may be a plausible alternative meriting further study. It would be helpful to verify the amount of backward sway and the change in the center of pressure which occurs during the 2-arm method in youth. Although our purpose was not to study the validity of the FRT, increasing the challenge of the test could improve its psychometric properties.
Our second purpose was to study the effects of subject characteristics on FRT scores according to the 4 methods and to categorize the normative data from this sample by subject characteristic. Age, height, and BOS were significantly correlated. This is expected due to growth in children. Although increased height is not correlated with increased FRT scores in adults,1 Habib et al10 found height to be an important factor in FRT scores of Pakistani children in contrast to Donahoe et al8 who found that age alone was the main factor influencing FRT in US children. The difference in the current study could be explained by the larger height range of subjects (44.6 cm) compared with the study by Donahoe (31.7 cm) despite a similar age range. Additionally, the current study had 15 more subjects in the oldest age group, who had more variability in their heights.
Although the method of reach interacted significantly with height, age, and BOS, further analysis considered only height because of confounding effects of variables. The FRT scores increased along with height under the 4 methods as hypothesized. Under the toes-to-finger methods, each height quartile showed significant differences in FRT scores. Therefore, the toes-to-finger methods could be used with scores categorized according to height groups while also demonstrating improved reliability. Study on other samples of children could further delineate the appropriate height categories.
The heights of this sample were comparable to mean heights-by-age government data on children from 1999 to 2002.1 Because the age groups in this study were linearly distributed throughout the height categories, the 1AFF mean scores under the height groups were compared with the means by age group published by Donahoe et al.8 In the current study, the shortest group had a mean reach slightly less than the published 24.21 cm reach of 7- to 8-year-old children. The 148.1- to 168.5-cm height group (aged 11 to 16 years) had a mean reach slightly more than that of 11- to 12-year-old children (32.79 cm) and that of 13- to 15-year-old adolescents (32.30 cm). The tallest group (both male and female) had a mean reach of 37.38 cm, which is between the published values of 36.60 cm for adult females and 41.83 cm for adult males.1 Although previous published studies1,8,10,13 have reported FRT scores by age groups, our results imply that height, rather than age, may be more advantageous when using the FRT for discriminative purposes in children. This effect could be significant among youth with medical conditions which affect their growth.
In the height groups, gender was fairly well balanced except for the tallest group where there were more males. Because there was not a difference between male and female scores in the various height categories, regardless of the method, we conclude that gender did not have a significant effect on these subjects' reach scores. This finding is in agreement with other published data.8,10 Categorizing FRT scores by height groups may be preferred since mean height can differ significantly between adolescent males and females (∼13 cm at age 16 years).14 The use of height groups to categorize scores would avoid this disparity.
With regard to lower extremity strategy, subjects did not choose their strategy based on age or height. Use of heels-up or heels-down strategies did not significantly affect scores under any method, though it seems intuitive that heels-up should be a more difficult position. A possible explanation is that if children view a strategy as more difficult, some may choose the easier strategy to try to get the furthest reach or they may not reach as far to the limits of stability during the more challenging test. Further study on intrasubject differences using the 2 strategies in a larger sample is needed.
Our third purpose was to correlate the 1-arm and 2-arm protocols. The hypothesis that 1-arm and 2-arm reach methods would be correlated was supported by the data. Seventy-one percent of the variance in scores was shared between the 1-arm and 2-arm finger-to-finger methods (r = 0.84, r² = 0.71). This further supports the psychometric properties of the 2-arm method, given the slightly better reliability of the 2ATF compared with the 1ATF.9
Finally, our results and discussion should be viewed in light of the limitations, including sample size and the use of a sample of convenience. A larger sample would have permitted additional analyses of the variables' effects on FRT.
In summary, this study analyzed the effects of several variables on FRT scores in children with TD, including a 2-arm method of reaching and a method of measuring FRT from the toes to the fingers of the reaching hand. It also provided normative data for discriminative purposes based on height categories, in contrast to previously published data using age groups. Given the improved reliability using toes-to-finger methods, the 1ATF and the 2ATF methods appear to have better psychometric properties for evaluative purposes. Additional research is needed to examine whether the 2-arm FRT is a more challenging balance test in children with balance impairments.
© 2009 Lippincott Williams & Wilkins, Inc.