INTRODUCTION AND PURPOSE
The Gross Motor Function Measure (GMFM) is an observational, criterion-referenced outcome measure designed and validated to assess changes over time in the gross motor function of children with cerebral palsy (CP).1 The original English version of the GMFM (GMFM-88) consists of 88 items in 5 dimensions: (A) lying and rolling, (B) sitting, (C) crawling and kneeling, (D) standing, and (E) walking, running, and jumping. The GMFM-88 total score is calculated as the average of the 5 dimension scores. It is an evaluative instrument whose validity, reliability, and responsiveness have been demonstrated in numerous studies.2–6 The original English version has excellent intra- and interrater reliability for both the 88-item version (intraclass correlation coefficient [ICC] = 0.87-0.99) and the 66-item version (ICC = 0.96-0.99).7–11 The GMFM is considered a “gold standard” for the motor assessment of children with CP and is widely used in both clinical practice and research. Recently, it has been cross-culturally adapted to Korean,12,13 Portuguese,14 Persian,15 and Spanish16 populations, among others.
The forward-backward translation method was used for the translation and cross-cultural adaptation of the original English version of the GMFM to the Spanish population of children with CP. The score sheet and the instructions were translated using strategies such as omission, incorporation, substitution of words, or the addition of examples. The resulting versions were subjected to a qualitative analysis of cultural and linguistic equivalence by a committee of experts that included one author of the original English version. In addition, understandability, applicability, and viability were assessed in a pilot study involving evaluators and participants with heterogeneous profiles. The Spanish version of the GMFM (GMFM-SP) maintains a high degree of equivalence with the original English version and is comprehensible to all professionals regardless of their professional experience or geographical origin.
Regarding the psychometric properties of the GMFM-SP, it was observed during the cross-cultural adaptation process that neither cultural nor experiential factors affected the title or content of the items. Together with the methodological guidelines followed and the high degree of equivalence of the GMFM-SP with the original English version, this supports the stability of some of its psychometric properties, such as content and construct validity.1,9 Once these properties were established, it was considered necessary to verify other psychometric properties that might vary from those of the original English version.
Reliability was studied to determine the degree to which the tool measures without error, establishing the proportion of total variance attributable to true differences between participants.17 This analysis shows how much of the variation in scores is attributable to the raters (interrater reliability) or to the same rater on different occasions (intrarater reliability).18 The psychometric properties of the recently translated and adapted Spanish version (GMFM-SP) have not yet been studied; thus, it was necessary to verify that the results coincide with those described for the original English version.
This study examined the intra- and interrater reliability of the 88-item GMFM-SP version in the Spanish population of children and adolescents with CP. A secondary objective was to examine the correlation between the tool and the age and Gross Motor Function Classification System (GMFCS) level of these children.
METHODS
Participants
This study included a convenience sample of 51 participants (18 females, 33 males; age range 5 months to 16 years 7 months, mean 9 years 4 months [SD 4 years 8 months]) recruited from associations, private clinical centers, early care centers, and schools in several regions of Spain (Table 1). Inclusion criteria were (1) age between 5 months and 16 years, (2) a CP diagnosis confirmed by a medical report, and (3) the ability to engage with the environment, so that the child's attention could be captured and maintained through visual or verbal stimuli and the therapist could guide the assessment. Exclusion criteria were (1) intrathecal baclofen or (2) botulinum toxin type A injections within the past 6 months or (3) orthopedic or neurologic surgery within the past year. All parents provided written informed consent and permission to record audiovisual content; participants also gave assent before the assessment.
TABLE 1 - Participant Characteristics (n = 51)
Characteristics | Value
Age, (range) mean [SD] | (5 mo to 16 y 7 mo) 9 y 4 mo [4 y 8 mo]
Sex, n (%) |
Female | 18 (35.2)
Male | 33 (64.7)
GMFCS level, n (%) |
I | 22 (43.1)
II | 10 (19.6)
III | 5 (9.8)
IV | 8 (15.6)
V | 6 (11.7)
Type of cerebral palsy, n (%) |
Spastic tetraplegia | 11 (21.5)
Spastic triplegia | 3 (5.8)
Spastic diplegia | 6 (11.7)
Spastic hemiplegia | 14 (27.4)
Dystonic/athetotic | 3 (5.8)
Ataxic | 4 (7.8)
Hypotonic | 5 (9.8)
Mixed | 5 (9.8)
Center of origin, n (%) |
Association 1 | 25 (49)
Association 2 | 3 (5.8)
Association 3 | 3 (5.8)
Early care center | 7 (13.7)
Clinical private center 1 | 5 (9.8)
Clinical private center 2 | 4 (7.8)
Clinical private center 3 | 2 (3.9)
School center | 2 (3.9)
Abbreviations: GMFCS, Gross Motor Function Classification System; SD, standard deviation.
Measurements were carried out by 8 raters, 6 of them physical therapists and 2 qualified as both physical therapists and occupational therapists, with different levels of professional experience in pediatric rehabilitation (range 2-21 years, mean 13 years [SD 6 years]). All had experience using outcome measures, but only 2 had previously used other nonofficial Spanish versions of the GMFM in a clinical setting (Table 2).
TABLE 2 - Rater Characteristics
Rater | Reliability Study | Professional Background | Professional Experience, y | Outcome Measure Experience | GMFM Experience
1 | Inter-/intra- | PT/OT | 15 | Y | Y
2 | Inter-/intra- | PT | 21 | Y | N
3 | Inter- | PT | 2 | Y | N
4 | Intra- | PT | 21 | Y | Y
5 | Inter- | PT | 17 | Y | N
6 | Intra- | PT/OT | 13 | Y | N
7 | Intra- | PT | 8 | Y | N
8 | Intra- | PT | 15 | Y | N
Abbreviations: GMFM, Gross Motor Function Measure; inter, interrater reliability; intra, intrarater reliability; N, no; OT, occupational therapy; PT, physical therapy; Y, yes.
Ethical approval was obtained through the Ethical Committee of the Universidad Católica de Murcia. This study complies with the Declaration of Helsinki.
Procedure
All participants were assessed barefoot, without assistive devices, and in a familiar physical therapy room. A baseline assessment using the 88-item GMFM-SP version (GMFM-SP-88) was performed and videotaped by an assistant. The videos were recorded according to a protocol that specified camera placement, shot type, and dynamic or static recording mode; the camera was placed in an area not visible to the child, and videotaping began after some games. The administration time ranged from 45 to 60 minutes depending on the rater's skill and the child's level of cooperation.
For the intra- and interrater reliability study, a total of 100 video recordings were viewed independently by a total of 8 raters who were not involved in the recording of the baseline assessment. Four assessors participated in the interrater reliability portion of this study by reviewing 50 video recordings. Two of these 4 assessors, together with 4 additional assessors, viewed 50 video recordings twice (at least 1 month apart) for the intrarater reliability portion of this study (Table 2). For 1 child whose parents did not authorize video recording of the evaluations, only interrater reliability was studied, with the assessment performed in situ in the same week as the baseline assessment. Intrarater assessments were based on randomly selected video recordings, viewed at least 4 weeks after the first assessment to avoid recall bias and with the previous scores concealed. For the interrater reliability study, ratings were independent, and none of the therapists had access to the baseline assessment scores.
Raters attended a 6-hour training workshop on the administration and scoring of the GMFM-SP-88. After completing the assessment, they had to achieve at least 70% concordance with the scores assigned by a senior assessor to participate as raters in this study.19 In addition, the agreement between each evaluator's total scores and those of a reference expert assessor was checked on a sample of 6 children with CP using the Bland-Altman method (see Supplemental Digital Content 1, available at: https://links.lww.com/PPT/A353), with limits of agreement (LoA) below 2 points (out of 100) for all raters.
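The 70% concordance criterion can be illustrated with a minimal sketch. The text does not define how concordance was computed, so the version below assumes it means the percentage of items on which the trainee's score exactly matches the senior assessor's score; the function names and the item scores are hypothetical.

```python
def percent_concordance(trainee_scores, senior_scores):
    """Percentage of items scored identically by trainee and senior assessor.

    Assumption: concordance is exact item-level agreement (not stated in the text).
    """
    if len(trainee_scores) != len(senior_scores) or not trainee_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(t == s for t, s in zip(trainee_scores, senior_scores))
    return 100.0 * matches / len(trainee_scores)


def qualifies(trainee_scores, senior_scores, threshold=70.0):
    """True when the trainee reaches the minimum concordance (70% in this study)."""
    return percent_concordance(trainee_scores, senior_scores) >= threshold


# Hypothetical item scores for a short excerpt of the score sheet.
trainee = [3, 2, 1, 0, 3, 3, 2, 1, 0, 2]
senior = [3, 2, 1, 1, 3, 3, 2, 2, 0, 2]
print(percent_concordance(trainee, senior), qualifies(trainee, senior))
```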
Statistical Analysis
Although the sample size (n = 51) allows a normal distribution to be assumed, significant deviations were checked using the standard errors of the skewness and kurtosis coefficients, Q-Q normality plots, and the Kolmogorov-Smirnov test. Parametric tests were applied for all variables, and the data for each evaluator were summarized with the mean, standard deviation, range, and quartiles. The analyses were conducted on the GMFM-SP-88 total and dimension scores.
To determine intra- and interrater reliability of the GMFM-SP-88 total and dimension scores, the intraclass correlation coefficient (ICC2,1) and 95% confidence intervals (CIs) were calculated with the 2-way random-effects model.20,21 ICCs higher than 0.90 were considered excellent, between 0.75 and 0.90 good, between 0.50 and 0.75 moderate, and below 0.50 poor.22 Measurement precision was evaluated using the standard error of measurement (SEm) [SEm = SD⋅√(1 − ICC)], its value relative to the average of all measurements, and the smallest real difference (SRD) [SRD = 1.96⋅SEm⋅√2].17 The LoA were calculated according to the method described by Bland and Altman,23 and the presence of additive (constant) or multiplicative (proportional) bias was assessed with Passing-Bablok linear regression.24
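The quantities named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' SPSS or R procedure: ICC(2,1) is computed from the two-way ANOVA mean squares (absolute agreement, single rater), the SD in the SEm formula is assumed to be the SD of all pooled measurements, boundary values of the ICC bands are assigned to the higher band, and all function names and scores are hypothetical.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores has shape (n_subjects, k_ratings).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)  # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)  # between raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_icc(icc):
    """Bands quoted in the text; handling of exact boundaries is an assumption."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


def precision(scores):
    """Return (ICC, SEm, SRD) with SEm = SD*sqrt(1 - ICC), SRD = 1.96*SEm*sqrt(2)."""
    x = np.asarray(scores, dtype=float)
    icc = icc_2_1(x)
    sd = x.std(ddof=1)  # assumption: SD of all measurements pooled
    sem = sd * np.sqrt(1.0 - icc)
    srd = 1.96 * sem * np.sqrt(2.0)
    return icc, sem, srd


def bland_altman(first, second):
    """Mean difference and 95% limits of agreement (mean +/- 1.96 SD of differences)."""
    d = np.asarray(first, dtype=float) - np.asarray(second, dtype=float)
    md = d.mean()
    half_width = 1.96 * d.std(ddof=1)
    return md, md - half_width, md + half_width


# Hypothetical total scores: rows are children, columns are two ratings.
ratings = np.array([[81.8, 82.0], [43.5, 43.1], [95.5, 95.5], [6.2, 6.4], [56.1, 56.0]])
icc, sem, srd = precision(ratings)
print(classify_icc(icc), round(sem, 2), round(srd, 2))
print(bland_altman(ratings[:, 0], ratings[:, 1]))
```

Confidence intervals for the ICC and the Passing-Bablok regression are left out of this sketch; in practice they would come from a statistics package.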
A linear regression model was created to explore the relationship of the scores with age and GMFCS level (levels I-V). Beta coefficients and Pearson's partial and semipartial correlation coefficients were calculated, with their corresponding coefficients of determination, to assess the magnitude of the relationship. The analyses were conducted using SPSS software (version 19.0; SPSS Inc, Chicago, Illinois) and the jmv package (version 0.9) for R (version 3.5.0; 2018).
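The difference between partial and semipartial correlation can be made concrete with a short sketch based on regression residuals. It is an illustration only: the function names and data are hypothetical, GMFCS level is assumed to be coded numerically from 1 to 5, and a single control variable is used.

```python
import numpy as np


def _residuals(target, control):
    """Residuals of target after least-squares regression on control (with intercept)."""
    design = np.column_stack([np.ones(len(control)), control])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def partial_and_semipartial(score, predictor, control):
    """Correlation of score with predictor, adjusting for control.

    Partial: control is removed from both score and predictor.
    Semipartial: control is removed from the predictor only.
    """
    y, x, z = (np.asarray(v, dtype=float) for v in (score, predictor, control))
    x_res = _residuals(x, z)
    y_res = _residuals(y, z)
    partial = np.corrcoef(y_res, x_res)[0, 1]
    semipartial = np.corrcoef(y, x_res)[0, 1]
    return partial, semipartial


# Hypothetical data: total score, age in years, GMFCS level coded 1-5.
total = [96.0, 84.0, 54.0, 35.0, 6.0, 92.0, 80.0, 30.0]
age = [12.0, 9.0, 7.0, 10.0, 4.0, 6.0, 14.0, 5.0]
gmfcs = [1, 2, 3, 4, 5, 1, 2, 4]
r_partial, r_semipartial = partial_and_semipartial(total, age, gmfcs)
print(round(r_partial, 2), round(r_semipartial, 2))
```

Squaring either coefficient gives the corresponding coefficient of determination.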
RESULTS
GMFM-SP-88 Total and Dimension Scores
The median total score was 81.8, and half of the children scored between 43.5 and 95.5 points. The E dimension had the lowest score, whereas the highest scores were in the A and B dimensions (Table 3). The coefficients of variation ranged from 32% to 75%, which is explained by the heterogeneity of the sample.
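As a quick arithmetic check, the whole-sample means and SDs reported in Table 3 reproduce the total mean and the coefficients of variation quoted above. The sketch assumes that the CV is SD divided by the mean and that the total is the simple average of the 5 dimension scores; the function names are illustrative.

```python
# Whole-sample means and SDs per dimension, copied from Table 3 (n = 51).
means = {"A": 83.4, "B": 77.2, "C": 65.8, "D": 58.3, "E": 51.7}
sds = {"A": 27.1, "B": 31.8, "C": 38.5, "D": 39.5, "E": 38.7}


def total_score(dimension_scores):
    """GMFM-88 total: the average of the 5 dimension percentage scores."""
    values = list(dimension_scores.values())
    return sum(values) / len(values)


def coefficient_of_variation(mean, sd):
    """CV expressed as a percentage of the mean."""
    return 100.0 * sd / mean


print(round(total_score(means), 1))  # 67.3, the total mean in Table 3
for dim in means:
    # Prints 32, 41, 59, 68, 75 for A to E, matching the CV column.
    print(dim, round(coefficient_of_variation(means[dim], sds[dim])))
```

Averaging the dimension means recovers the mean total only because the total is a plain average with no missing dimensions; for an individual child, `total_score` would be applied to that child's own 5 dimension scores.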
TABLE 3 - GMFM Total and Dimension Scores Segmented by Age and GMFCS Level
Group | GMFM Dimensions | Mean (SD) | CV | Range | Median (IQR)
Total (n = 51) | A. Lying and rolling | 83.4 (27.1) | 32% | 2-100 | 98 (80.4-100)
 | B. Sitting | 77.2 (31.8) | 41% | 5-100 | 95 (66.7-100)
 | C. Crawling and kneeling | 65.8 (38.5) | 59% | 0-100 | 88.1 (36.9-100)
 | D. Standing | 58.3 (39.5) | 68% | 0-100 | 76.9 (11.5-93.6)
 | E. Walking, running, jumping | 51.7 (38.7) | 75% | 0-100 | 56.9 (6.9-88.2)
 | Total | 67.3 (33.1) | 49% | 1.7-100 | 81.8 (43.5-95.5)
Segmented by age
<8 y (n = 24) | A. Lying and rolling | 80.2 (25.7) | 32% | 21.6-100 | 94.1 (70.1-100)
 | B. Sitting | 72.8 (30.3) | 42% | 5-100 | 82.5 (60.4-100)
 | C. Crawling and kneeling | 55.8 (38) | 68% | 0-100 | 53.6 (28.6-95.8)
 | D. Standing | 52.2 (38.1) | 73% | 0-97.4 | 57.7 (7.7-89.7)
 | E. Walking, running, jumping | 42.7 (35.9) | 84% | 0-94.4 | 38.9 (4.2-82.3)
 | Total | 60.7 (31.4) | 52% | 5.3-97.8 | 61 (35.5-92.3)
≥8 y (n = 27) | A. Lying and rolling | 86.3 (28.5) | 33% | 2-100 | 98 (92.2-100)
 | B. Sitting | 81 (33.1) | 41% | 6.7-100 | 98.3 (80.8-100)
 | C. Crawling and kneeling | 74.8 (37.4) | 50% | 0-100 | 97.6 (54.8-100)
 | D. Standing | 63.7 (40.7) | 64% | 0-100 | 87.2 (15.4-94.9)
 | E. Walking, running, jumping | 59.8 (40) | 67% | 0-100 | 81.9 (15.3-94.4)
 | Total | 73.1 (34) | 47% | 1.7-100 | 91 (51.9-97.4)
Segmented by GMFCS level
I (n = 22) | A. Lying and rolling | 94.8 (15.1) | 16% | 37.3-100 | 100 (98-100)
 | B. Sitting | 96.1 (9.19) | 10% | 61.7-100 | 100 (98.8-100)
 | C. Crawling and kneeling | 90.7 (18.4) | 20% | 38.1-100 | 100 (89.3-100)
 | D. Standing | 91.6 (10.5) | 11% | 51.3-100 | 94.9 (90.4-97.4)
 | E. Walking, running, jumping | 86.2 (17.2) | 20% | 37.5-100 | 90.3 (85.4-98.3)
 | Total | 91.9 (13.3) | 14% | 51.2-100 | 96.4 (93.6-98.8)
II (n = 10) | A. Lying and rolling | 95.3 (6.42) | 7% | 84.3-100 | 99 (90.2-100)
 | B. Sitting | 95.3 (7.81) | 8% | 75-100 | 98.3 (95.4-100)
 | C. Crawling and kneeling | 85 (21.6) | 25% | 42.9-100 | 96.4 (85.7-97.6)
 | D. Standing | 75.6 (11.5) | 15% | 53.8-89.7 | 78.2 (70.5-81.4)
 | E. Walking, running, jumping | 58.5 (19.3) | 33% | 27.8-86.1 | 59 (44.8-72.2)
 | Total | 81.9 (12.2) | 15% | 57.2-94.7 | 83.9 (80.5-90.1)
III (n = 5) | A. Lying and rolling | 88.2 (17.8) | 20% | 56.9-100 | 96.1 (92.2-96.1)
 | B. Sitting | 77.3 (15.3) | 20% | 56.7-96.7 | 81.7 (68.3-83.3)
 | C. Crawling and kneeling | 60 (30.6) | 51% | 14.3-95.2 | 59.5 (52.4-78.6)
 | D. Standing | 31.3 (15.4) | 49% | 15.4-51.3 | 25.6 (20.5-43.6)
 | E. Walking, running, jumping | 23.9 (15) | 63% | 4.2-43.1 | 22.2 (16.7-33.3)
 | Total | 56.1 (17.2) | 31% | 30.5-73 | 53.9 (52.5-70.9)
IV (n = 8) | A. Lying and rolling | 81.6 (11.3) | 14% | 64.7-96.1 | 80.4 (74-91.2)
 | B. Sitting | 53.8 (22.1) | 41% | 11.7-78.3 | 58.3 (41.3-70.4)
 | C. Crawling and kneeling | 26.5 (21.8) | 82% | 0-57.1 | 32.1 (5.4-38.7)
 | D. Standing | 5.8 (5.08) | 88% | 0-15.4 | 6.4 (1.9-7.7)
 | E. Walking, running, jumping | 4.5 (4.74) | 105% | 0-13.9 | 3.5 (1-6.3)
 | Total | 34.4 (11.5) | 33% | 15.3-49.9 | 34.7 (28.9-42.8)
V (n = 6) | A. Lying and rolling | 20.3 (14.4) | 71% | 2-41.2 | 23.5 (9.8-25.5)
 | B. Sitting | 8.6 (2.87) | 33% | 5-11.7 | 8.3 (6.7-11.3)
 | C. Crawling and kneeling | 0 (0) | 0% | 0-0 | 0 (0-0)
 | D. Standing | 0 (0) | 0% | 0-0 | 0 (0-0)
 | E. Walking, running, jumping | 0 (0) | 0% | 0-0 | 0 (0-0)
 | Total | 5.8 (3.31) | 57% | 1.7-10.6 | 6.2 (3.2-7.4)
Abbreviations: CV, coefficient of variation; GMFCS, Gross Motor Function Classification System; GMFM, Gross Motor Function Measure; IQR, interquartile range; SD, standard deviation.
Concerning the 2 established age groups (Table 3 and Figure 1), higher median values were found in older children in the total score and all dimensions, especially those requiring more skills.
Fig. 1.: Box plot and jitter plot for GMFM-SP scores segmented by age and GMFCS level. GMFCS indicates Gross Motor Function Classification System; GMFM-SP, Spanish version of the Gross Motor Function Measure. This figure is available in color online (www.pedpt.com).
The scores segmented by severity level (GMFCS) showed the expected decrease in dimension and total scores as the level increased. Median total scores were above 80 points for levels I and II, 53.9 and 34.7 points for levels III and IV, respectively, and around 6 points for level V; in level V, dimensions C, D, and E had a score of 0 because they could not be assessed (Table 3 and Figure 1).
Intrarater Reliability
The intrarater reliability analysis involved 6 raters who independently viewed 50 recordings corresponding to the baseline assessment. The assessors rated the recordings blinded to the scores obtained in the in situ assessment; the average time between the 2 sessions was 2.6 months (SD 1.43 months), ranging from 1 to 7 months.
The ICCs were excellent for all dimensions (0.99, 95% CI 0.99-1.00) and for the total scores (1.00, 95% CI 0.99-1.00), as well as across age groups (1.00, 95% CI 0.99-1.00) and GMFCS levels (0.99, 95% CI 0.995-1.00) (Table 4). The lowest lower bound of the ICC CIs was 0.995, corresponding to GMFCS level II.
TABLE 4 - Intrarater Reliability
Dimension GMFM-88 | Mean (SD) 1 | Mean (SD) 2 | t Value | P Value | d Cohen | MD (95% LoA) | ICC (95% CI) | SEM | SEM% | SRD | SRD%
A. Lying and rolling | 84.3 (26.67) | 84.4 (26.65) | −1.00 | .322 | 0.00 | −0.08 (−1.17 to 1.01) | 1.0 (1.0-1.0) | 0.02 | 0.02% | 0.05 | 0.05%
B. Sitting | 78.5 (30.67) | 78.5 (30.68) | −1.00 | .322 | 0.00 | −0.03 (−0.5 to 0.43) | 1.0 (1.0-1.0) | 0.02 | 0.02% | 0.05 | 0.06%
C. Crawling and kneeling | 67.1 (37.74) | 67.1 (37.74) | 0.00 | 1.000 | 0.00 | 0 (−0.94 to 0.94) | 1.0 (1.0-1.0) | 0.02 | 0.03% | 0.05 | 0.08%
D. Standing | 59.5 (39.05) | 59.2 (38.93) | 1.30 | .200 | 0.01 | 0.26 (−2.48 to 2.99) | 0.999 (0.999-1.0) | 0.20 | 0.33% | 0.55 | 0.92%
E. Walking, running, jumping | 52.8 (38.42) | 52.8 (38.45) | −0.30 | .768 | 0.00 | −0.03 (−1.31 to 1.26) | 1.0 (1.0-1.0) | 0.02 | 0.04% | 0.05 | 0.10%
Total | 68.4 (32.38) | 68.4 (32.32) | 0.55 | .587 | 0.01 | 0.15 (−1.65 to 1.94) | 1.0 (1.0-1.0) | 0.02 | 0.03% | 0.05 | 0.07%
GMFCS
I | 91.9 (13.33) | 91.9 (13.32) | −0.18 | .858 | 0.00 | 0 (−0.22 to 0.21) | 1.0 (1.0-1.0) | 0.04 | 0.04% | 0.10 | 0.11%
II | 81.9 (12.22) | 81.7 (12.17) | 1.63 | .137 | 0.02 | 0.28 (−0.79 to 1.35) | 0.999 (0.995-1.0) | 0.11 | 0.13% | 0.30 | 0.37%
III | 56.1 (17.16) | 56.2 (17.22) | −1.00 | .374 | 0.00 | −0.06 (−0.3 to 0.19) | 1.0 (1.0-1.0) | 0.04 | 0.07% | 0.11 | 0.20%
IV | 34.4 (11.49) | 34.6 (11.47) | −1.47 | .185 | 0.01 | −0.16 (−0.75 to 0.44) | 1.0 (0.998-1.0) | 0.01 | 0.03% | 0.03 | 0.08%
V | 4.8 (2.61) | 4.8 (2.61) | 0.00 | .999 | 0.00 | 0 (0 to 0) | 1.0 (1.0-1.0) | 0.05 | 1.06% | 0.14 | 2.93%
Age
<8 y | 62.9 (30.20) | 62.4 (30.11) | 0.08 | .934 | 0.00 | 0.01 (−0.59 to 0.6) | 1.0 (1.0-1.0) | 0.02 | 0.03% | 0.05 | 0.08%
≥8 y | 73.1 (33.98) | 73.1 (33.95) | 0.65 | .520 | 0.00 | 0.04 (−0.57 to 0.65) | 1.0 (1.0-1.0) | 0.02 | 0.03% | 0.05 | 0.07%
Abbreviations: d Cohen, effect size; GMFCS, Gross Motor Function Classification System; GMFM-88, 88-item Gross Motor Function Measure; ICC (95% CI), intraclass correlation coefficient (95% confidence interval); MD (95% LoA), mean of differences (95% limits of agreement); SD, standard deviation; SEM, standard error of measurement; SRD, smallest real difference; t value, Student t statistic for related samples.
Regarding the limits of agreement, differences in scores were distributed very close to the line of zero difference, and no patterns were detected in any of the dimension scores analyzed (see Supplemental Digital Content 2, available at: https://links.lww.com/PPT/A354, and Supplemental Digital Content 3, available at: https://links.lww.com/PPT/A355, and Table 4). The mean differences were below 1 point in all cases, and the widest limits of agreement corresponded to dimension D, at approximately ±3 points. The total score had a mean difference of 0.15 points (95% LoA, −1.65 to 1.94 points).
Since the ICCs were very high, the SEm were very close to zero, and the SRDs were below 1 point.
Interrater Reliability
The interrater reliability analysis involved 4 raters who independently viewed 50 recordings corresponding to the baseline assessment. In addition, 1 in situ assessment of 1 child was performed.
The ICCs were excellent for all dimensions (0.99, 95% CI 0.998-1.00) and for the total scores (1.00, 95% CI 0.99-1.00), as well as across age groups (1.00, 95% CI 0.99-1.00) and GMFCS levels (0.99, 95% CI 0.993-1.00). The minimum lower limit of the ICC CI was 0.993, corresponding to GMFCS level II (Table 5).
TABLE 5 - Interrater Reliability
Dimension GMFM-88 | Mean (SD) 1 | Mean (SD) 2 | t Value | P Value | d Cohen | MD (95% LoA) | ICC (95% CI) | SEM | SEM% | SRD | SRD%
A. Lying and rolling | 83.4 (27.09) | 83.7 (27.08) | −1.43 | .159 | 0.01 | −0.23 (−2.49 to 2.03) | 0.999 (0.998-0.999) | 0.16 | 0.20% | 0.46 | 0.55%
B. Sitting | 77.2 (31.77) | 77.1 (31.71) | 0.24 | .811 | 0.00 | 0.03 (−1.87 to 1.94) | 1.0 (0.999-1.0) | 0.02 | 0.02% | 0.05 | 0.06%
C. Crawling and kneeling | 65.8 (38.53) | 65.7 (38.55) | 0.70 | .485 | 0.00 | 0.09 (−1.76 to 1.95) | 1.0 (0.999-1.0) | 0.02 | 0.03% | 0.05 | 0.08%
D. Standing | 58.3 (39.54) | 58.5 (39.89) | −0.68 | .497 | 0.00 | −0.15 (−3.23 to 2.93) | 0.999 (0.999-1.0) | 0.20 | 0.34% | 0.55 | 0.95%
E. Walking, running, jumping | 51.7 (38.74) | 52.0 (38.92) | −1.64 | .107 | 0.01 | −0.25 (−2.33 to 1.84) | 1.0 (0.999-1.0) | 0.02 | 0.04% | 0.05 | 0.11%
Total | 67.3 (33.06) | 67.4 (33.09) | −1.47 | .149 | 0.00 | −0.1 (−1.06 to 0.86) | 1.0 (1.0-1.0) | 0.02 | 0.03% | 0.05 | 0.07%
GMFCS
I | 91.9 (13.33) | 92.0 (13.36) | −1.91 | .070 | 0.01 | −0.14 (−0.8 to 0.53) | 1.0 (0.999-1.0) | 0.01 | 0.01% | 0.03 | 0.03%
II | 81.9 (12.22) | 82.0 (12.44) | −0.20 | .848 | 0.00 | −0.05 (−1.51 to 1.41) | 0.998 (0.993-1.0) | 0.15 | 0.19% | 0.43 | 0.52%
III | 56.1 (17.16) | 56.2 (16.89) | −0.15 | .886 | 0.00 | −0.05 (−1.39 to 1.3) | 0.999 (0.994-1.0) | 0.13 | 0.23% | 0.36 | 0.64%
IV | 34.4 (11.49) | 34.7 (11.25) | −1.46 | .189 | 0.02 | −0.26 (−1.27 to 0.74) | 0.999 (0.994-1.0) | 0.10 | 0.30% | 0.29 | 0.84%
V | 5.8 (3.31) | 5.7 (3.41) | 1.57 | .176 | 0.04 | 0.12 (−0.25 to 0.49) | 0.998 (0.986-1.0) | 0.08 | 1.44% | 0.23 | 4.00%
Age
<8 y | 60.7 (31.41) | 60.8 (31.4) | −0.52 | .609 | 0.00 | −0.05 (−0.97 to 0.87) | 1.0 (1.0-1.0) | 0.02 | 0.03% | 0.05 | 0.08%
≥8 y | 73.1 (33.98) | 73.3 (34.00) | −1.48 | .150 | 0.00 | −0.15 (−1.14 to 0.85) | 1.0 (1.0-1.0) | 0.02 | 0.03% | 0.05 | 0.07%
Abbreviations: d Cohen, effect size; GMFCS, Gross Motor Function Classification System; GMFM-88, 88-item Gross Motor Function Measure; ICC (95% CI), intraclass correlation coefficient (95% confidence interval); MD (95% LoA), mean of differences (95% limits of agreement); SD, standard deviation; SEM, standard error of measurement; SRD, smallest real difference; t value, Student t statistic for related samples.
Regarding measurement bias, differences in scores were distributed close to the line of zero difference, and no patterns were detected in any of the dimensions or scores analyzed. The mean differences were below 1 point in all cases, and the widest limits of agreement corresponded to the D dimension, at approximately ±3 points. In the total score, the mean difference was −0.1 points, with 95% limits of agreement between −1.06 and 0.86 points (Figure 2) and an SRD below 1 point.
Fig. 2.: Bland-Altman plots of limits of agreement for intrarater scores (left) and interrater scores (right). The x-axis shows the average of each pair of values and the y-axis shows the differences. The shaded areas mark the confidence intervals for the mean difference, the upper limit, and the lower limit. Tied scores are marked with points of proportional diameter. This figure is available in color online (www.pedpt.com).
The reliability and reproducibility of the scores assigned by each assessor for interrater reliability were not affected by the age of the children or their severity level (Table 5; see Supplemental Digital Content 4, available at: https://links.lww.com/PPT/A356, and Supplemental Digital Content 5, available at: https://links.lww.com/PPT/A357).
Relationship Between GMFM-SP-88 Scores and Age and Severity Level
The total score of the GMFM-SP-88 had a significant but moderate correlation with the age of the children (Rpartial = 0.34; R2partial = 12%; P = .02) and a high, negative correlation with the GMFCS level (Rpartial = −0.93; R2partial = 86%; P < .001). These results mean that 86% of the variance in the GMFM-SP-88 total score may be explained by the GMFCS level and only 12% by the age of the children (Table 6; see Supplemental Digital Content 6, available at: https://links.lww.com/PPT/A358, and Supplemental Digital Content 7, available at: https://links.lww.com/PPT/A359).
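As a worked check of these percentages, each coefficient of determination is the square of the corresponding partial correlation coefficient reported in Table 6 (using the rounded values from the table):

```latex
\begin{align*}
R^2_{\text{partial, age}}   &= (0.34)^2  = 0.1156 \approx 12\% \\
R^2_{\text{partial, GMFCS}} &= (-0.93)^2 = 0.8649 \approx 86\%
\end{align*}
```

Because these are partial coefficients, each describes the share of variance explained once the other predictor has been accounted for, so the two percentages are not expected to sum to 100%.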
TABLE 6 - Linear Regression Models Between GMFM-SP Scores, Age, and Severity Level^a
Model | Predictor | β Coefficient (95% CI) | t Value | P Value | R Pearson | R2 Pearson | R partial | R2 partial | R semipartial | R2 semipartial
A. Lying and rolling | Age | 0.08 (−0.75 to 1.61) | 0.74 | .47 | 0.15 | 2% | 0.11 | 1% | 0.08 | 1%
 | GMFCS level | −0.69 (−16.73 to −9.01) | −6.70 | <.001 | −0.70 | 49% | −0.70 | 48% | −0.69 | 47%
B. Sitting | Age | 0.07 (−0.49 to 1.37) | 0.95 | .35 | 0.15 | 2% | 0.14 | 2% | 0.07 | 0%
 | GMFCS level | −0.87 (−21.8 to −15.68) | −12.31 | <.001 | −0.87 | 76% | −0.87 | 76% | −0.86 | 74%
C. Crawling and kneeling | Age | 0.2 (0.44 to 2.7) | 2.79 | .01 | 0.28 | 8% | 0.37 | 14% | 0.20 | 4%
 | GMFCS level | −0.83 (−25.31 to −17.89) | −11.69 | <.001 | −0.85 | 72% | −0.86 | 74% | −0.83 | 68%
D. Standing | Age | 0.09 (0.03 to 1.41) | 2.09 | .04 | 0.18 | 3% | 0.29 | 8% | 0.09 | 1%
 | GMFCS level | −0.94 (−27.61 to −23.07) | −22.47 | <.001 | −0.95 | 91% | −0.96 | 91% | −0.94 | 88%
E. Walking, running, jumping | Age | 0.17 (0.47 to 2.29) | 3.04 | <.001 | 0.26 | 7% | 0.40 | 16% | 0.17 | 3%
 | GMFCS level | −0.89 (−26.51 to −20.52) | −15.79 | <.001 | −0.91 | 82% | −0.92 | 84% | −0.88 | 78%
GMFM total | Age | 0.13 (0.18 to 1.63) | 2.52 | .02 | 0.22 | 5% | 0.34 | 12% | 0.13 | 2%
 | GMFCS level | −0.91 (−22.79 to −18.04) | −17.27 | <.001 | −0.92 | 85% | −0.93 | 86% | −0.90 | 82%
Abbreviations: CI, confidence interval; GMFCS, Gross Motor Function Classification System; GMFM-SP, Spanish version of the Gross Motor Function Measure.
^a The partial correlation coefficient indicates the degree of correlation between the dependent variable (score) and one of the predictor variables after removing the effect that the remaining variables may have on both the dependent and the independent variable.
The correlation between the GMFM-SP-88 and the severity level, when analyzed by dimension, increased from A to D dimensions.
DISCUSSION
The results of this study show that the total and dimension scores of the GMFM-SP-88 had excellent intrarater reliability, with ICC2,1 = 0.99 for all dimensions and 1.00 for the total scores. Interrater reliability of the GMFM-SP-88 was likewise excellent for both the total (ICC2,1 = 1.00) and dimension scores (ICC2,1 = 0.99). These results are similar to those of the GMFM validation study conducted by Russell et al,1 who reported interrater reliability of ICC = 0.87-0.99.
Studies of the psychometric properties of other cross-cultural adaptations of the GMFM have shown similar results. For example, the Brazilian version reported excellent intra- and interrater reliability (ICC = 0.99, 95% CI 0.98-0.99; ICC = 0.97, 95% CI 0.95-0.98) for total scores14; the Persian version reported an ICC of 0.99 for inter- and intrarater reliability (95% CI 0.96-1.00) for both total and dimension scores15; and the Thai version25 obtained excellent intra- and interrater reliability (ICC = 0.99-1.00; ICC = 0.93) for the total scores. These results agree with those of the Korean version, which reported excellent inter- and intrarater reliability for both total (ICC = 0.99, 95% CI 0.99-0.99) and dimension scores (ICC = 0.97-0.99, 95% CI 0.96-0.99).12,13
No significant differences were observed between the intra- and interrater reliability results for total and dimension scores, regardless of factors such as age or GMFCS level. It should be noted that the lowest lower limit of the ICC CI corresponded to GMFCS level II for both intra- (0.995) and interrater reliability (0.993).
Age was not related to the severity level, but it was related to the total score. This was especially true in dimension E, where the children who obtained the highest scores were older, since this dimension includes the most difficult gross motor skills (hopping on 1 foot, walking forward on a straight line, or walking backward, among others). In addition, in line with the results of other studies, the GMFM-SP-88 discriminates between the different GMFCS severity levels, with a strong correlation between dimension scores and degree of involvement.1,7,13 For this reason, it was expected that children classified in GMFCS levels IV and V would obtain lower total scores, especially in the D and E dimensions.
The reliability of a measure refers to its stability over time and its consistency across different raters. Since a certain degree of error is inevitably present in any evaluation, a measure can be considered reliable when its level of error allows the tool to be applied efficiently in practice.26 Specifically, 3 sources of variation in reliability are identified: the assessor, the assessed person, and the assessment or measure itself together with the environment (eg, the assessor's level of professional experience or experience with outcome measures, as well as the difficulty of assessing some participants because of age, type of disability, or degree of collaboration). Two factors reinforced the reliability results of the GMFM-SP-88. The first was the use of video recordings for intra- and interrater reliability, which guaranteed stable assessment conditions and a stable subject and allowed raters to pause the recording and view it as many times as necessary. The second was the training workshop on the use of the GMFM-SP-88, which helped control the sources of variation that depend on the assessor. In this regard, intra- and interrater reliability results were consistent and homogeneous between assessors with and without experience in using the GMFM and were not affected by variables such as age or level of involvement.
Regarding the method and the exact amount of training required in the use of the GMFM, criteria have not yet been determined, as shown by the Brazilian14 and Korean12,13 cross-cultural adaptation studies of the GMFM (in which assessors received 12- and 20-hour training sessions, respectively). The present study was based on the results of Russell et al,19 who established a total of 6 hours as a reference value for the duration of the training session needed to significantly improve the competence and reliability of the assessors.
Both the training received on the administration and scoring of the items as well as the establishment of the internal validity criterion of a minimum 70% agreement between the scores assigned by the assessor and an expert were fundamental in guaranteeing the homogeneity in the results. In addition, other resources, such as the recording protocol, contributed significantly to reliability results, reaching the maximum consistency and homogeneity.
Regarding the application of the GMFM-SP-88, a reliability coefficient of at least 0.70 may be enough for group comparisons. For clinical decision-making regarding an individual patient, however, an ICC of 0.90 is generally accepted as the minimum, so it can be stated that the GMFM-SP-88 can be used reliably both in clinical practice and in research.
Limitations of the Study
This tool assesses what the child “can do” in a controlled and standardized environment (“capacity”) rather than “what the child does” in the daily environment (“performance”). For this reason, supplementing the GMFM-SP-88 results with those from other measures that assess motor function in the child's usual environment (school, home, or residential area) should be considered to establish a relationship between the severity of the motor disability and functional autonomy. Likewise, the authors of the GMFM emphasize that evaluating the child with the child's orthoses or aids can provide results closer to the motor function patterns of daily life, offering a more realistic result.1
Given its evaluative purpose, and to strengthen the evidence on its psychometric properties, future work should study test-retest reliability and include a longitudinal study examining the responsiveness of the GMFM-SP-88 to change over time.
Increasing globalization affects both the clinical and scientific fields; outcome measures are therefore considered an essential resource for communicating the effectiveness of an intervention process to the scientific community.
In Spain, pediatric rehabilitation professionals, specifically those who specialize in treating children with CP, have a limited number of assessment tools. The GMFM-SP-88 constitutes a fundamental resource because it will allow these professionals to base their clinical practice on evidence: defining intervention objectives, obtaining objective information on the progress of children with CP, evaluating the efficacy of a therapeutic intervention, and observing changes produced after a surgical intervention, among other uses.
Likewise, it constitutes a highly relevant resource in research since it facilitates obtaining reliable results and applying adequate methodological rigor in studies. Also, the GMFM-SP-88 will support the validity of the conclusions for the subsequent clinical application of the research results.
CONCLUSIONS
The results of this study support the potential use, both in clinical practice and in research, of the GMFM-SP-88 as a reliable tool for the assessment of gross motor function in children and adolescents with CP between 5 months and 16 years old. The GMFM-SP-88 showed excellent intra- and interrater reliability, similar to that of the original English version. The age of children with CP was related to the total score and especially to dimension E. In addition, it has been shown that the GMFM-SP-88 discriminates between the different levels of the GMFCS with a strong correlation between dimension scores and severity level.
REFERENCES
1. Russell DJ, Rosenbaum PL, Cadman DT, Gowland C, Hardy S, Jarvis S. The Gross Motor Function Measure: a means to evaluate the effects of physical therapy. Dev Med Child Neurol. 1989;31(3):341–352.
2. Ketelaar M, Vermeer A, Helders PJ. Functional motor abilities of children with cerebral palsy: a systematic literature review of assessment measures. Clin Rehabil. 1998;12(5):369–380.
3. Harvey A, Robin J, Morris ME, Graham HK, Baker R. A systematic review of measures of activity limitation for children with cerebral palsy. Dev Med Child Neurol. 2008;50(3):190–198.
4. Ferre-Fernández M, Murcia-González MA, Barnuevo Espinosa MD, Ríos-Díaz J. Measures of motor and functional skills for children with cerebral palsy: a systematic review. Pediatr Phys Ther. 2020;32(1):12–25.
5. Debuse D, Brace H. Outcome measures of activity for children with cerebral palsy: a systematic review. Pediatr Phys Ther. 2011;23(3):221–231.
6. Adair B, Said CM, Rodda J, Morris ME. Psychometric properties of functional mobility tools in hereditary spastic paraplegia and other childhood neurological conditions. Dev Med Child Neurol. 2012;54(7):596–605.
7. Beckung E, Carlsson G, Carlsdotter S, Uvebrant P. The natural history of gross motor development in children with cerebral palsy aged 1 to 15 years. Dev Med Child Neurol. 2007;49(10):751–756.
8. Nordmark E, Hägglund G, Jarnlo GB. Reliability of the Gross Motor Function Measure in cerebral palsy. Scand J Rehabil Med. 1997;29(1):25–28.
9. Russell DJ, Avery LM, Rosenbaum PL, Raina PS, Walter SD, Palisano RJ. Improved scaling of the Gross Motor Function Measure for children with cerebral palsy: evidence of reliability and validity. Phys Ther. 2000;80(9):873–885.
10. Wei S, Su-Juan W, Yuan-Gui L, Hong Y, Xiu-Juan X, Xiao-Mei S. Reliability and validity of the GMFM-66 in 0- to 3-year-old children with cerebral palsy. Am J Phys Med Rehabil. 2006;85(2):141–147.
11. Brunton LK, Bartlett DJ. Validity and reliability of 2 abbreviated versions of the Gross Motor Function Measure. Phys Ther. 2011;91(4):577–588.
12. Ko J, Kim M. Inter-rater reliability of the K-GMFM-88 and the GMPM for children with cerebral palsy. Ann Rehabil Med. 2012;36(2):233–239.
13. Ko J, Kim M. Reliability and responsiveness of the Gross Motor Function Measure-88 in children with cerebral palsy. Phys Ther. 2013;93(3):393–400.
14. Almeida KM, Albuquerque KA, Ferreira ML, Aguiar SKB, Mancini MC. Reliability of the Brazilian Portuguese version of the Gross Motor Function Measure in children with cerebral palsy. Braz J Phys Ther. 2016;77:751–764.
15. Salehi R, Keshavarz A, Negahban H, et al. Development of the Persian version of Gross Motor Function Measure-88 (GMFM-88): a study of reliability. Trends Med Res. 2015;10(3):69–74.
16. Ferre-Fernández MF, González MAM, Díaz JR. Traducción y adaptación transcultural del Gross Motor Function Measure a la población española de niños con parálisis cerebral. Revista de Neurología. 2020;71(5):177–185.
17. Atkinson G, Nevill AM. Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Med. 1998;26(4):217–238.
18. Prinsen CAC, Mokkink LB, Bouter LM, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018;27(5):1147–1157.
19. Russell DJ, Rosenbaum PL, Lane M, et al. Training users in the Gross Motor Function Measure: methodological and practical issues. Phys Ther. 1994;74(7):630–636.
20. McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychol Methods. 1996;1(1):30–46.
21. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res. 2005;19(1):231–240.
22. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. Upper Saddle River, NJ: Financial Times/Prentice Hall; 2008.
23. Bland JM, Altman DG. Statistical methods for assessing agreement between 2 methods of clinical measurement. Lancet. 1986;1(8476):307–310.
24. Bablok W, Passing H, Bender R, Schneider B. A general regression procedure for method transformation. Application of linear regression procedures for method comparison studies in clinical chemistry, part III. J Clin Chem Clin Biochem. 1988;26(11):783–790.
25. Laibsirinon S, Earde P, Mahasup N. Interrater reliability Thai version of Gross Motor Function Classification System (GMFCS) in Thai children with cerebral palsy. Thai J Phys Ther. 2008;1(suppl 30):26–36.
26. Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual Life Res. 2012;21(4):651–657.