Martin, Kathy PT, DHS; Hoover, Donald PT, PhD; Wagoner, Erin DPT; Wingler, Teresa DPT; Evans, Tara DPT; O’Brien, Jamie DPT; Zeunik, Julie DPT
INTRODUCTION AND PURPOSE
Gait analysis is often a key component of a physical therapy examination because it is used to identify structural and activity limitations, plan interventions, and assess the effect of those interventions. Gait analysis can also be used to document change over time in a patient’s functional status. The gold standard for gait analysis is instrumented gait analysis that can supply precise quantitative data regarding kinematics and kinetics helpful in more fully describing this form of gross motor activity.1–3 In 1994, Harris and Wertsch4 predicted that instrumented gait analysis technology would become less expensive and therefore more readily available in the clinic. However, more than a decade later, although many low-cost motion analysis systems are widely available, physical therapists (PTs) have not adopted this technology on a widespread basis. Many authors have acknowledged that most clinical settings do not have the resources (time, money, or space) to devote to instrumented gait analysis.1,2,5
Despite these barriers, PTs still have a need to examine and document gait. As a result, numerous scales for observational gait analysis (OGA) have been developed in an attempt to systematically quantify and document gait. One such scale, the Physician’s Rating Scale, was developed for children with spastic cerebral palsy (CP) and originally published in 1993 without any documentation about its development, reliability, or validity.6 Since that time, a number of authors have made revisions to this scale and published data on its reliability.7–9 Other scales have been developed—either for specific populations or in an attempt to improve on the reliability of previous scales.3,10,11 Overall, most OGA scales have failed to achieve the level of consistency of instrumented gait analysis and are limited in their applicability to a specific patient population.12
Despite the fact that these scales are less reliable than instrumented gait analysis, there continues to be a dependence on OGA in the clinical setting. In pediatrics, gait analysis is most often used to assist clinicians with making decisions about orthopedic surgery or medical interventions for spasticity and to assist with assessing the efficacy of physical therapy or orthotic interventions. Dickens and Smith7 also noted that OGA is especially necessary in pediatrics because of the need for frequent reassessment. The timely assessment of the effect of an intervention may be hindered by the time required to obtain an appointment at a gait laboratory facility as well as the time needed to collect, analyze, and interpret the data.
A survey of physiotherapists in the United Kingdom explored this dilemma of choosing between a modestly reliable OGA scale and time-consuming and inconvenient gait laboratories.2 The authors found that 98.6% of pediatric physiotherapists stated that there was a need for gait assessment in clinical practice and that the top 5 things that a gait assessment tool should do include assessing the results of a physical therapy intervention, assisting in the diagnosis of gait abnormality, monitoring patient progress, getting a baseline assessment, and assisting with orthotic prescription. The respondents to this survey indicated that a gait assessment tool should be easy to use, reasonably quick to use, reliable, have clear and easy-to-understand directions, and be valid. The authors concluded that any new gait assessment tool must balance practicalities of use with scientific merit.
A review of the literature on OGA revealed that several scales for clinical use have been developed for children and adults with spasticity,1,3,6,7,11 and for orthopedic impairments,13 but no scale has been reported for children with Down syndrome (DS). Children with DS typically display gait deviations that differ from those often seen in children with spasticity; therefore, existing scales for OGA are not particularly relevant and lack specificity for this patient population.
A number of factors noted in the literature seem to contribute to gait deviations in children with DS. Ligamentous laxity, joint hypermobility, decreased trunk rotation, and poor postural stability are often noted in children with DS.14,15 These impairments contribute to a wide base of support (BOS), a waddling gait pattern, and lack of reciprocal arm swing during gait.15 A longer double-limb support time, decreased hip extension, increased hip external rotation, and increased hip abduction during gait are also often demonstrated by individuals with DS.15 The amplitude of mediolateral weight shifts during gait is usually increased compared with children developing typically, and this has been linked to the wider step width and/or an inability to dampen oscillations of head, arms, and trunk because of muscle weakness and joint laxity, which is common in DS.16
Orthoses are one of the common interventions to address gait deviations in children with DS, and there is some preliminary evidence in the literature that these interventions are successful.17,18 However, the evidence of any intervention, including orthoses, to improve gait is especially limited for this population, and PTs still must rely on gait analysis to assess the efficacy of these interventions in the clinical setting. Because instrumented gait analysis is impractical for most pediatric clinic settings, an OGA tool for children with DS is needed to improve PTs’ ability to assess and quantify the gait of children with DS.
The purpose of this study was to develop an OGA tool for use in children with DS and to examine its interrater and intrarater reliability. This project was a component of a larger study assessing the efficacy of 2 different styles of orthoses for children with DS. An existing database of digital videos from the larger study was used to examine the newly developed tool.
The research team included 7 members: 1 PT experienced in pediatrics with previous research experience in children with DS, 1 PT experienced in motion analysis and biomechanics, and 5 PT students.
The larger study from which the videos were obtained was conducted by the same research team and had been approved by the University of Indianapolis Institutional Review Board. The parent or guardian of each child with DS had given written informed consent for the child’s participation and for the child to be videotaped for research purposes.
The videos used for this study were chosen at random from the previously mentioned database. Each child with DS was videotaped walking at a self-selected speed across a level tiled floor in 1 of 3 conditions: wearing athletic-type shoes, shoes with foot orthoses, or shoes with supramalleolar orthoses. The children typically wore shorts and T-shirts for the videotaping, but this was not consistent and in some videos the children wore long pants. Four digital camcorders placed around the laboratory allowed both sagittal and frontal plane views of each child’s gait, and 3 to 6 video trials were collected for each child under each of the 3 conditions.
Development of the OGA Instrument
The research team began the process of developing a new OGA tool by reviewing the literature for existing OGA tools and the literature on children with DS. Based on these readings, the researchers identified key elements of a successful OGA tool and critical features of the gait of children with DS that are often addressed by physical therapy. It was decided that the OGA tool should focus on the key phase of gait (stance) where deviations are most problematic for children with DS rather than attempting to develop an all-inclusive description of gait. Based on the findings by Toro et al,2 the research team also wanted the OGA tool to be quick to administer with clear directions. Based on these factors, the researchers chose to include 6 items in the scale: upper extremity position, trunk position, hip position in terminal stance, knee position in mid-stance, foot position at initial contact, and BOS (Appendix 1).
Development and piloting of the new scale occurred in 3 phases. After the initial version of the scale was completed, it was piloted with 3 videos followed by a research team meeting. This led to revisions of the scale, a second pilot test using 15 different videos, and another team meeting that led to further revision and the final version of the scale. The final phase of development of this scale was a training session for all the research team members. These 3 phases are described in more detail below.
The OGA scale was initially developed based on the clinical experience of one of the authors (K.M.) in conjunction with a review of the literature. This initial version of the scale was piloted using 3 video files from a database of video files from a larger study involving children with DS and orthoses. Six members of the research team (excluding the biomechanics expert) scored these pilot video files independently, taking approximately 10 minutes per video. The team then met as a group to discuss the OGA scale, including its ease of use, clarity, and logistics of how to view each video file (time, speed, and repetition). As a result of this discussion, the initial scoring system was expanded from a 3-point system to a 4-point system, and each score was operationally defined for each item in the scale. Our decision to go to a 4-point scale was not based on the need to identify additional abnormalities but instead was based on the realization that some children in our sample had not fully mastered gait and were inconsistent in their gait pattern. In light of motor learning theory, we concluded that the inconsistency we saw may have been a reflection of emerging skill rather than pathological gait. Based on our observations in the pilot videos and research by Palisano et al,19 which suggests that children with DS need a longer time to master skills requiring postural control, we created a scoring category that reflected emerging skill. We wanted to identify a category that represented progress toward a more typical gait pattern but still acknowledged the lack of mastery of gait. In addition, our experience with the first 3 pilot video files and a review of the literature led the research team to decide that each researcher should be allowed to view a video file as many times as necessary with no time limit.
After agreeing on the revision of the scale from the initial pilot with 3 videos, the same 6 members of the research team piloted the revised scale a second time by independently scoring 15 randomly chosen videos from the database. After this second pilot of the scale, the research team again met to discuss the utility of the scale. This discussion led to the creation of operational definitions for the specific gait phase targeted by 4 of the 6 scale items: terminal stance, mid-stance, initial contact, and BOS. These definitions were added to the bottom of the score sheet for quick reference (Appendix). A final 30-minute training session was held for the research team to practice using the new definitions by reviewing several of the 15 videos used in the second pilot test.
To establish the interrater reliability of the scale, 60 videos were randomly selected from the existing database. The videos previously used for piloting and revising the scale were not a part of the sample of 60 used in this phase. The videos were independently scored by the 5 novice members (PT students) of the research team. Each rater was allowed to take as much time as necessary and used slow motion and freeze-frame features to score the video. Each rater used the same computer station and monitor to view the videos, and the videos were viewed in the same order.
Each of the 5 raters used 1 score sheet per video. The 300 total scores (5 raters × 60 videos) were then entered into a spreadsheet. Accuracy of data entry was monitored by having 1 member of the team read the scores while another member typed the score into the database and 2 additional team members observed to note any errors in typing.
Four months later, the 5 novice team members randomly selected 15 of the 60 videos that were used to establish interrater reliability and reexamined the videos twice (2 weeks apart) to determine intrarater reliability of the scale. Again, each examiner was allowed to take as much time as necessary and use slow motion and freeze-frame features to score the videos independently. Each rater used the same computer station and monitor to view the videos, and they were again viewed in the same order.
One score sheet per video was used by each of the 5 examiners. One hundred fifty total scores (5 raters × 15 videos × 2 examinations) were then entered into a spreadsheet. Accuracy of the data entry was monitored as previously described for data entry for interrater scores.
Data were analyzed using Statistical Package for Social Sciences version 15.0 (SPSS Inc., Chicago, IL). On review, it was determined that one of the novice examiners had completed the initial scoring of the 60 videos nearly 3 months after the training session, whereas the other 4 examiners completed this task within a few weeks of the training. Therefore, the results for this 1 examiner were excluded as a means of reducing the possibility of confounding variables that might affect the statistical analysis. Sixty videos were analyzed for interrater reliability results for the 4 remaining novice members of the research team. The intraclass correlation coefficient (ICC) was determined for the overall OGA scores among raters using a 2-way random effects model based on absolute agreement, generating a 95% confidence interval. The percentage of overall scores in absolute agreement and the percentages of scores that differed by 1, 2, 3, or 4 points were also calculated. In addition, ICCs were determined for each of the 6 individual items on the OGA scale to determine which parts of the scale were more reliable. Fifteen videos that were reviewed twice by the remaining 4 novice members were analyzed for intrarater reliability. The ICCs were determined for each individual member using a 2-way random effects model based on absolute agreement, generating a 95% confidence interval. Levels of reliability were determined based on the following scale: fair, ≥0.5; good, ≥0.6; and excellent, >0.75.20
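The analysis described above can be illustrated with a minimal sketch. It assumes the single-measure form of the 2-way random effects, absolute agreement ICC (often written ICC(2,1)); the text does not say whether single or average measures were reported, and the study itself used SPSS rather than this code. The function names and the example scores are hypothetical, and the confidence interval is omitted.

```python
def icc_2_1(ratings):
    """Single-measure ICC, 2-way random effects, absolute agreement.

    ratings: list of rows, one row per video, one column per rater.
    """
    n = len(ratings)       # number of videos (subjects)
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a 2-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def reliability_level(icc):
    """Label an ICC using the cut-offs stated in the text."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.6:
        return "good"
    if icc >= 0.5:
        return "fair"
    return "below fair"


# Hypothetical total OGA scores: 5 videos (rows) scored by 4 raters (columns)
example_scores = [
    [12, 13, 12, 14],
    [8, 9, 8, 8],
    [16, 15, 17, 16],
    [10, 12, 11, 10],
    [14, 14, 13, 15],
]
icc = icc_2_1(example_scores)
print(f"ICC = {icc:.3f} ({reliability_level(icc)})")
```

Applied to the overall ICC reported in this study, `reliability_level(0.663)` returns "good" under the stated cut-offs.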
The ICC for the overall OGA score was 0.663 (95% confidence interval: 0.553–0.762). Overall, these results demonstrate good agreement20 among the 4 examiners for the total score in this OGA tool. The examiners achieved complete agreement for 35.56% of the trials analyzed. The percentage of analyzed trials in which there was a variation in the overall score was as follows: difference by 1 point for 43.3% of trials, by 2 points for 13.06% of trials, by 3 points for 7.5% of trials, and by 4 points for 0.56% of trials. No scores differed by >4 points. Specific item interrater reliability revealed the upper extremity had the highest ICC across all the observers (1.00) followed by the trunk position (0.799). BOS had the lowest interrater ICC, at 0.499 (Table 1). The interrater standard error of measurement for absolute reliability was 1.89. Individual intrarater reliability revealed a range of 0.616–0.877 (Table 2).
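One way to produce an agreement breakdown like the one above is to compare the total scores of every pair of raters for every video. The sketch below assumes that pairwise tally, which the text does not confirm. With 4 raters and 60 videos it would give 360 comparisons, and the reported percentages are consistent with that reading (for example, 7.5% of 360 is 27). The function name and the example scores are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_difference_percentages(scores):
    """Percentage of rater pairs whose total scores differ by 0, 1, 2, ... points.

    scores: list of rows, one row per video, one total score per rater.
    """
    counts = Counter()
    for video_scores in scores:
        # Every unordered pair of raters contributes one comparison per video
        for a, b in combinations(video_scores, 2):
            counts[abs(a - b)] += 1
    total = sum(counts.values())
    return {diff: 100.0 * n / total for diff, n in sorted(counts.items())}


# Hypothetical total scores: 3 videos, 4 raters each
example = [
    [12, 12, 13, 12],
    [9, 10, 9, 11],
    [15, 15, 15, 15],
]
for diff, pct in pairwise_difference_percentages(example).items():
    print(f"differ by {diff}: {pct:.2f}% of comparisons")
```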
Instrumented gait analysis remains the gold standard for evaluation and assessment of gait dysfunction. However, cost, time, and access to instrumented gait analysis laboratories prevent routine clinical use of such tools. Therefore, development of reliable and accessible OGA tools is necessary for the wide variety of patient populations seen in clinical practice. Although several tools for evaluating children with CP have been developed and validated,1,3,7,21,22 currently no tools exist for children with DS. Children with DS, who typically present with low muscle tone, exhibit gait deviations different from those of children with CP, who often present with increased muscle tone. Thus, different assessment tools are warranted.
This OGA tool is different not only in its target population but also in its scoring system. Many OGA scales use a 3-point scoring system to identify normal, mildly abnormal, and markedly abnormal gait,1,5,23 or a 4-point scale to identify normal, mild deviation, moderate deviation, and severe deviation.11 Our 4-point system was designed to acknowledge emerging skill and reflect a developmental stage versus scoring a less mature gait pattern as abnormal. This approach was chosen to reflect the available evidence on development of gross motor skills in children with DS19 and is unique to this scale.
The reliability of some previously published OGA scales was established by having the evaluators view and score just 1 gait cycle chosen by the researchers.7,12,21 Although these authors used this strategy to more carefully control the variables in their studies, we did not think this was an appropriate strategy for this study. Our participants were young children with DS who were still acquiring and refining gross motor skills. Scoring just a single gait cycle may not have given an accurate picture of the child’s actual skill. Similarly, clinicians in the field often make multiple observations of a child’s gait to assess it qualitatively. Thus, we observed 3–5 gait cycles per participant depending on step length and length of time that the child was in the camera’s field of view. Rather than averaging the scores for each individual gait cycle, we chose to expand our scoring to account for inconsistent or emerging skill. We included an operational definition for this scoring category, “inconsistently normal,” on the score sheet for easy reference (Appendix).
The results of this study indicated good overall interrater reliability (0.663), which is comparable with the results of other OGA studies.7–10 In addition, examiners in this study showed absolute agreement for 35.56% of the scores analyzed, and 78.86% of the scores were found to vary by 1 point or less. The absolute agreement findings in this study are not as good as the results of other OGA studies, which found absolute agreement 58%–77% of the time.3,5,21 Although this study found good interrater reliability but less absolute agreement, these comparisons with other published studies must be made cautiously because previous studies were on OGA scales for children with CP, who typically present with gait deviations different from those of children with DS.
The standard error of the measurement for the interrater reliability data in this study was 1.89. This indicates that the true measure of each child’s gait was within ±3.7 of their reported score. Because the OGA scale used in this study had a maximum score of 18, this means that a child’s score would have to change by at least 20.5% for the change to be attributed to something other than possible measurement error. Although a clinically important change has not yet been established for this OGA tool for DS, this amount of measurement error is comparable to the study published by McGinley et al12 in which their estimate of measurement error was 22.7% of the total score of their OGA tool. However, our results differed from those of a study by Krebs et al5 that found that the magnitude of change only needed to be greater than 10% to be 95% confident that a real change had occurred.
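The arithmetic behind these figures can be retraced in a short sketch. The multiplier 1.96 (a 95% band under a normal approximation) is an assumption. The text does not state it, but it reproduces the reported ±3.7. The variable names are illustrative only.

```python
SEM = 1.89        # interrater standard error of measurement reported in the text
MAX_SCORE = 18    # maximum total score on the OGA scale
Z_95 = 1.96       # assumption: normal-approximation multiplier for a 95% band

half_width = Z_95 * SEM                       # points either side of an observed score
percent_of_max = 100 * half_width / MAX_SCORE

print(f"95% band: +/-{half_width:.1f} points")
print(f"share of maximum score: {percent_of_max:.1f}%")
```

The unrounded half-width gives a share slightly above the 20.5% quoted in the text. The small gap comes from rounding the half-width to 3.7 before dividing by 18.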
The item for assessment of the upper extremities was found to have the highest level of interrater reliability among observers (1.00). This component of the observational gait assessment tool was easier to observe because of less clothing.1,24 In addition, the videos used in this study were originally gathered for a study (Kathy Martin, PT, DHS, unpublished data, 2007), which had an inclusion criterion that required children to walk independently for at least 30 m without support. Therefore, all the children showed a mature enough gait pattern to include arm swing, likely contributing to the high reliability of this item. Reliability of this item may be different with children with DS who have less walking experience or skill.
Assessment of trunk position was also found to have high interrater reliability (0.799). This excellent reliability was attributed to the ease of visibility of trunk position relative to vertical and the availability of a sagittal camera view to assess trunk position. Potential causes of relatively lower reliability include alterations in the gait speed (trunk lean when increasing speed) and increased anterior pelvic tilt in some individuals, which may have obscured the visual reference for vertical alignment.
Although the item for assessing hip position had good interrater reliability (0.703), it presented greater variability compared with the OGA analysis of the trunk. The hip joint was often obscured by soft tissue and clothing. In addition, the hip lacked a clear point of reference because of the often less than ideal alignment of the pelvis. A study performed by Toro et al21 cited similar limitations in analyzing the hip position in children with CP. This study demonstrates that good reliability can be achieved even though the hip presents with greater soft tissue and decreased visibility of the bony landmarks. These are barriers that are also found in clinical practice.
The items for assessing the position of the knee in mid-stance and foot position at initial contact were found to have very similar ICCs for interrater reliability (knee = 0.611, foot = 0.593). Reliability of the assessment of the positions of these 2 joints at key phases of the gait cycle may have been affected by many of the same issues cited for the hip, including clothing or soft tissue obscuring landmarks. In addition, there were occasions when we did not have a true sagittal view, such as when the child did not walk in a straight line across the laboratory and the precise moment of mid-stance or initial contact did not occur directly in front of the camera. A few children also wore shoes that were similar in color to some of the floor tiles in the laboratory. Without a sharp color contrast, foot position was at times difficult to ascertain in video analysis. Our results differ from those of 2 previously published studies that noted that the knee joint was the most reliably evaluated.7,21 The findings for the foot position at initial contact in our study are similar to the findings of Dickens and Smith7 but not quite as high as the reliability values reported by Toro et al.21
The interrater reliability of BOS was found to be the lowest at 0.499, which is considered fair.20 Whereas this variable demonstrated the lowest interexaminer reliability in this study, a study by Eastlack et al23 found an ICC of only 0.23 for step width. The lower reliability regarding BOS may be partly attributed to the lack of pure frontal or posterior camera views in the digital videos used in this study. In addition, 12-inch floor tiles were used as a reference point for determining the width of BOS, which made it difficult to precisely estimate the magnitude of BOS. Normal BOS was defined as 3 to 4 inches for this item in the OGA instrument used in this study. Therefore, given the frame of reference used, examiners had to estimate the width of the child’s BOS. Another factor may have been that the children in our study often demonstrated a great degree of variability in their gait, such as the inability to ambulate in a straight line, likely because of decreased postural control15 or distractions in the testing environment. These changes in direction required examiners to use their clinical judgment in determining whether an individual step truly reflected inconsistent step width or a directional change on the part of the child. Finally, the videos in this study often consisted of only 3 to 5 gait cycles. Given our operational definition of inconsistently normal (observed 50% to 75% of the time), a change in step width in a single gait cycle could have a large effect on the child’s score. However, these issues are again likely to be representative of typical clinical practice.
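The sensitivity to a single gait cycle can be made concrete with a small sketch. It lists, for videos of 3 to 5 cycles, how many normal cycles fall inside the 50% to 75% band for "inconsistently normal." Treating both ends of the band as inclusive is an assumption, and the function name is hypothetical. How proportions outside the band map onto the other scoring categories depends on the score sheet and is not modeled here.

```python
from fractions import Fraction

# Band for "inconsistently normal" as described in the text.
# Assumption: both boundaries are treated as inclusive.
LOWER, UPPER = Fraction(1, 2), Fraction(3, 4)


def in_band_counts(n_cycles):
    """Numbers of normal cycles (out of n_cycles) whose proportion lies in the band."""
    return [
        k for k in range(n_cycles + 1)
        if LOWER <= Fraction(k, n_cycles) <= UPPER
    ]


for n in (3, 4, 5):
    step = 100 / n  # percentage points moved by changing one cycle
    print(f"{n} cycles: in-band counts {in_band_counts(n)}, "
          f"one cycle shifts the proportion by {step:.0f} points")
```

With only 3 observed cycles, exactly one count lands in the band, so reclassifying a single step is enough to move the item to a different score.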
Intrarater reliability for this study was found to be 0.616–0.877. Intrarater reliability is typically higher than interrater reliability in many studies. The study by Toro et al,21 which examined the Salford Gait Tool for children with CP, cited good intrarater reliability but did not report ICC statistics; thus, a direct comparison is not possible.
Video recording was used to decrease the variability with observation. Use of videotaped gait trials allowed breakdown of complex events that are not visible to the human eye in real-time observation.2 This method of observation also allowed raters to view 1 video of a single series of 3 to 5 gait cycles, thus decreasing a potential source of variation by eliminating gait changes secondary to fatigue of the participants.2 Each researcher in this study viewed each video in the same order, thus limiting the possible training effect. In addition, each observer was allowed to review the videos as necessary, including stopping and rewinding, to fully observe each component. Although allowing each observer to take as much time as necessary may have decreased the standardization in this study, it more accurately reflects clinical practice and thus may make these results more generalizable to a clinical setting. Some clinicians admittedly may not have the ability to videotape and then review the tape to assess gait in the clinic, and this scenario would definitely limit the generalizability of our findings for those clinicians. However, video cameras are likely to be more available than a gait laboratory for most clinicians and thus the use of a video recording seems like a reasonable compromise for the clinical assessment of gait. Thus, our study’s use of video may make the results more relevant for many clinicians.
This study used second-year Doctor of Physical Therapy students, whom we deemed novice clinicians with emerging qualitative motion analysis skills. This approach arguably provides a more realistic representation of clinical practice than would be the case had this study involved expert clinicians because not all clinicians are experts in gait analysis.21 This issue of experience with gait analysis bears consideration when generalizing the findings of this study to clinical practice. Four other studies have examined this difference in reliability for OGA between novice and experienced clinicians.5,13,24,25 One study performed by Krebs et al5 found good reliability (ICC = 0.73) among expert clinicians. In a study of gait analysis for patients with orthopedic impairments, Brunnekreef et al13 did not find a significant difference between experienced and inexperienced raters but did note that experts were more reliable (ICC = 0.54 for experts, 0.42 for experienced, and 0.40 for inexperienced). Their conclusion was that some experience did not matter but that a lot of experience did. Similarly, a study involving 6 medical students who used the Edinburgh Visual Gait Score found that the novice medical students were reliable but less accurate in using the scale compared with the experienced raters who participated in the initial development of the scale.25 In contrast, another study involving patients with multiple sclerosis demonstrated that results from PT students were similar to those from master clinicians in recognizing gait deviations using the Rancho Los Amigos Gait Analysis checklist.24 Although the literature may be somewhat mixed, our results do seem similar to most of the published results.
This study was strengthened by the final phase of the pilot testing, which was a training session for all researchers with the final version of the scale. In a study performed by Eastlack et al,23 the authors concluded that a training session would have been helpful and likely would have improved their results. During the training session in this study, examiners discussed and agreed on discrepancies in assessment of the same video. Also, examiners were informed of operational definitions of normal gait kinematics that served to diminish variability based on individual perceptions of “normal” gait kinematics.23 Because our training session was short (30 minutes), this observational gait assessment tool can be both cost- and time-effective for clinical use.
Several factors affected the outcomes of this study. The young children with DS who were observed had inherently variable gait because of their emerging skills.14,15 This is a factor that may affect the outcomes in any study that uses young children with developmental delays. In addition, the cognitive impairment seen in children with DS limited many of the participants’ ability to follow commands such as “walk slowly” or “walk in a straight line.” Children were often inclined to run or change direction, or they were distracted, which altered the view of various body parts. For example, when a child changed directions, this could cause internal rotation of 1 limb, making it difficult for examiners to assess the hip extension and/or BOS. As a result, it was often difficult to obtain consistent, clear video trials with ideal views of the participants’ gait.
Examiner consistency could be improved by using markers over the bony landmarks to allow better assessment of joint position throughout the gait cycle. Markers were not used in this study, partly because the original videos were collected for a different purpose that did not require markers for precision. In addition, it was likely that the children would have been distracted by the markers, which may have altered gait patterns. Markers are not typically used in a clinical setting; thus, our results may be more realistic for clinical practice. In addition, children should be required to wear tighter fitting or standardized clothing, including shorts, thus decreasing errors caused by obscured joint position. When using a scale that considers emerging skill (inconsistency), additional gait cycles should be visualized to account for the variability in a child’s gait.
Finally, the use of a video camera could be considered either a strength or a limitation, depending on whether this option is available to a clinician. The use of video allows repetitive viewing without patient fatigue. We attempted to use video data that would be reasonably easy to collect in a clinical setting, but we did have 4 cameras available versus using just 1 as is likely in the clinic. Ideally, even if only 1 camera is used, multiple views, including frontal, sagittal, and posterior view, would allow examiners to better visualize various gait components. Some clinicians may view this need to tape multiple views as a barrier to implementing this type of assessment in the clinic.
Because of the ease of use and the accessibility of OGA, it is used regularly and preferred by PTs.1–3,12 An OGA tool such as this provides a more objective means to measure what is typically a subjective assessment. Currently, there are no other noninstrumented tools available to assess gait in children with DS. The assessment tool that we developed demonstrated good20 interrater and intrarater reliability, comparable to that of previously published OGA tools for other patient populations. However, the validity of this tool must be determined by comparing it with the gold standard of instrumented gait analysis. Sensitivity, specificity, and clinically meaningful change in score also need to be evaluated. In the research setting, this tool was simple to use; however, further investigation must be conducted to determine the ease of use and applicability in the clinical setting.
The research team thanks Margaret Finley, PT, PhD, and Clyde Killian, PT, PhD, for their assistance with data analysis and interpretation. In addition, the authors express their appreciation to the children and families who participated in the larger orthotic intervention study.
© 2009 Lippincott Williams & Wilkins, Inc.