Ruck-Gibis, Joanne MSc A, PT; Plotkin, Horacio MD; Hanley, James PhD; Wood-Dauphinee, Sharon PhD, PT
Osteogenesis imperfecta (OI) is a genetic disease of connective tissue with an incidence of 6.5 per 100,000 and a prevalence of one per 10,000 individuals. 1 Clinical features include bone fragility, reduced bone mineral density (BMD), fractures, progressive long bone and vertebral deformities. 2–5 Other clinical features are joint hypermobility and muscle weakness. 6 Developmental milestones may be delayed or arrested in more severely affected children. 7–9
The most widely accepted classification, developed by Sillence, Senn, and Danks, 10 is based on modes of inheritance, radiological and clinical findings. 5–11 Type I is the mildest form of OI, whereas type II is lethal in the perinatal period. Type III is the most severe non-lethal form, with frequent fractures, marked deformity and stature below the third percentile. 12,13 Type IV includes a heterogeneous group of individuals, who do not fit the type I or III profiles.
The goal of management of these children by orthopedic surgeons is the prevention of deformities and fractures, the correction of deformities through intramedullary fixation and ultimately the improvement of function, in particular ambulation. 14 Different treatments (vitamin D, calcitonin, anabolic steroids, and fluorides) have been unsuccessful in the treatment of this condition. 15 However, patients treated with disodium pamidronate have shown increases in BMD, subjective improvements in well-being, and relief of chronic pain from microfractures; some children have also shown improved mobility and ambulation skills. 16,17
The goal of physical therapy management of these children with multiple fractures, skeletal deformities, weakness, joint hypermobility, and gross motor delay is maximization of functional independence. 6,11,14 Intervention strategies are based on knowledge of the child’s achievement of developmental milestones and focus on improving muscle strength, muscle stabilization of the joints as well as promoting functional ability. 8
Children with OI who participate in comprehensive rehabilitation programs that combine physical therapy, lower extremity bracing and, when indicated, orthopedic surgery have demonstrated high levels of function. Some children gained the ability to ambulate, which they might not have achieved without comprehensive rehabilitation. 18 The effect of physical therapy alone has yet to be demonstrated. One recent study documented change in sitting, standing, and ambulation over a 14-month period. 19
Measurement of Gross Motor Function in OI
A few studies have examined the relationship between the achievement of developmental function and eventual mobility status. 8,20 In infants with types III or IV presentation, developmental milestones are delayed and the order of achievement of milestones differs from that normally expected. 8,21–23 Static milestones develop at an earlier stage than dynamic milestones especially in children with type III OI. For example, sitting independently occurs before lifting the head in prone. The fragility of bones of the upper extremity makes transition skills very precarious; relative macrocephaly in comparison to the length of the body may be a contributing factor. Transition skills are, therefore, slow to develop. 22
One study 23 was found that used the Peabody Developmental Motor Scale, standardized up to 83 months of age, to describe the gross motor development of children with OI. Only a few studies of OI have incorporated selected aspects of gross motor development. Haley described a disability profile using the Pediatric Evaluation of Disability Inventory (PEDI), a measure of self-care, caregiver assistance, mobility, and social function, administered by structured interview and based on parent report. 24 Engelbert et al 21 reported that Dutch children under the age of seven and a half years with types III and IV OI scored more than two standard deviations below the median in the mobility domain.
One limitation of administering some standardized scales to children with bone fragility is the amount of risk involved. 25–27 Examples of risky items include walking on a balance beam five centimeters wide, jumping hurdles, jumping in the air while turning, push-ups, and skipping. Another inadequacy of some of these measures for this population is the failure to take into consideration the use of adaptive aids such as orthoses, canes, crutches, and walkers. Tests should reflect improvements over time while accounting for the amount of external support required for gross motor function. Further, measures that require high levels of gross motor function and exclude walking aids cannot be used to detect small increments of change in moderately and severely affected children with OI. Children with types III or IV presentations often use crutches, canes, or walkers to ambulate.
The Gross Motor Function Measure (GMFM) was constructed specifically for the purpose of evaluating change in gross motor function in children with developmental disabilities, in particular children with cerebral palsy (CP). 28 This instrument consists of 88 items, which have been grouped into the following five dimensions: lying and rolling; sitting; crawling and kneeling; standing; and walking, running and jumping. Each item is scored on a four-point scale. Values are assigned from zero to three, depending upon the percentage of acquisition: zero: does not initiate, one: initiates (<10% of the task), two: partially completes (10% to <100% of the task), and three: completes the task. Each dimension has a different number of items and therefore a different maximum score. The raw score in each dimension is converted into a percentage of the maximum per dimension. Each dimension is equally weighted and the total score is calculated by summing the percentages of each dimension and dividing by five. The validation sample included children aged five to 60 months, 111 with CP, 25 with head injury, and 34 preschool children who were not disabled. Of the 111 children with CP, 88 had spastic-type CP; 23 had nonspastic-type CP. Two of 23 with nonspastic CP were classified as hypotonic.
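The scoring rule described above can be illustrated with a minimal sketch. It assumes the item counts of the published GMFM-88 score sheet (17, 20, 14, 13, and 24 items for dimensions A through E), which this article does not list; the function and variable names are hypothetical and the raw scores in the example are invented, not study data.

```python
# Assumed item counts per dimension (GMFM-88 score sheet, not stated in this
# article). Every item is scored 0, 1, 2, or 3.
ITEMS_PER_DIMENSION = {
    "A_lying_rolling": 17,
    "B_sitting": 20,
    "C_crawling_kneeling": 14,
    "D_standing": 13,
    "E_walking_running_jumping": 24,
}
MAX_ITEM_SCORE = 3


def dimension_percent(raw_score, n_items):
    """Convert a raw dimension score into a percentage of its maximum."""
    max_score = n_items * MAX_ITEM_SCORE
    if not 0 <= raw_score <= max_score:
        raise ValueError("raw score outside the possible range")
    return 100.0 * raw_score / max_score


def gmfm_total(raw_scores):
    """Return (percent per dimension, total score).

    The total is the unweighted mean of the five dimension percentages,
    so each dimension counts equally regardless of its number of items.
    """
    percents = {
        dim: dimension_percent(raw_scores[dim], n_items)
        for dim, n_items in ITEMS_PER_DIMENSION.items()
    }
    total = sum(percents.values()) / len(percents)
    return percents, total


# Invented example: full marks in A and B, half marks in C, zero in D and E.
example_raw = {
    "A_lying_rolling": 51,
    "B_sitting": 60,
    "C_crawling_kneeling": 21,
    "D_standing": 0,
    "E_walking_running_jumping": 0,
}
example_percents, example_total = gmfm_total(example_raw)
```

Because the total is an average of percentages, a child who scores zero in two dimensions can still obtain a mid-range total, which is one reason the dimension scores are reported alongside the total.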
The intrarater and interrater reliabilities for repeated administration of the measure were estimated by intraclass correlation coefficients (ICCs). An estimate of 0.75 was considered acceptable for all reliability coefficients. The interrater reliabilities ranged from 0.87 to 0.99 across the five dimensions and 0.99 for the total score. The intrarater reliabilities ranged from 0.92 to 0.99 across the dimensions and 0.99 for the total score. Validity was assessed by correlating scale change over a period of four to six months on the GMFM with observer judgments of parents (r = 0.54), therapists (r = 0.65) and blinded assessors (r = 0.82). 28,29 In another study, the interrater reliability estimate of the total GMFM for children with Down syndrome was greater than 0.90 (ICC). However, the lying and rolling dimension and the crawling and kneeling dimension showed more variability than the others did (0.73 and 0.88, respectively). 30 Responsiveness of the GMFM for children with CP was adequately demonstrated in three studies, which assessed children with CP after a rhizotomy, a fitness program, and intensive physical therapy. 31,32
Streiner and Norman 33 stated that the psychometric properties of a measure are intimately linked to the specific population to which one wants to apply the measure. It cannot be generalized to other populations without being tested. This measure was developed for children with CP and has only been tested for children with neurological conditions. Although children with type III or IV OI, particularly as infants, are weak and exhibit significant motor delays, as do children with CP, the major impairment in OI is abnormal collagen. 6–9 Intra- and interrater reliability of this measure have never been tested on a population with primarily orthopedic dysfunction.
One advantage of using the GMFM for this population is the inclusion of orthoses and walking aids in the calculation of the score. Every dimension may be scored with or without aids. Progress over time can be detected, as the child requires less bracing and support for ambulation to accomplish the same tasks. The GMFM also involves less risk for fractures than do some of the norm-referenced measures of gross motor function. Partial scores are given for initiation of the movement; thus, the risk is reduced. For children with CP, this instrument is appropriate from infancy through adolescence. Our long-term goal for the GMFM is that we will be able to measure change over time in children from infancy to 18 years of age with moderate and severe OI.
The first objective of this study was to evaluate the interrater reliability of the dimensions and total score of the GMFM when administered to children (infancy to 18 years of age) diagnosed with types I, III, or IV OI. Interrater reliability is the extent to which different raters obtain consistent scores when measuring the same patient. The second objective was to evaluate the intrarater reliability of the measure. Intrarater reliability assesses the consistency of repeated measures by the same rater over a short time period.
Study Population: Raters
The population of raters included all five physical therapists employed at the Shriners Hospital for Children in Montreal. Because of the small size of the physical therapy department, the first author was included in the group of raters. The first author has 21 years of clinical experience in pediatrics. The remaining therapists’ pediatric physical therapy experience ranged from five to eight and a half years, the mean being six and three-quarter years. Four therapists graduated from McGill University, the other from the University of Montreal. All therapists had used the GMFM with children who have CP.
Training Using the GMFM
Three of the five therapists, including the first author, attended a GMFM workshop taught by the developers of the measure. As part of the workshop, the ability of participants to score a sample of items from the GMFM from videotaped performances of children with CP was assessed. The three therapists attained the required level of agreement to be certified (weighted kappa >0.80). The remaining two therapists were instructed by the first author in the use of the measure with children diagnosed with CP. The entire group was criterion tested in October 1998 for use of the GMFM with children with CP. A videotape of several items, provided by the McMaster University GMFM group, was scored. The levels of agreement with the criterion videotape, expressed as weighted kappa, ranged from 0.86 to 0.97.
Before commencement of this study, the four therapists attended another training session on the administration and scoring of the instrument for children with OI. Each therapist administered the GMFM to a few children. The first author developed a sheet of instructions to assist in the scoring of the GMFM. Each therapist individually scored videotapes of GMFM evaluations of two children with types III and IV OI. The group then discussed the videotapes. These two children were not included in the study.
Study Population: Patients
The participants in this study were selected from the 150 patients diagnosed with OI who were followed at the Shriners Hospital for Children in Montreal and were part of a cyclical pamidronate protocol. The catchment area includes Canada and the New England states; however, patients from around the globe are now being treated with the pamidronate therapy at the Hospital. Although those from outside the province tend to have more serious clinical presentations, they still represented the spectrum of clinical manifestations of the condition.
Children who had radiographically confirmed fractures within six weeks before the evaluations were excluded. At the time of the evaluation, any child who was in traction or immobilized in splint, back slab, or cast after surgery was excluded also. The GMFM could not be administered safely under these conditions.
A sample of 19 children was selected according to age and severity of presentation of the condition. These children were representative of the spectrum of patients treated at the Hospital. Informed consent was obtained from a parent of each child before the videotaped sessions. The hospital review board approved the protocol. The clinical profiles described in Table 1 illustrate the heterogeneity of the sample. All the children were on cyclical pamidronate therapy. The nine boys and 10 girls ranged in age from eight months to 17 years 11 months. The mean age was 7.89 years and the median was six years. The participant group included two children with type I, nine children with type III, and eight children with type IV OI.
The author administered and scored the GMFM for the 19 patients between June 1, 1998 and March 29, 1999. All evaluations were videotaped by one of two audiovisual technicians. A minimum of six weeks later, from February 1 to May 20, 1999, the videotapes were scored individually by the four other physiotherapists employed at the Shriners Hospital. The author also rescored the evaluations from the videotape. This interval reduced the likelihood that the author would recall the scores from the first evaluation. The physical therapists were asked not to discuss their scores. Over the several-week period of scoring the videotapes, the therapists realized that some items were more difficult to rate than others and the author referred them back to the GMFM manual.
For each item, the best of three tries was used as required by the GMFM manual. The therapists tallied the data, including the score for each dimension and total score as previously described. The participants were given the opportunity to attempt all items of the GMFM evaluation that were deemed safe. However, any tasks that were perceived by the family, child, or physical therapist to put the child at risk for fracture were not attempted.
A number of studies have used videotapes of patients to permit multiple raters to observe the same performance. 29,34 According to Gross and Conrad, 35 videotaping permits less biased estimates of reliability. It also facilitates scheduling of patient evaluations when the interrater reliability of several raters is involved. Organizing several therapists to evaluate one patient at the same time was not feasible in our clinical setting. In the present study, the child was videotaped from angles that best permitted complete viewing of the specific task. Occasionally, the best view was not obtained but it was consistent across therapists. One of the disadvantages of videotaping the evaluations is that the determination of reliability could be overly optimistic because the viewing angle was consistent and there was only one rater interacting with the child.
Intraclass correlation coefficients (ICCs) were calculated. The ICC is the measure of agreement for ordinal data that meet the assumption for item summation. It is the ratio of the variance between patients over the sum of variance between patients and the variance between raters and error. Perfect concordance without any variance in scores will yield a value of one. Reliability coefficients of 0.80 and greater are considered high for group estimates. However, when a measure is to be used in clinical decision making for an individual patient, criteria that are more stringent are recommended. 36,37 An ICC of 0.90 for a total score is generally accepted to be the minimum required for clinical decision-making.
The interrater reliabilities were calculated with the ICC derived by a two way random effects analysis of variance as described by Shrout and Fleiss. 38 Model 2.1 is recommended for use in interrater reliability studies, where all subjects are evaluated by each of a number of raters, who are considered representative of a larger population of similar raters. It was our opinion that the physical therapists at the Shriners Hospital were representative of all physical therapists with experience working in pediatric settings who are familiar with the measure. An approximate one-sided 95% confidence interval (random raters) for the ICC was calculated using equation 1.68 according to Fleiss. 39
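As an illustration of the two-way random effects calculation, the following minimal sketch computes the single-rater ICC of Shrout and Fleiss Model 2.1 from the analysis-of-variance mean squares. It is a sketch under stated assumptions, not the authors' analysis code: the function name and the example ratings are hypothetical, the ratings are not study data, and the confidence interval step is omitted.

```python
def icc_2_1(scores):
    """Two-way random effects, single-rater ICC (Shrout and Fleiss ICC(2,1)).

    scores: list of rows, one row per subject and one column per rater,
    with every subject scored by every rater.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    rater_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_error = ss_total - ss_subjects - ss_raters

    bms = ss_subjects / (n - 1)              # between-subjects mean square
    jms = ss_raters / (k - 1)                # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical ratings: six subjects (rows) scored by four raters (columns).
illustrative_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
illustrative_icc = icc_2_1(illustrative_ratings)
```

In this form, systematic differences between raters (the between-raters mean square) lower the coefficient, which is why the model suits the question of whether scores from one therapist can be interchanged with those of another.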
For intrarater reliability, Model 3 derived from a one-way random effects analysis of variance is recommended. It is the estimate of the ratio of the difference of patient variance and the error variance over the sum of the patient and error variance. An estimate of the intrarater ICC was calculated using equation 1.2 of Fleiss. 41
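The verbal description above can be sketched as a one-way analysis of variance in which the numerator is the difference between the between-patient and within-patient mean squares. The sketch below is an assumption about how that description maps onto mean squares; it has not been checked against equation 1.2 of Fleiss, and the function name and example scores are hypothetical.

```python
def icc_one_way(scores):
    """One-way random effects, single-measurement ICC.

    scores: list of rows, one row per patient and one column per repeated
    measurement (for example, live score and videotape rescore).
    With two measurements per patient the formula reduces to
    (BMS - WMS) / (BMS + WMS).
    """
    n = len(scores)        # number of patients
    k = len(scores[0])     # measurements per patient
    grand = sum(sum(row) for row in scores) / (n * k)
    patient_means = [sum(row) / k for row in scores]

    # Between-patient mean square
    bms = k * sum((m - grand) ** 2 for m in patient_means) / (n - 1)
    # Within-patient (error) mean square
    wms = sum(
        (x - patient_means[i]) ** 2
        for i, row in enumerate(scores)
        for x in row
    ) / (n * (k - 1))

    return (bms - wms) / (bms + (k - 1) * wms)


# Hypothetical total scores (percent) for five patients, each scored twice.
repeat_scores = [
    [12.0, 13.0],
    [45.5, 44.0],
    [60.0, 61.5],
    [82.0, 82.0],
    [97.0, 96.0],
]
repeat_icc = icc_one_way(repeat_scores)
```

With widely spread patient scores and small differences between the two scorings, as in the invented example, the coefficient approaches one.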
Cohen’s kappa is a chance-corrected measure of agreement, which describes interrater agreement beyond the crude agreement expected by chance alone. 40 A kappa of one indicates perfect agreement and a kappa of zero indicates agreement no better than chance. According to Landis and Koch, 41 values greater than 0.75 are usually considered to represent excellent agreement between raters. Values between 0.40 and 0.75 represent moderate agreement and those below 0.40 represent poor agreement. Simple kappa was estimated for each item demonstrating higher levels of disagreement than the majority.
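To make the chance correction concrete, the following minimal sketch computes simple (unweighted) kappa for one pair of raters scoring a single GMFM item across several children. The function name and the item scores are hypothetical and are not taken from the study data.

```python
import math
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Simple (unweighted) Cohen's kappa for two raters.

    rater_a, rater_b: equal-length sequences of category labels, here the
    0-3 scores that two raters gave the same item for the same children.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same, non-empty sample")
    n = len(rater_a)

    # Crude (observed) agreement: proportion of identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one and the same category for every child;
        # kappa is undefined (0/0) in this case.
        return math.nan
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores from two raters for one item across eight children.
scores_rater_1 = [3, 3, 2, 1, 0, 3, 2, 1]
scores_rater_2 = [3, 3, 2, 2, 0, 3, 1, 1]
item_kappa = cohen_kappa(scores_rater_1, scores_rater_2)
```

Note that when nearly all children receive the same score on an item, chance agreement is high and kappa can be low even if crude agreement is high.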
Figures 1 through 6 illustrate the degree of agreement among raters for the entire sample. The figures indicate the participants’ scores for the specific dimension or total score. The first five recordings are the scores given by raters one through five for the first child. The next five scores are for the second child in the same order of raters. This convention is used for all 19 children in the same order as listed in Table 1. None of the total or individual dimension scores overlap with each other. The interrater and intrarater reliability results are shown in Table 2. We established a priori that the ICC should be at least 0.90 for the total score and 0.85 for each dimension.
The interrater reliabilities estimated by ICCs ranged from 0.98 to 0.99 across the five dimensions and the total score. The lower confidence limits ranged from 0.97 to 0.99. The lowest ICC and confidence limits were noted in the first dimension: lying and rolling. The intrarater reliabilities of the author estimated with a live evaluation compared with videotape of the evaluation were consistently 0.99 for all dimensions and the total score.
Despite these high ICC values, it was apparent by visual inspection that a few items demonstrated more disagreement than the majority: items 3, 4, and 19. Simple kappa, computed for items 3, 4, and 19, ranged from 0.55 to 0.91. The confidence limits were 0.39 to 1.00. One pair of raters had perfect agreement for item 4. In summary, despite disagreement on a few items, both intra- and interrater reliability estimates of pediatric physical therapists trained in the administration of the GMFM for use with children diagnosed with OI were high.
Our results with a group of children diagnosed with OI are higher than those estimated for children with CP. Although the intra- and interrater reliability results obtained by Russell and colleagues 28 were very similar to those found in this study, there were a few differences. Russell’s group estimated an interrater reliability ICC of 0.87 for dimension A, whereas our result was 0.98. Their reliability for the sitting dimension was 0.92, whereas ours was 0.99; for the standing dimension it was 0.92, whereas ours was 0.99. There are several possible explanations as to why we achieved higher estimates. Unlike some children with CP, children with OI do not have cognitive or motor-planning deficits. Repeated measures of the GMFM in a population of children with CP may have resulted in lower intrarater reliability scores, as the child’s gross motor function may have been altered by these factors. In our study, the consistency of scoring between raters may have been facilitated by the children’s ability to comprehend and achieve the tasks.
Bjornson and colleagues 42 evaluated the validity of the GMFM for 37 children diagnosed with spastic diplegia, who participated in a randomized clinical trial that addressed the efficacy of selective dorsal rhizotomy. As part of their study, they estimated the interrater reliability of the six evaluators and the lead physical therapist. Interrater reliability was monitored quarterly from videotapes using the lead physical therapist responsible for training and supervision as the gold standard. They maintained a point-by-point agreement of more than 0.90. The ICCs ranged from 0.80 to 1.00. Information regarding dimension scores is not available in the literature, nor is the amount of rater training in the administration and scoring of the measure. In our study, all the physical therapists passed criterion testing of the measure for children with CP.
Difficulties with measurement consistency also occurred when the GMFM was used for children with Down syndrome. Russell and colleagues 30 conducted a psychometric study of the GMFM with this population. Two pediatric physical therapists assessed a group of 22 children on two occasions separated by a maximum of two weeks. The assessor and observer roles were determined randomly. The interrater reliabilities measured by ICCs for the individual dimensions were estimated at: 0.73 for lying and rolling, 0.97 for sitting, 0.88 for crawling and kneeling, 0.98 for standing, 0.96 for walking, running and jumping, and 0.96 for the total score. The test-retest ICCs were 0.62 for lying and rolling, 0.96 for sitting, 0.83 for crawling and kneeling, 0.98 for standing, 0.95 for walking, running and jumping and 0.95 for the total score. As stated by the authors, children with Down syndrome who have progressed developmentally beyond a certain dimension resist performing lower level skills as required by the GMFM. Thus, the interrater reliabilities may be lower in dimensions A and C, since the children would have progressed to the walking stage. In addition, their ability to follow instructions may be limited by their diminished cognitive capacity. If the children with Down syndrome did not perform consistently during the two GMFM tests, then their test-retest scores would be lower than our intrarater reliability ICCs. As well, evaluating children with limited cooperation may be more difficult than evaluating children who are cooperative. This could be said for both children with CP and children with Down syndrome.
Another possible reason for the high intra- and interrater reliability is the mode of administration of the test. Specifically, we used videotaped evaluations. A number of issues were taken into consideration when determining the methodology for estimating the intra- and interrater reliability of the GMFM for children with OI. Because many of the patients are from out of province or even out of country, access to them for a second evaluation was very difficult. Many patients return to the hospital every four months for a period of three days for the cyclical intravenous pamidronate treatment. A three-day period is too short for a test-retest design to avoid recall of scores. Young children have short attention spans and are easily distracted; therefore, being assessed in front of many raters could result in an inaccurate gross motor score. For these reasons, the evaluations were taped. Furthermore, it was not feasible to free five physical therapists for the 19 evaluations during the working day. Only one therapist was made available for the taping. Because of the mode of administration, the physical therapists viewed the videotape from exactly the same angle. Even when the camera view did not provide the best perspective, it was consistent for all the viewers. In a clinical setting, the therapists would observe the evaluation from slightly different positions in the room. Their eyes might focus on different aspects of the task. The videotaping technique standardized the evaluation, which is not possible in a live setting.
In a clinical situation, a variety of therapists would administer the GMFM. It is probable that the interrater reliability would not have been as high had several examiners been involved in the study. In addition, because each evaluation occurred only once, varying degrees of noncompliance by the child could not affect the scores and thereby increase the variation between repeated scorings. The single evaluation may have resulted in an overestimate of the intrarater reliability.
The high interrater reliability may also be explained by the fact that all the therapists had five to eight years of experience with administration of the GMFM for children with CP. Their knowledge of normal gross motor development was considerable. Moreover, they received additional training with this measure for both children with CP and children with OI. The therapists had weighted kappa statistics of between 0.86 and 0.97 on criterion testing for children with CP before the additional instruction for this study.
Another possible explanation for the high interrater reliability is the heterogeneity of the sample population. The lowest GMFM score was 8.66% and the highest was 98.62%. There was no overlap of scores among the raters. Portney and Watkins 43 state that reliability is based on the proportion of the total observed variance that is attributable to error. Therefore, for a given amount of error variance, as the total variance increases, the error component accounts for a smaller portion of it. The greater the range of scores, the smaller is the proportion of variance due to error and the higher is the interrater reliability. In summary, we had a heterogeneous sample, which may have contributed to the excellent reliability results.
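The variance argument can be made concrete with a small numerical sketch. It treats reliability as the ratio of between-patient variance to total variance, which is a simplification of the ICC models used in this study; the function name and the variance values are hypothetical and chosen only to show the direction of the effect.

```python
def reliability_ratio(between_patient_variance, error_variance):
    """Proportion of total variance that is attributable to true
    differences between patients (a simplified reliability coefficient)."""
    if between_patient_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    total = between_patient_variance + error_variance
    if total == 0:
        raise ValueError("total variance is zero; the ratio is undefined")
    return between_patient_variance / total


# Same amount of rater error in both cases (hypothetical units).
ERROR_VARIANCE = 1.0

# Homogeneous sample: patients differ little from one another.
homogeneous = reliability_ratio(4.0, ERROR_VARIANCE)

# Heterogeneous sample: patients span nearly the whole scale.
heterogeneous = reliability_ratio(99.0, ERROR_VARIANCE)
```

With identical rater error, the coefficient is 0.80 for the homogeneous sample and 0.99 for the heterogeneous one, so a high coefficient in a widely spread sample does not by itself imply smaller rater error.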
Children scored a zero in some dimensions. Some children were either too young or lacked the strength and balance to accomplish the tasks of the dimension. Other children did not attempt the crawling dimension due to fragility of their upper extremities or marked bowing of their tibias, which made crawling uncomfortable. The consistency of scoring zero for some dimensions may also have resulted in an exceptionally high estimate of intra- and interrater reliability.
Despite the very positive results, the physiotherapists did, however, have some difficulty scoring a few items, as demonstrated by the kappa computations (Table 3). Items 3 and 19 produced the most disagreement. A possible reason for the disagreement in item 3 (supine: lifts head 45 degrees) is the inability to detect from the videotape the active contraction of the neck flexors. Many children with OI are weak and have relatively large heads; therefore, active flexion of the neck is difficult. Consequently, they elevate their shoulders and passively lift their heads by pushing themselves up with their arms. It was not always possible to distinguish true neck flexion from these compensatory movements on the videotape.
The instructions for item 19 are to roll over to one side from the supine position, then attain sitting. Children with OI frequently sit up in the same manner as described in the previous paragraph. When asked to roll over, they initiate rolling to one side then sit up. Other children with a history of upper extremity fractures avoid prolonged weight bearing on their arms. Again, they avoid rolling completely to side lying and pushing up from this position. The physical therapists were uncertain whether the participants accomplished the task in the manner required by this item. Consequently, there was some degree of disagreement as measured by kappa (0.712–0.802).
Items 4 and 5 require the children to flex their hips and knees through full range. Although children with OI rarely have contractures, they often present with femoral and tibial bowing. The therapists demonstrated some disagreement, as it appeared difficult to observe whether the child had achieved sufficient range to fulfill the requirements of the task.
Despite these minor difficulties, excellent results were obtained from an intra- and interrater reliability study of the GMFM conducted by physical therapists at the Shriners Hospital for children with OI. Our results are higher than those estimated for children with CP or Down syndrome, perhaps because children with OI have normal cognitive function and are able to carefully follow instructions. Videotaping, our mode of administration of the test, also standardized the evaluations and may have increased agreement. Moreover, the physical therapists had all been trained in use of the GMFM and had passed criterion testing prior to the start of the study. Finally, the heterogeneity of the sample reduced the proportion of variance due to error and thus increased the interrater reliability. Despite the excellent results, there were a few items where there was more rater disagreement than for the majority of the items.
The main limitation of the study is the mode of administration of the test. The videotaped evaluations provided a medium for enhancing reliability but did not replicate the clinical setting. The ability to score the GMFM was tested but not the ability to administer the measure. Future studies are required to determine the interrater reliability of the live observation as well as the validity and responsiveness of the GMFM in children with OI.
While the validity of the GMFM for use with children diagnosed with OI has not been addressed in this study, excellent reliability has been demonstrated. Pediatric physical therapists can be trained to score gross motor function with precision. The GMFM has proven to be a reliable and safe measure for children with this diagnosis. Despite the limitations, videotaping has been shown to be an effective and practical means to estimate interrater reliability.
We wish to thank physiotherapists Louise Loiselle, Rochelle Rein, Huyen Phan, and Rita Yap for their cooperation with the interrater reliability section of the study and Louise Toupin for her clerical support. Gratitude is extended to Denis Alves and Jane Wishart for videotaping the children, as well as the Shriners Hospital for providing the research setting and support.
1. Byers PH, Steiner RD. OI. Annu Rev Med. 1992; 43: 269–282.
2. Castells S. New approaches to treatment of OI. Clin Orthop. 1973; 93: 239–249.
3. Escalante A, Beardmore TD. Decreased bone mineral density in HLAS. B27 positive members of a family with OI. J Rheumatol. 1993; 20: 320–324.
4. Kurtz D, Morrish K, Shapiro J. Vertebral bone mineral content in OI. Calcif Tissue Int. 1985; 37: 14–18.
5. Zionts L, Nash J, Rude R. Bone mineral density in children with mild OI. J Bone Joint Surg Br. 1995; 77: B143–B147.
6. Rowe D, Shapiro J. OI: In Metabolic Bone Diseases and Clinically Related Disorders. Philadelphia: WB Saunders; 1990: 659–701.
7. Charnas L, Marini J. Neurologic, and Developmental Outcome in OI. Proceedings of Fifth International Conference on OI. Oxford, UK; Oxford Press; 1993: 50.
8. Engelbert R, Helders P, Keeson W. Intramedullary rodding in type III OI effects on neuromotor development in 10 children. Acta Orthop Scand. 1995; 66: 361–364.
9. Daly K, Wisbeach A, Sanpera I. The prognosis for walking in OI. J Bone Joint Surg Br. 1996; 78: 477–480.
10. Sillence D, Senn A, Danks D. Genetic heterogeneity in OI. Am J Med Genet. 1979; 16: 101–116.
11. Byers P. OI in Connective Tissue and Its Heritable Disorders. New York: Wiley Liss; 1993; 317–350.
12. Plotkin H, Rauch F, Bishop N. Pamidronate treatment of severe OI in children under 3 years of age. J Clin Endocrinol Metab. 2000; 85: 1846–1850.
13. Buschang PH, Tanguay R, Demerjian A. Growth instability of French-Canadian children during the first three years of life. Can J Public Health. 1985; 76: 191–194.
14. Porat S, Heller E, Seidman D. Functional results of operation in OI: elongating rods and nonelongating rods. J Pediatr Orthop. 1991; 10: 200–203.
15. Catell HS, Clayton B. Failure of anabolic steroids in the therapy of OI. J Bone Joint Surg Am. 1968; 50A: 123–141.
16. Glorieux F, Bishop N, Plotkin H, et al. Cyclic administration of pamidronate in children with severe OI. N Engl J Med. 1998; 339: 947–952.
17. Soderhall S, Astrom E, Skoog L. Improvement of pain and life quality during APD treatment of a girl with OI type III. Proceedings of the Fifth International Conference on OI. Oxford, England: Oxford Press; 1993: 78.
18. Gerber L, Binder H, Weintrob J. Rehabilitation of children and infants with OI. A program for ambulation. Clin Orthop. 1990; 251: 254–262.
19. Engelbert R, Beemer F, van der Graaf Y, et al. OI in childhood: a follow-up study. Arch Phys Med Rehabil. 1999; 80: 896–903.
20. Vetter U, Pontz B, Zauner E. OI. A clinical study of the first ten years of life. Calcif Tissue Int. 1992; 50: 36–41.
21. Engelbert R, Custers J, van der Net J. Functional outcome in OI: disability profiles using the PEDI. Pediatr Phys Ther. 1997; 9: 18–22.
22. Engelbert R, Vampaemel L, Keesen W, et al. Gross Motor Development in Children with OI (O.I.) Type III. A Retrospective Follow-up. Proceedings of the Sixth International Conference on OI. Netherlands: 1996; 75.
23. Bleakney D, Kruse R. Gross motor development and children with OI. Proceedings of Sixth International Conference on OI. Netherlands: 1996; 31.
24. Haley S, Coster W, Ludlow L. Pediatric Evaluation of Disability Inventory. Boston, Mass: New England Medical Center Hospitals; 1992.
25. Folio M, Fewell R. Peabody Developmental Motor Scales and Activity Cards. Hingham, Mass: Teaching Resources Corp; 1983.
26. Hughes JE. Basic Gross Motor Assessment. Golden, Colo: Jeanne Hughes; 1979.
27. Bruininks RH. Bruininks-Oseretsky Test of Motor Proficiency: Examiner’s Manual. Circle Pines, Minn: American Guidance Service Inc; 1978.
28. Russell D, Rosenbaum P, Gowland C. Gross Motor Function Measure Manual. 2nd ed. Hamilton, Ontario: Gross Motor Measures Group; 1993.
29. Russell D, Rosenbaum P, Cadman D. The Gross Motor Function Measure: a means to evaluate the effects of physical therapy. Dev Med Child Neurol. 1989; 31: 341–352.
30. Russell D, Palisano R, Walter S, et al. Evaluating motor function in children with Down syndrome: validity of the GMFM. Dev Med Child Neurol. 1998; 40: 693–701.
31. Parker DF, Carriere L, Hebestreit H, et al. Muscle performance and gross motor function of children with spastic CP. Dev Med Child Neurol. 1993; 35: 17–23.
32. McLaughlin J, Bjornson K, Astley S, et al. Ability to detect functional change with the gross motor function measure: a pilot study. Dev Med Child Neurol. 1991; 33(Suppl 64): 26.
33. Streiner D, Norman G. Health Measurement Scales: A Practical Guide to Their Development and Use. 2nd ed. Oxford, England: Oxford University Press; 1996; 108.
34. Badke M, Di Fabio, Leonard E. Reliability of a mobility assessment tool with application to neurologically impaired patients: a preliminary report. Physiother Can. 1993; 45: 1, 15–20.
35. Gross D, Conrad B. Issues related to the reliability of videotaped observational data. West J Nurs Res. 1994; 13: 799–803.
36. Nunnally JC. Psychometric Theory. 2nd ed. New York: McGraw-Hill; 1978.
37. Helmstadter GC. Principles of Psychological Measurement. New York: Appleton-Century-Crofts; 1964.
38. Shrout PE, Fleiss JL. Intraclass correlation: uses in assessing rater reliability. Psychol Bull. 1979; 86: 420–428.
39. Fleiss JL. The Design and Analysis of Clinical Experiments. New York: John Wiley & Sons; 1986: 12, 27.
40. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960; 20: 37.
41. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977; 33: 159–174.
42. Bjornson KF, Graubert CS, Buford VL, et al. Validity of the Gross Motor Function Measure. Pediatr Phys Ther. 1998; 10: 43–47.
43. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. East Norwalk: Appleton and Lange; 1993: 505–528.
© 2001 Lippincott Williams & Wilkins, Inc.