Maring, Joyce R. PT, EdD; Elbaum, Leonard PT, EdD
The Early Intervention Program for Infants and Toddlers with Disabilities, Part C of the Individuals with Disabilities Education Act (IDEA), requires participating states and jurisdictions to provide services to eligible infants and toddlers.1 To be eligible for federal funding, states must ensure that a multidisciplinary evaluation, including comprehensive evaluation activities related to the child’s development, is performed. The evaluation is to be provided by appropriate, qualified personnel to determine a child’s initial and continuing eligibility for services. States are required to operationally define the term developmental delay according to levels of functioning and identify the procedure they will use to determine the existence of a delay in the areas required by law. State definitions of developmental delay vary widely.2,3 Most states, however, express the eligibility criteria for delay quantitatively, such as the difference between chronological age and performance level expressed as a percentage of chronological age, or delay as indicated by one to two standard deviations below the mean on norm-referenced instruments.2 Some states include qualitative criteria such as atypical behavior and informed clinical opinion; however, therapists are encouraged to base their eligibility recommendations on precise criteria such as the percentage of delay on developmental assessment instruments.3
The use of developmental scales is commonplace in the early intervention setting. Physical therapists frequently select and administer gross and fine motor scales as part of the evaluation process and subsequently use the resulting scores to make intervention decisions.4 A list of appropriate tests, however, is not always specified as part of the procedure to determine the level of delay, even when precise criteria of delay have been established.2,5 Although pediatric therapists are generally encouraged to use well-developed and standardized tests to determine eligibility for clinical interventions, instrument selections are driven by many factors beyond establishing service eligibility, such as cost, time of administration, and training of the individuals administering the test.6 This has led to the use of tests that may not be appropriate to establish levels of developmental delay. For example, in some cases tests have been constructed from selected items taken from a pool of developmental tests, with age equivalents drawn from developmental schedules rather than normative samples.5
At the time of this study, the Miami-Dade County Public Schools in Florida typically used the results of the Early Intervention Developmental Profile (EIDP)7 to determine the child’s eligibility for early intervention services. Specifically, physical therapy services were offered to children whose gross motor age as determined by the EIDP was approximately 25% below age-related norms. For example, a child who was 12 months old would be eligible for physical therapy services if his or her gross motor equivalent age was nine months or less. The EIDP is an example of a test that was constructed from a pool of developmental tests with age equivalents from developmental schedules rather than normative samples.5 As the EIDP scores are not based on a normative sample, it is important to examine the validity of age equivalence measurements yielded by this instrument against a well-established, norm-referenced test such as the Peabody Developmental Motor Scales-2 (PDMS-2).8 It may be possible to arrive at a different decision regarding a child’s eligibility for services depending on which scale is administered.
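The eligibility rule described above can be sketched as a short calculation. This is a minimal illustration only, assuming delay is expressed as the shortfall of the gross motor age equivalent relative to chronological age; the function names and the fixed 25% cutoff are our own assumptions for the example and are not taken from any school district procedure.

```python
def percent_delay(chronological_months: float, age_equivalent_months: float) -> float:
    """Delay as a percentage of chronological age (assumed formula)."""
    if chronological_months <= 0:
        raise ValueError("chronological age must be positive")
    shortfall = chronological_months - age_equivalent_months
    return 100.0 * shortfall / chronological_months


def meets_delay_criterion(chronological_months: float,
                          age_equivalent_months: float,
                          threshold_pct: float = 25.0) -> bool:
    """True when the delay reaches the (hypothetical) threshold."""
    return percent_delay(chronological_months, age_equivalent_months) >= threshold_pct


# The example from the text: a 12-month-old with a 9-month age equivalent.
print(percent_delay(12, 9))
print(meets_delay_criterion(12, 9))
print(meets_delay_criterion(12, 10))
```

Because the rule is a ratio against chronological age, a difference of a few months in the age equivalent reported by a test can move a young child across the threshold.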
The purpose of this study was to determine the concurrent validity of the EIDP and the PDMS-2 in examining gross motor delay in children attending an early intervention program. The gold standard in this comparison is the PDMS-2 based on its widespread use, precise scoring criteria, and large normative sample of 2003 children. Characteristics of the normative sample relative to geography, gender, race, and other critical variables such as disabilities were representative of the U.S. population in 1997.8 A recent review of tests of motor development described the PDMS-2 as a valid measure for determining a child’s eligibility for services in early intervention and preschool programs.9
We obtained permission from an early intervention center in Miami, Florida, to recruit subjects from among the children who attended. All the children qualified for the program by exhibiting developmental delays greater than 25% of their chronological age in two or more developmental areas based on the administration of the EIDP, or having an “at risk” diagnosis such as cerebral palsy or Down syndrome. When entering their child into the early intervention program, parents signed the center’s informed consent form allowing routine data collected on their child to be used in research studies. Parents were specifically informed that this research project might include data collected on their child, and they were given the opportunity to ask any questions or voice any concerns. The research was approved by the Institutional Review Board of Florida International University.
A sample of convenience of 30 children ranging in age from 12 to 44 months was included in this study. This represented all the children who were scheduled to be examined using the PDMS-2 or the EIDP during a seven-month period. To reject the null hypothesis with a power level of 0.8 and a significance level of 0.05, a sample size of 30 was determined to be adequate if the effect size of the paired t test is medium to high (d = 0.70) and the magnitude of the association is medium to large (r = 0.50).10
Our subjects included 18 boys and 12 girls. The diagnoses on the referrals to the center included 10 children with developmental delay, 10 children with Down syndrome, six children with cerebral palsy, two children with microcephaly, and two children born prematurely. Because the two children who were born prematurely were older than 24 months of age, their ages were not adjusted when administering the tests of motor development.
The PDMS-2 is designed to assess the motor skills of children from birth through five years of age.8 The PDMS-2 is considered a valid and reliable discriminative measure for gross motor and fine motor development.9 The normative sample of 2003 children included 41 children with disabilities. The test is composed of six subtests that include gross motor and fine motor skills. Composite scores are calculated from the results of the subtests. The gross motor score is derived from the subtests that measure the effectiveness of large muscle groups: reflexes, stationary, locomotion, and object manipulation. Instructions in the manual include procedures for converting item scores on the test into standard scores, percentile ranks, and age equivalents. The test takes approximately 45 minutes to one hour to administer.
Psychometric data related to the test reliability for the gross motor subtests have been reported for three potential sources of test error: content, time and scorer. Coefficients for test-retest reliability and interrater reliability are high for gross motor scores across all subgroups (r > 0.89 for all subgroups). The coefficient alphas examining possible sources of error related to content sampling were also consistently high (alpha > 0.93) for all subgroups including the subgroup with physical disabilities. Test developers report high concurrent validity of the PDMS-2 with the Mullen Scales of Early Learning (r = 0.86 for gross motor scores) and with the earlier version of the PDMS (r = 0.84 for gross motor scores).
The first version was widely used because it included curricular activity cards recommending a series of interventions to remediate specific problems identified during the administration of the test.11 The updated version continues to include materials to complete an in-depth examination of gross and fine motor skills as well as a Peabody Motor Activities Program. The Peabody Motor Activities Program includes suggestions for treatment activities to facilitate the child’s development in a specific skill area. Additionally, in response to previous criticisms, the updated version has included precise criteria for scoring each of the items.12,13
The EIDP is part of a criterion-referenced developmental scale developed for children between zero and six years of age who exhibit developmental delay.7 The scale was part of the Early Intervention Project for Handicapped Infants and Young Children and was used in an outreach program of the University of Michigan. The scales were not established by testing the items on a representative population; items were assigned to specific ages based on a review of literature. The EIDP manual reports correlation coefficients for a sample of 14 children upon a comparison with the Bayley Scales of Infant Development, the Vineland Social Maturity Scale, the Receptive-Expressive Emergent Language Scale, and a clinical motor evaluation. Correlations ranged from 0.33 to 0.96. The Early Intervention Developmental Profile was first published in 1975 and revised in 1981. It was one of the first developmental scales that included children under the age of three years. In spite of its lack of normative data, it is widely used in early intervention settings, including Miami-Dade County, Florida. Items are scored on a pass or fail basis. The age equivalents are determined based on the number of items a child passes.
The EIDP and the PDMS-2 were administered to the 30 participants at the program site. Two physical therapists trained in the administration of these tests collected the data on the children for whom they routinely provided intervention. Both therapists had attended in-service training sessions at the early intervention center on test administration and scoring for both the EIDP and PDMS-2. The PDMS-2 manual suggests a therapist practice the test implementation and scoring a minimum of three times prior to using the test results for reporting purposes. Both therapists exceeded this recommended minimum; however, we did not perform specific tests of inter-tester or intra-tester reliability. The tests were administered within two weeks of each other, thereby reducing the risk that maturation would significantly impact any differences in scores obtained. Half of the children were tested initially using the EIDP and the other half were tested initially with the PDMS-2. Children were randomly assigned to the test order.
Scores were recorded and the gross motor age equivalents were calculated in accordance with the instructions that accompanied both tests.7,8 In the case of the PDMS-2, the motor age equivalents were obtained in the appropriate subtest areas. The manual requires the scores be based on the stationary, locomotion and object manipulation subtests for children between one and five years of age; for children less than one year of age the scores are based on the reflexes, stationary and locomotion subtests. All the children in this sample were between one and four years old. We calculated an unweighted average of the age equivalent scores on the appropriate subtests to facilitate comparison with the EIDP in which only one score is obtained. Age equivalent scores were used rather than standard scores because the EIDP does not generate standard scores. Although the PDMS-2 manual8 encourages the use of standard scores or percentile ranks rather than age equivalent scores, the authors acknowledge that age equivalents are currently mandated by many educational agencies and school systems. As noted previously, states frequently require the level of delay be quantified by expressing the performance level based on age equivalence as a percentage of the chronological age.2 The PDMS-2 test developers describe the current necessity of age equivalent scores as a means of communicating about a child’s competence in a language that the parents and team members all understand.
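The unweighted averaging step described above can be sketched as follows. This is an illustration under stated assumptions: the subtest values are made up, and the function name and dictionary keys are ours rather than anything defined in the PDMS-2 manual.

```python
from statistics import mean

# Subtests used for children between one and five years of age, per the text.
SUBTESTS_AGE_1_TO_5 = ("stationary", "locomotion", "object_manipulation")


def average_age_equivalent(subtest_age_equivalents: dict) -> float:
    """Unweighted mean of the gross motor subtest age equivalents (months)."""
    missing = [s for s in SUBTESTS_AGE_1_TO_5 if s not in subtest_age_equivalents]
    if missing:
        raise KeyError(f"missing subtests: {missing}")
    return mean(subtest_age_equivalents[s] for s in SUBTESTS_AGE_1_TO_5)


# Hypothetical child, not a study participant.
child = {"stationary": 12, "locomotion": 15, "object_manipulation": 18}
print(average_age_equivalent(child))
```

An unweighted mean treats each subtest as equally informative, which keeps the single PDMS-2 value directly comparable with the single gross motor age equivalent produced by the EIDP.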
We examined the strength of the association between the two tests using the Pearson product moment correlation. We examined the difference between the mean age equivalent scores of the children on the two tests using a matched-pairs t test.
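Both statistics can be computed directly from paired scores. The sketch below uses only the Python standard library and entirely hypothetical age equivalents (in months) for five children; it is not the study data, and the function names are ours.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product moment correlation between two paired samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Matched-pairs t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical age equivalents in months, for illustration only.
eidp = [10, 14, 18, 22, 30]
pdms2 = [8, 12, 13, 19, 25]

r = pearson_r(eidp, pdms2)
t, df = paired_t(eidp, pdms2)
print(f"r = {r:.2f}, t = {t:.2f}, df = {df}")
```

The two statistics answer different questions: the correlation asks whether the tests rank children similarly, whereas the paired t test asks whether one test yields systematically higher scores than the other. Two tests can be strongly correlated and still differ in mean.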
The Pearson product moment correlation coefficients indicate the age equivalent scores obtained by the two tests are strongly correlated. The average PDMS-2 age equivalent scores are strongly correlated with the age equivalent scores obtained on the EIDP (r = 0.91, p < 0.01). We also found that all the PDMS-2 subtests strongly correlate with the EIDP. Table 1 presents the correlations between the average PDMS-2 age equivalent score and the EIDP age equivalent score as well as between the subtests of the PDMS-2 and the EIDP. Figure 1 displays the individual scores on the PDMS-2 and the EIDP.
The matched-pairs t test indicated that the mean age equivalent scores obtained by the two tests were significantly different. The mean age equivalent scores, ranges and standard deviations are reported in Table 2. The children’s age equivalent scores were significantly higher on the EIDP than on the PDMS-2 (t = 3.96; p = 0.001; d = 0.72). An average difference of 3.8 months was found between the two scores (see Tables 2 and 3). In other words, for our sample, the EIDP estimated a gross motor age that was, on average, 26% higher than that obtained by the PDMS-2.
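The reported effect size is consistent with the reported t statistic if one assumes d was computed as the mean paired difference divided by the standard deviation of the differences, in which case d = t / sqrt(n). The text does not state which formula was used, so the short check below is a sketch under that assumption; the implied standard deviation it prints is derived from the same assumption and is not a value reported in the study.

```python
import math

# Values reported in the text.
t_statistic = 3.96
n_children = 30
mean_difference_months = 3.8

# Assumed definition: d = mean(diff) / sd(diff), which equals t / sqrt(n)
# for a matched-pairs t test.
d = t_statistic / math.sqrt(n_children)

# Standard deviation of the paired differences implied by that assumption.
implied_sd_months = mean_difference_months / d

print(f"d = {d:.2f}")
print(f"implied SD of differences = {implied_sd_months:.1f} months")
```

Under this assumption the calculation reproduces the reported d of 0.72 to two decimal places.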
In our group, the EIDP yielded gross motor age equivalents that were strongly correlated with, but significantly higher than, gross motor age equivalents obtained with the PDMS-2. In other words, the EIDP has poor concurrent validity relative to the PDMS-2 on gross motor age equivalent scores, the measure frequently used to substantiate a child’s eligibility for services according to many state guidelines. A recent study comparing the age equivalent scores between the PDMS-2 and the Bayley Scales of Infant Development II (BSID II) also showed low concurrent validity.14 Even though the two instruments in the comparison were standardized using normative samples, there was little agreement between the age-equivalent scores generated by the two tests. The authors discussed the possibility of differing levels of test sensitivity and specificity in identifying children with a disorder as a potential explanation for this finding.
It would appear from our study, as well as the PDMS-2 and BSID II comparison, that differences in scores of selected tests can potentially affect clinically important decisions, such as the eligibility of a child for special services in his or her community. Physical therapists are increasingly required to substantiate their intervention decisions empirically and quantitatively in a climate of diminishing resources. Many states require quantitative measures of delay as part of substantiating a child’s eligibility for physical therapy services. Therapists need to be aware of how a selected test might affect eligibility decisions when those decisions are based on the degree of developmental delay. The comparison in this study suggests children would more likely be eligible for physical therapy services if the PDMS-2 rather than the EIDP were used in the child’s evaluation.
Numerical indicators of the magnitude of delay are likely to continue to be a major criterion for eligibility for a range of therapeutic services.2,3 Subject specialists and families consider the need of the child with delays to acquire age appropriate developmental milestones as an important objective of early intervention.15 However, given the complexity of a child’s development it would seem reasonable to include many factors to establish eligibility for, and potential benefit from, early intervention programs, including a wide range of clinical observations and assessments.6 Tests that involve a restricted sampling of children’s behaviors should rarely comprise the sole justification for special services.16 Early intervention specialists recognize the need to broaden the examination to include the interactions of the child with salient individuals in his or her world.17,18 McConnell18 suggests examination tools be developed that reflect new paradigms and include a child’s behavior viewed in an ecological perspective. With this perspective, an examination of a child’s development would include an assessment of the activity and participation levels of the child within his or her environmental context.18,19 Physical therapists should exercise clinical judgment about a child’s eligibility for services that is informed but not governed by the implementation of valid instruments measuring developmental delay. Recommendations for monitoring and intervention need to be based on a comprehensive evaluation rather than the scores of one test.
This study has several important limitations. All the children included in this study demonstrated developmental delay on both tests and were determined eligible for special services regardless of whether the PDMS-2 or EIDP was used to assess their level of delay. Future studies comparing these instruments should include children who are developing typically as well as children developing atypically to adequately assess the tests’ levels of sensitivity and specificity in identifying motor delay. Additionally, although the clinicians implementing the test were trained and experienced, intertester and intratester reliability was not established prior to the test administrations. Therefore, it is possible that differences in scores between the two tests were related to a lack of reliability in the test implementation and scoring. Finally, the lack of standard scores or percentile rankings yielded by the EIDP required us to make a comparison between the two instruments using age equivalent scores. According to the authors of the PDMS-2, standard scores and percentile rankings would serve as preferable measures in any outcome report.
Although Part C of IDEA assists states in providing comprehensive community services to infants and toddlers with disabilities and their families, the task of defining the eligible population is an ongoing challenge. Many states have established precise quantitative criteria in determining delay; however, norm-referenced tests are not always required as part of the eligibility decision. Age equivalent scores between the PDMS-2 and the EIDP, although strongly correlated, also were significantly different, with the EIDP potentially underestimating the degree of gross motor delay present.
Early intervention specialists highlight the importance of accurate and well-integrated evaluation approaches in the field that will reduce the uncertainty about when and how to intervene. This requires therapists to understand the limitations of each examination instrument, and particularly instruments that have not been established using normative samples, as a basis for eligibility decisions. Test scores should inform but not replace the clinical judgment of a physical therapist about a child’s potential to benefit from specialized early intervention services. Scales measuring motor development are one component of a comprehensive examination.
1. Federal Regulations, U.S. Department of Education, 34 C.F.R. Part 303 (2001).
2. Shackelford J. State and jurisdictional eligibility definitions for infants and toddlers with disabilities under IDEA. In: Danaher J, Armijo C, eds. Part C Updates. 7th ed. Chapel Hill: The University of North Carolina, FPG Child Development Institute, National Early Childhood Technical Assistance Center; 2005:37–52.
3. Spiker D, Hebbeler K, Wagner M. A framework for describing variations in state early intervention systems. TECSE. 2000;4:195–207.
4. Quinn L, Gordon J. Functional Outcomes Documentation for Rehabilitation. St. Louis: Elsevier Science; 2003.
5. Fewell RR. Assessment of young children with special needs: Foundations for tomorrow. TECSE. 2000;20:38–42.
6. Missiuna C, Pollock N. Beyond the norms: Need for multiple sources of data in the assessment of children. Phys Occup Ther Pediatr. 1995:57–73.
7. Rogers SJ, Donovan CM, D’Eugenio DB, et al. Early Intervention Developmental Profile (Revised). Ann Arbor, MI: University of Michigan Press; 1981.
8. Fewell RR, Folio MR. Peabody Developmental Motor Scales. 2nd ed. Austin: Pro-Ed; 2000.
9. Tieman BL, Palisano RJ, Sutlive AC. Assessment of motor development and function in preschool children. Ment Retard Dev Disabil Res Rev. 2005;11:189–196.
10. Browner WS, Black D, Newman TB, et al. In: Hulley SB, Cummings SR, eds. Designing Clinical Research. Baltimore: Williams & Wilkins; 1988:139–150.
11. Horvat M, Kalakian L. Assessment in Adapted Physical Education. 2nd ed. Dubuque, IA: Brown & Benchmark; 1996.
12. Hinderer KA, Richardson PK, Atwater SW. Clinical implications of the Peabody Developmental Motor Scales: A constructive review. Phys Occup Ther Pediatr. 1989;9:81.
13. Palisano R. Concurrent and predictive validities of the Bayley Motor Scale and the Peabody Developmental Motor Scales. Phys Ther. 1986;66:1714–1719.
14. Provost B, Heimerl S, McClain C, et al. Concurrent validity of the Bayley Scales of Infant Development II Motor Scale and the Peabody Developmental Motor Scales-2 in children with developmental delays. Pediatr Phys Ther. 2004;16:149–156.
15. Barnett DW, Pepiton AE, Bell SH. Evaluating early intervention: Accountability methods for service delivery innovations. J Spec Educ. 1999;33:177–188.
16. Neisworth JT, Bagnato SJ. The mismeasure of young children: The authentic assessment alternative. Infants Young Child. 2004;17:198–212.
17. Fewell RR. Trends in the assessment of infants and toddlers with disabilities. Except Child. 1991;58:166–173.
18. McConnell SR. Assessment in early intervention and early childhood special education: Building on the past to project into our future. TECSE. 2000;20:43–48.
19. Sameroff A, Fiese B. Transactional regulation and early intervention. In: Meisels SJ, Shonkoff JP, eds. Handbook of Early Childhood Intervention. New York: Cambridge University Press; 1990:123–172.
© 2007 Lippincott Williams & Wilkins, Inc.