In response to the Accreditation Council for Graduate Medical Education's (ACGME's) requirement to shift the focus of graduate medical education (GME) to educational outcomes,1 we redesigned the goals and objectives of the University of Maryland School of Medicine's pediatric residency program in 2001 to develop a competency-based system that focuses on the six ACGME competencies. In 2002, we established performance standards for the new learning objectives by asking pediatric program directors to predict resident performance, by year of training, for each objective.2 As Joorabchi and Devries3 and Gaskin et al4 have shown, however, educators' expectations of outcomes may differ from actual outcomes. They found that resident performance did not meet faculty expectations and that the gap between expectations and performance widened as the level of training advanced. This suggests that faculty expectations are not necessarily congruent with the typical rate of skill progression. Although the stages of skill development are well defined in developmental models (e.g., Dreyfus and Dreyfus5), the rate of development is not.
In this article, we analyze three years' worth of data on pediatric residents' performance on learning objectives related to the ACGME patient care competency to determine whether a gap exists between residents' actual performance and program directors' predictions of their performance in our 2002 survey. In addition, we evaluate the concurrent validity of the standards set for these learning objectives by comparing them with a well-accepted global performance measure.6
We initially developed learning objectives that in the aggregate addressed the six ACGME competencies and applied a scoring rubric to each objective, as detailed in an earlier publication.2 The rubric was dichotomous (demonstrates or does not demonstrate a behavior), related to the percentage of time a behavior was demonstrated (less than 25%, 25%–50%, 50%–75%, or more than 75%), or related to complexity of the patient (a routine patient up to any patient, regardless of acuity or complexity). We then surveyed members of the Association of Pediatric Program Directors, asking them to help us establish performance standards by predicting the level of performance by year of training for each of the objectives. Eighty-one of 202 (40%) program directors responded and demonstrated overwhelming consensus with regard to expected performance standards for each learning objective for second-year residents and unanimous consensus for third-year residents.2
In 2002, we implemented a new assessment system based on these learning objectives and standards at our pediatric residency program through the use of a Web-based portfolio. This system and its tools have been previously described in detail.7 Extensive faculty development was provided around the use of the Web portfolio and the new methods of assessment via group workshops, one-on-one meetings, clear written instructions, and ongoing availability for questions and clarifications.
The ACGME patient care competency is composed of seven subcompetencies8 for which we created 35 learning objectives that in the aggregate addressed clinical competence in patient care (List 1). The tools used to assess residents' performance on these learning objectives were global assessment tools that faculty members completed on the basis of repeated observations in the context of clinical care delivery in both inpatient and outpatient settings; thus, they are equivalent to the “does”/action level at the top of Miller's9 pyramid of skill assessment.
We prospectively collected data on resident performance on the competency-based learning objectives during 2002–2005; we closed the data set with the completion of the assessments for the residents who graduated at the completion of academic year (AY) 2005. During the course of the following year, we extracted the data from the Web-based portfolio for purposes of statistical analysis, which we conducted in 2007–2008.
For the purposes of this study, we analyzed only data pertaining to the patient care competency for two reasons: (1) the number of assessments in the patient care domain was adequate to evaluate whether actual performance matched program directors' predictions, and (2) all patient care learning objectives were matched to scoring rubrics that required residents to demonstrate progression of skill acquisition over time. We included all assessments of the residents' abilities to meet the patient care learning objectives during the study time period.
During 2002–2005, faculty members completed 8,974 individual assessments of 40 residents on these 35 learning objectives. We followed 13 residents through three years, 14 through two years, and 13 through one year of residency training. First-year residents were assessed 5,445 times, second-year residents 2,229 times, and third-year residents 1,300 times.
We compared residents' performance on individual patient care learning objectives with the standards set for those same learning objectives by the consensus of program directors. If a resident reached or exceeded the standard, we gave him or her a “pass” on the objective. For example, if program directors predicted that a second-year resident would be able to demonstrate an objective more than 75% of the time but a resident was only able to demonstrate it 50% to 75% of the time, that resident would not receive a pass for performance on the objective (i.e., the resident did not meet the standard). If a large number of residents did not meet the standard, it would suggest that either the expected standard was set too high or the performance of trainees in this program was below expectations.
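The pass/fail comparison described above can be sketched as an ordinal comparison between an observed rubric level and the predicted standard. This is an illustrative sketch only; the category labels and example values are assumptions, not the program's actual data.

```python
# Hypothetical sketch of the pass/fail comparison: a resident passes an
# objective when the observed rubric level reaches or exceeds the level
# the program directors predicted. Category labels are illustrative.
RUBRIC_LEVELS = {"<25%": 0, "25-50%": 1, "50-75%": 2, ">75%": 3}

def meets_standard(observed: str, predicted: str) -> bool:
    """Return True if the observed level meets or exceeds the standard."""
    return RUBRIC_LEVELS[observed] >= RUBRIC_LEVELS[predicted]

# Example from the text: the standard is "more than 75% of the time,"
# but the resident demonstrates the behavior only 50%-75% of the time.
print(meets_standard("50-75%", ">75%"))  # False -> no pass
print(meets_standard(">75%", ">75%"))    # True  -> pass
```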
To address the latter confounder, at the end of AY 2005 we asked 23 senior faculty members, who had worked in close association with the residents in clinical settings and had completed assessments within the Web-based portfolio, to reflect on their experience with the residents and evaluate the residents' overall performance in each of the six ACGME domains of competence using a modified American Board of Internal Medicine (ABIM) competency card.6 We chose this tool for its widespread use as well as its construct validity in the GME community. The senior faculty members were asked to evaluate only those residents with whom they had enough experience to provide meaningful feedback.
Finally, because the program directors' predictions were based on expected performance at the end of each training year whereas residents are assigned to clinical rotations at different times throughout the academic year, we used a least squares regression analysis to examine the impact of time of academic year in which the assessment was done on actual performance.
The institutional review board at the University of Maryland School of Medicine granted us exemption from informed consent for trainee participation based on implementation of a new system of assessment for all trainees in our program and anonymity of any reported data.
The mean pass rate for the 35 learning objectives was 92% for first-year residents (standards were met or exceeded 5,009 of the 5,445 times that one of the objectives was assessed). For the second- and third-year residents, the mean pass rates were 84% (1,872 of 2,229) and 72% (936 of 1,300), respectively.
Given the high overall pass rate in the first year of residency, we chose to focus on the 16% (357 of 2,229) and 28% (364 of 1,300) of individual assessments that fell below the predicted level for residents in the second and third years, respectively. If these below-standard assessments were equally distributed across the 35 learning objectives, then we would expect each objective to account for approximately 3% of them. However, we found that a small number of objectives accounted for a disproportionate share of the below-standard assessments (Table 1).
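The uniform-distribution baseline used above can be made explicit: with 35 objectives, an even spread would assign roughly 1/35 (about 2.9%) of the below-standard assessments to each objective. The per-objective count below is invented for illustration; only the totals come from the study.

```python
# Baseline: under an even spread, each of the 35 objectives would
# account for ~1/35 (about 2.9%) of below-standard assessments.
n_objectives = 35
expected_share = 1 / n_objectives
print(f"expected share per objective: {expected_share:.1%}")

# Compare one objective's observed share to the baseline.
below_standard_y2 = 357   # total below-standard assessments, second year
example_count = 30        # hypothetical count for a single objective
observed_share = example_count / below_standard_y2
print(observed_share > expected_share)  # True -> disproportionate share
```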
Two themes emerged from this analysis of below-standard assessments: competence in patient management and procedural competence. Three of the five patient management subcompetency objectives accounted for 23% (82 of 357) and 14% (51 of 364) of the second-year and third-year residents' below-standard assessments, respectively. The four objectives from the procedural subcompetency accounted for 24% (86 of 357) and 12% (43 of 364) of the below-standard assessments for the second- and third-year residents, respectively. The remainder of the assessments demonstrated remarkable congruence between the residents' performance as assessed by faculty and their performance as predicted by the program directors.
The least squares regression analysis of all assessments for the patient care competency from all rotations, by year of training, demonstrated that the potential confounder of time of academic year at which skills were evaluated was only relevant for first-year residents, where R² = 0.78 and P < .0004. There was no significant correlation of performance (i.e., ability to meet expected standards for learning objectives) with time of academic year at the second- and third-year levels.
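A least squares regression of this kind can be sketched as follows. The data points here are synthetic stand-ins (the study's raw assessment data are not reproduced in this article); the sketch only shows the mechanics of fitting a line of pass rate against month of the academic year and computing R².

```python
# Minimal ordinary-least-squares sketch: regress pass rate on month of
# the academic year and compute R^2. All data points are synthetic,
# chosen only to illustrate the calculation.
import statistics

months = [1, 3, 5, 7, 9, 11]                        # month of academic year
pass_rate = [0.80, 0.85, 0.89, 0.92, 0.94, 0.96]    # synthetic pass rates

mx, my = statistics.mean(months), statistics.mean(pass_rate)
slope = sum((x - mx) * (y - my) for x, y in zip(months, pass_rate)) / \
        sum((x - mx) ** 2 for x in months)
intercept = my - slope * mx

# Coefficient of determination: 1 - (residual SS / total SS)
ss_res = sum((y - (slope * x + intercept)) ** 2
             for x, y in zip(months, pass_rate))
ss_tot = sum((y - my) ** 2 for y in pass_rate)
r2 = 1 - ss_res / ss_tot
print(f"slope = {slope:.4f}, R^2 = {r2:.2f}")
```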
At least 16 senior faculty members evaluated each second-year and third-year resident using the ABIM competency card as a comparative assessment tool. Each resident met or exceeded expectations for competence in patient care based on the faculty members' experience with the resident at the completion of AY 2005, which indicates that poor resident performance was not a potential confounder.
Our intent in creating learning objectives and performance standards to address the ACGME competencies was to take us a step closer to developing realistic outcome measures for graduating pediatric residents. Studies in the medical literature demonstrating gaps between expected and actual performance, however, prompted us to question the standards that we had set for the ACGME competencies with the help of other pediatric program directors. Joorabchi and Devries,3 for example, developed an objective structured clinical examination and worked with faculty members to develop minimum pass levels for residents in each year of training. Their results showed that only 59% of first-year, 45% of second-year, and 4% of third-year residents met the predetermined standard. Like their data, our data demonstrate a trend toward decreasing pass rates with advancing years of training. However, our actual pass rates of 92%, 84%, and 72% in the first, second, and third years of residency, respectively, are substantially higher than those they reported, which suggests that the program directors we surveyed were generally realistic in their standard setting. In fact, for the five patient care subcompetencies besides procedural skills and a subset of patient management, the program directors' expectations were near perfect matches to actual resident performance.
We considered several explanations for the gap between expectations and performance in the procedural skills and patient management subcompetencies. First, the time of year that performance was assessed could be a potential confounder in the outcome of pass rate, but our regression analysis did not show this to be the case. Second, the third-year group that was followed from year one could have had a disproportionate number of underperformers. If that were the case, we would have expected to find the same rate of underperformance for this group of residents in both their first and second years of residency, but we did not. Third, the residents overall could have been underperforming in patient care, but the faculty evaluations using the modified ABIM competency card did not support this explanation.
The most likely conclusion is that the widening gap between program directors' expectations and residents' performance in this study is attributable to the program directors' unrealistic expectations about the rate of skill progression, which led them to set unrealistic standards. We attributed most of this gap to standards related to procedural competencies and to a subset of patient management competencies. The former likely reflects the dwindling number of opportunities for invasive procedures in pediatric practice. The latter is most likely the one true area in which rate of skill progression failed to meet high expectations. More noteworthy is the remarkable accuracy of the program directors' predictions for residents' performance on the majority of the remaining patient care learning objectives.
This study has some limitations. We collected data from a single institution, the University of Maryland Hospital for Children. Our analysis focused on only the patient care competency because the number of individual assessments for the other competencies was limited and the scoring rubric used for some of the other learning objectives was dichotomous (does or does not demonstrate a behavior); therefore, the data did not lend themselves to studying progression of skill acquisition over time. In addition, the data are weighted by year of program entrance. There were more assessments of the group of interns that entered as first-years in 2002 because the data were collected during their three years of training. The intern group entering in 2003 had two years of assessments included in this analysis, whereas the group entering in 2004 had only one year. We did not do a formal study of reliability; however, our faculty members' judgments demonstrated overall consensus on resident performance based on residents' ability to meet standards both on the learning objectives and the modified competency card.
Our growing experience in assessing competence on these learning objectives also leads us to speculate on some limitations of the scoring rubrics, which we developed in the early days of this work.2 We are now calling into question the rubric that indicates the percentage of time that a resident demonstrates a behavior. First, we set the rating categories up as ranges because faculty members could not be expected to be more exact in assessing a resident's behavior through a series of observations. Second, the rubric itself may have limited utility: for some skill sets, advancing competence is demonstrated by adapting behavior to the needs of the specific situation rather than by performing the same behavior more frequently. For example, the ability to take a comprehensive history may be critical for a more junior learner, but skill advancement is likely to be demonstrated by the resident's capacity to do a more focused history when the clinical situation demands. Lessons learned from these scoring rubrics will guide us as we develop future tools for learner assessment.
The pediatric community is engaged in the “Milestones Project,” a joint initiative of the ACGME and the American Board of Pediatrics to define performance standards or “milestones” for residency training in the specialty as the next step in defining desired outcomes of competency-based education.10 Two of the authors (C.C. and R.E.) are involved in this project and bring an important lesson from our work to bear on this new initiative. Narrative descriptions of behaviors that demonstrate developmental progression over time will replace the assessment rubrics described in this work. Most important, before relying on these new milestones to make judgments about individual resident performance and aggregate resident performance for purposes of program accreditation, the pediatric community needs to study them rigorously to determine whether predictions about the progression of skill development and expectations for performance at points along the educational continuum are realistic and achievable.
In The Wisdom of Crowds, Surowiecki11 suggests that large groups of people are better at problem solving than any individual, regardless of ability. The program directors' ability to predict performance outcomes as demonstrated in this study seems to substantiate his theory. There was a relatively small number of learning objectives for which actual performance did not meet the predicted standards. We will continue to draw on the "wisdom of the crowd" of program directors as we move toward the next iteration of standard setting.
This work was funded in part through the generosity of the Health Resources and Services Administration.
Exemption granted by the institutional review board at the University of Maryland School of Medicine.