The introduction of core competencies by the Accreditation Council for Graduate Medical Education represents a paradigm shift for training programs, one that places less emphasis on ensuring that trainees are exposed to a variety of clinical conditions and more emphasis on the mastery and application of skills. This increased attention on core competencies has prompted discussion about the adequacy of methods used to evaluate trainees’ skills at the resident level and, to a lesser extent, the student level.1,2 Standardized patient assessment allows for evaluation of patient care and communication skills with high reliability and validity.3
Medical schools in the United States began experimenting with clinical skills assessment programs using standardized patients in the mid-1970s. While the benefits of such programs were apparent to early adopters, clinical evaluation in undergraduate medical education continued to rely on less reliable but more easily gathered data, such as evaluations by students’ supervisors4 or knowledge assessment.5 Even when the deficiencies of traditional evaluation methods were appreciated, the cost and effort needed to establish rigorous clinical skills assessment programs likely delayed or precluded adoption of such programs at some sites.
To eliminate the financial barrier and promote collaboration, in the early 1990s the Josiah Macy Jr. Foundation (Macy Foundation) funded six regional medical school consortia to create standardized patient examinations. Medical school faculty applauded the examinations for evaluating fundamental competencies in realistic encounters and providing valuable information about students’ skills.6 Further, medical school administrators perceived creative and financial benefits to collaboration.
Today, over half of all U.S. medical schools require that students participate in clinical skills examinations using standardized patients,7 and a standardized patient examination was implemented as the United States Medical Licensing Examination (USMLE) Step 2 Clinical Skills Examination (CS) for students graduating in 2005. Whether and how the graduate medical education competencies and the new licensing requirement will affect clinical skills assessment programs in medical schools is unknown. We conducted this study to describe the characteristics of clinical skills assessment programs nationally, to determine how often and in what ways schools collaborate on the examinations and whether collaborative programs differ from independent programs, and to solicit opinions regarding how the USMLE Step 2 CS examination will affect in-house clinical skills assessment programs.
We conducted a cross-sectional descriptive study by surveying medical school curriculum deans at 121 U.S. medical schools accredited by the Liaison Committee on Medical Education (LCME) in fall 2004. We also obtained data describing institutional characteristics, including public versus private status (UnivSource), region as defined by the Association of American Medical Colleges (AAMC), research funding (amount of federal research contract and grant funding, as reported by the AAMC), and number of enrolled students (as reported by the AAMC). A cover information sheet served as a waiver of informed consent. The University of California, San Francisco Committee on Human Research approved the study.
Names and e-mail addresses for medical school curriculum deans were extracted from the AAMC Group on Educational Affairs database of undergraduate medical education representatives. Names were verified through personal knowledge of the investigators, by review of the institution Web site, or by phoning the institution.
Our 43-item survey was distributed through an online survey software system. Subjects were invited to participate via e-mail. Nonresponders were sent up to five follow-up e-mails. Prior to distribution, the survey was pilot tested for clarity and completeness by local medical educators.
Participants were asked whether they conduct clinical skills assessments in each of the four years of the curriculum, and whether they conduct a comprehensive clinical skills assessment in the third or fourth year. Participants at schools with comprehensive assessments were asked about program characteristics, collaboration with other schools, funding sources, and program costs. We did not define “comprehensive clinical skills assessment” in our survey because this terminology appears in previously published literature7 and we felt our respondents would be familiar with the concept. However, for schools that indicated they have a comprehensive clinical skills examination, answers to survey questions about number and length of stations were reviewed, and if these did not seem robust enough for a comprehensive examination the respondent was contacted for clarification. Two schools that had initially indicated that they offer a comprehensive examination subsequently changed their response. All respondents, including those who do not offer comprehensive clinical skills assessments, were asked to anticipate the impact of the USMLE Step 2 CS examination on their in-house clinical skills assessment programs.
We analyzed response data using SPSS software (SPSS Inc., Chicago, IL). Basic descriptive statistics were computed to gauge institutional characteristics. Chi-square analysis and Fisher exact tests were used to assess demographic differences between respondents and nonrespondents, and to examine the relationship between status as a Macy-funded or collaborating school and assessment program characteristics. Analyses of variance were used to examine differences in length and number of stations and the longevity of the examination between Macy-funded and non-Macy-funded schools, and between collaborating and noncollaborating schools.
Ninety-one of 121 medical school curriculum deans completed questionnaires (75% response rate). There were no significant differences between responding and nonresponding schools in terms of geographic region, enrollment size, or size of research program (p > .1). Public schools were underrepresented in our sample, with 53 of 77 public schools responding, versus 38 of 44 private schools (69% versus 86%, p = .03). Of the 26 medical schools originally funded by the Macy Foundation, 22 responded (85%).
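The comparison of response rates by school type can be reproduced from the counts reported above. The following is a minimal sketch, assuming an uncorrected Pearson chi-square test on the 2 × 2 table of school type by response status; the text does not say whether a chi-square or Fisher exact test, or a continuity correction, was applied to this particular comparison, and the function name `chi_square_2x2` is illustrative only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the text: responders and nonresponders by school type.
public_yes, public_no = 53, 77 - 53
private_yes, private_no = 38, 44 - 38

# Assumption: uncorrected Pearson test; the authors' exact procedure is not stated.
chi2, p_value = chi_square_2x2(public_yes, public_no, private_yes, private_no)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Under this assumption the p value rounds to the reported .03; a continuity-corrected or exact test would give a somewhat different value.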
Seventy-six of the 91 respondents (84%) conduct a comprehensive clinical skills assessment during the third or fourth year of medical school. In addition, 73% (66/91) administer an assessment during or at the end of the first year of medical school, 90% (82/91) during or at the end of the second year, and 70% (64/91) during the third-year clerkships. “Instruction and evaluation” was endorsed as the primary purpose of the examination by 77% of schools that offer an examination in the first year of medical school, 84% that offer an examination in the second year, and 75% that offer an examination during the third-year clerkships. In contrast, the comprehensive assessment administered at the end of the third or during the fourth year is primarily for evaluation in 55% of schools and primarily for “instruction and evaluation” in 42%.
Schools with comprehensive assessments administered in the third or fourth year have conducted their examinations for an average of seven years (range, one to >20 years; standard deviation, 4.89), though 24 schools (32%) launched their programs within the past three years. Assessments include an average of 8.41 stations (range, four to 15; standard deviation, 2.68), and in 67 schools (88%) each encounter lasts 11–20 minutes. Most schools (61%) conduct their examinations in a dedicated clinical skills center.
Participation in the examination is required by 74 institutions (97%), although only 70% (53/76) require students to pass. Normative (50%) and criterion-referenced (58%) grading are about equally common, and some institutions use both methods. Examinations are almost always scored by standardized patients (95%), and are often scored by faculty (67%) or multiple graders (68%). Students who do not pass are most commonly required to participate in a remediation preceptorship (57%) and/or repeat and pass the examination (39%).
Forty-three percent (33/76) of institutions with comprehensive assessments initiated their programs with external funding. Only 5% currently receive grant or donor support, and 91% currently rely on dean’s office funding. The vast majority (95%) of schools with comprehensive assessments anticipate continuing the program for at least three more years.
There were no significant differences between schools with and without examinations in terms of geographic region, enrollment size, private or public status, or size of research program (p > .1). Of schools with examinations, there were some differences between the 21 schools originally funded by the Macy Foundation and their 55 non-Macy-funded counterparts. Macy-funded schools have administered their examinations for significantly longer: 10.05 years (standard deviation, 5.67) versus 5.89 years (standard deviation, 4.56) for non-Macy-funded schools (p = .001). Macy-funded schools more commonly use a clinical skills center shared by multiple institutions (24% versus 7%, p = .05), and more commonly use normative grading (71% versus 42%, p = .02).
Thirty-two schools (42%) with assessment programs currently collaborate on their examinations. As shown in Table 1, collaborating schools have administered comprehensive assessments for significantly longer than noncollaborating schools: 8.50 years (standard deviation, 4.89) versus 5.98 years (standard deviation, 4.57) (p = .03). Collaborating schools are more likely than noncollaborating schools to administer the examination in a clinical skills center shared with other institutions, while noncollaborators are more likely to conduct examinations in ambulatory clinic space that is also used for actual patients. Collaborators are also more likely to require students who fail the examination to participate in a classroom-based skills seminar, and are more likely to videotape or otherwise record the examination.
Collaborative efforts most commonly involve case development (89% of collaborating schools) and standardized patient training (39%). Perceived benefits of collaboration include shared creative ideas for case development and grading strategies (89%), the development of professional relationships with colleagues (86%), cost savings (57%), opportunities for collaborative research (57%), and shared trainer resources (54%). Among all respondents, including those who are not currently collaborating, most favor independence over collaboration for remediation (91%), determining pass/fail cutoffs (74%), and deciding how to report results in the Medical Student Performance Evaluation (74%).
Sixty-seven percent of schools with (51/76) and without (10/15) comprehensive examination programs believe the USMLE Step 2 CS requirement makes it more important for medical schools to conduct in-house clinical skills examinations. When asked to anticipate how students’ scores on the USMLE Step 2 CS will impact the school’s future plans for a comprehensive clinical skills examination, 18% anticipated student scores would have a great deal of influence, 71% some influence, and 11% no influence.
Our results demonstrate that the majority of medical schools nationally have implemented comprehensive clinical skills assessments using standardized patients during the third or fourth year of medical school, and that curriculum deans view the USMLE Step 2 CS requirement as a motivator to enhance in-house clinical skills assessments. That dean’s office funds, rather than external monies, now support the vast majority of these programs indicates the degree to which comprehensive assessments have become a standard component of the curriculum.
Our results characterize the ways medical schools collaborate on comprehensive clinical skills assessments, most commonly in development phase tasks such as creating cases and training standardized patients. Collaborating schools are more likely to administer the examination in a dedicated clinical skills center and to videotape the encounters, two costly yet important infrastructural elements that can enhance the exam’s rigor while reducing variation. Collaborating schools less commonly use faculty as graders, although both collaborators and noncollaborators employ checklists completed by standardized patients. The reasons for this difference are unclear. Confidence in the validity of checklists completed by standardized patients may increase over time, reducing the need to employ faculty graders. Alternatively, administering the examination at an off-site clinical skills center may hinder faculty participation.
The high prevalence of collaboration noted by respondents is uncommon in medical education. Because examination characteristics are similar in collaborating and noncollaborating schools, the collaborative model could be expanded. Such collaboration could serve as the foundation for multisite research, traditionally lacking in medical education.8,9 Studies based on multisite clinical skills assessments would involve enough students and generate enough detailed data to offset the sample size adequacy and generalizability concerns that hamper many medical education studies.8
Though many schools indicated that collaborations are beneficial, the overwhelming majority of respondents feel that all activities related to standard setting and remediation should remain the responsibility of the individual school. This flexibility in scoring at the medical school level may facilitate feedback to students based on the unique content and emphasis of the school’s curriculum.
Our respondents’ endorsement of the importance of in-house examinations at the onset of a standardized patient licensing requirement may reflect a perceived need to prepare students for the USMLE Step 2 CS exam. However, the majority of schools use standardized patient examinations at multiple times in the curriculum for both instruction and evaluation, particularly at the end of the second year and during the clerkship years. Even with a high-stakes in-house comprehensive assessment, scoring information can translate into individual student learning plans. This opportunity for learning contrasts with the licensing examination, after which students receive only a pass/fail mark.
Limitations of the study include the fact that a lower percentage of public schools responded compared to their private school counterparts. Although there could be response bias, this concern is mitigated by the fact that there were no significant differences between responding and nonresponding schools for any other demographic characteristic. Overall, 84% of our respondents offer a comprehensive standardized patient examination, in comparison to the 75% recently reported by LCME,7 suggesting that a significant number of schools that did not respond to our survey may not currently offer exams. Findings about collaboration may be biased because the survey did not include questions about negative consequences of collaboration. We asked only two questions about the expected impact of USMLE Step 2 CS examination on schools’ in-house assessments. While there is much to investigate on this topic, better information as to whether in-house examinations duplicate or supplement the national examination can be gathered after schools review data on student performance.
The high response rate from a diverse group of medical schools is a study strength. We directed our survey to curriculum deans, individuals we felt would be quite familiar with their schools’ comprehensive assessments, an assumption supported by our finding that dean’s office funds support the majority of these programs.
This study provides an in-depth characterization of the status of clinical skills assessment at a time when many medical schools are augmenting their programs, and thus may inform programmatic changes in assessment nationally. The high degree of collaboration in examination development is encouraging for the field of medical education and highlights areas for further collaboration and research. Although the implementation of USMLE Step 2 CS is currently strengthening perceptions of the importance of in-house clinical skills assessments, future research will elucidate the impact of this licensing requirement as schools and students receive examination results.
The activities reported here were supported (in part) by the Josiah Macy, Jr. Foundation.
1 Accreditation Council for Graduate Medical Education. ACGME Outcome Project. 2005 vol; 1999.
2 Carraccio C, Wolfsthal SD, Englander R, Ferentz K, Martin C. Shifting paradigms: from Flexner to competencies. Acad Med. 2002;77:361–67.
3 Petrusa ER. Clinical performance assessments. In: Norman GR, Newble DI (eds). International Handbook of Research in Medical Education. Dordrecht: Kluwer Academic Publishers, 2002:673–709.
4 Kassebaum DG, Eaglen RH. Shortcomings in the evaluation of students’ clinical skills and behaviors in medical school. Acad Med. 1999;74:842–49.
5 Epstein RM, Hundert EM. Defining and assessing professional competence. JAMA. 2002;287:226–35.
6 Morrison LJ, Barrows HS. Educational impact of the Macy consortia: regional development of clinical practice examinations: final report of the EMPAC project. New York: Josiah Macy Jr. Foundation, 1998.
7 Barzansky B, Etzel SI. Educational programs in US medical schools, 2003–2004. JAMA. 2004;292:1025–31.
8 Carney PA, Nierenberg DW, Pipas CF, Brooks WB, Stukel TA, Keller AM. Educational epidemiology: applying population-based design and analytic approaches to study medical education. JAMA. 2004;292:1044–50.
9 Dauphinee WD, Wood-Dauphinee S. The need for evidence in medical education: the development of best evidence medical education as an opportunity to inform, guide, and sustain medical education research. Acad Med. 2004;79:925–30.
Moderator: Subha Ramani, MD
Discussant: Gerald Whelan, MD
© 2005 Association of American Medical Colleges