In July 1998 the Educational Commission for Foreign Medical Graduates (ECFMG) instituted the Clinical Skills Assessment (CSA®) as part of the certification requirements for graduates of medical schools not accredited by the Liaison Committee on Medical Education. Graduates of international medical schools (IMGs), as a group, are accepted in substantial numbers to first-year residency (PGY1) programs in the United States. As a result, internationally trained physicians make up approximately 25% of the present physician workforce. The reliance of the U.S. medical system on internationally trained physicians mandates, from a public health protection standpoint, that the certification process adequately screen out those individuals who are not ready, due to biomedical knowledge or clinical skills deficiencies, to enter graduate medical education (GME) training programs. These individuals, by virtue of state requirements for graduate medical training, would also not be able to obtain licenses to practice medicine in the United States.
To obtain an ECFMG certificate a candidate must document the completion of all requirements for, and receipt of, the final medical diploma; pass both biomedical knowledge examinations, namely the United States Medical Licensing Examination (USMLE) Step 1 and the clinical science examination (USMLE Step 2); obtain an acceptable score on the Test of English as a Foreign Language (TOEFL); and pass the CSA. The CSA was the most recently introduced certification requirement. This performance-based assessment was designed to measure clinical skills in a number of relevant domains, including history taking, physical examination, and doctor-patient communication. The specific purpose of the CSA was to ensure that IMGs could demonstrate clinical skills at a level comparable to that required of graduating fourth-year U.S. medical students.1 In contrast to the situation for medical schools accredited by the Liaison Committee on Medical Education (LCME), there are currently no well-developed international standards for medical education programs, especially with respect to clinical skills training. Therefore, it was decided that all IMGs, regardless of country of medical training, would be required to pass the CSA to obtain an ECFMG certificate.
Numerous investigations have been completed to support the psychometric adequacy of USMLE Step2,3 and TOEFL scores.4,5 There have also been many studies to support the use of standardized patients (SPs) for assessing clinical skills. Evidence for the reproducibility and validity of scores from these performance-based assessments has been provided from various operational and research studies.6,7,8 From a validity perspective, these investigations have generally concentrated on assessment content and how traditional paper-and-pencil measures and SP evaluations differ in terms of what is measured. For the CSA, the appropriateness of the case content is established through detailed analyses of physicians' practice patterns.9 Other validity evidence has been obtained through examinations of candidates' response processes, the internal structure of the assessment, and relationships of CSA scores to other variables. Unfortunately, while convergent and discriminant validity evidence can be obtained rather easily, acquiring meaningful evidence of the relation of test scores (or decisions) to a relevant criterion (i.e., ability to practice with “real” patients) is complex. For certification exams such as the CSA this task is further complicated by the fact that individuals who do not meet performance standards will not be permitted to enter graduate medical education training programs. Therefore, their performances with “real” patients cannot be studied.
The initial survey was developed and piloted with seven program directors whose residency programs had relatively high percentages of IMGs. Based on the program directors' responses and qualitative feedback regarding the format and content of the pilot instrument, several changes and simplifications were made. The final survey was then sent to 540 program directors in the following five specialties: family practice, internal medicine, obstetrics and gynecology, pediatrics, and psychiatry. These specialties historically have high percentages of IMGs. Sampling was also limited to programs that reported that their resident pools contained at least 25% IMGs. While this sampling strategy would necessarily limit the generalizability of any findings, it was deemed necessary, given the research emphasis of the study. The survey was mailed in June 2000, and reminder postcards were mailed to non-responders three weeks later. After another three weeks, the survey was again sent to those who had yet to respond. Telephone calls were made to program directors who submitted surveys with clearly inconsistent responses, and corrections were made to maximize the amount of useable data.
Although a number of issues were investigated as part of the survey, the focus of this manuscript is to report the comparison of skill levels of PGY1 residents. This group included individuals who had passed the CSA as well as individuals who had been certified by the ECFMG prior to the establishment of the CSA as a certification requirement. Program directors were asked to categorize their 1999–2000 first-year residents based on their clinical skills proficiencies when first entering the graduate program. For each skill domain, the program directors indicated the number of residents, if any, who had entered their program with deficiencies. Then, the program directors were requested to estimate how many, if any, of the residents with skill deficiencies had improved significantly during the first four months of residency training. The residents were divided into the following three categories: IMGs without the CSA, IMGs with the CSA, and USMGs (residents who graduated from LCME-accredited medical schools regardless of citizenship or visa status). Graduates of international medical schools included both USIMGs (residents who graduated from non-LCME accredited medical schools but who were U.S. citizens while at medical school), and IMGs (residents who graduated from non-LCME accredited medical schools and who were not U.S. citizens regardless of current visa status).
Of the 540 surveys mailed out, 389 were returned, resulting in a response rate of 72%. The 389 returned surveys represent 3,727 PGY1 residents for the 1999–2000 academic year. The first mailing of the survey yielded 236 responses. The reminder postcard and second mailing of the survey resulted in an additional 153 responses. Characteristics of respondents and non-respondents were compared to determine whether there were differences between those programs that provided responses to the survey and those that did not. Program specialties (family practice, internal medicine, obstetrics and gynecology, pediatrics, psychiatry), regions of the United States (Northeast, Midwest, South, West), program types (university-based, community-based, other), and program sizes (numbers of PGY1 residents) were compared.
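The response figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts reported in this section. The variable names and the derived average number of residents per responding program are illustrative additions, not figures reported by the survey.

```python
# Counts as reported in the text.
surveys_mailed = 540
first_mailing_responses = 236
followup_responses = 153   # reminder postcard plus second mailing
pgy1_residents = 3727      # residents covered by the returned surveys

surveys_returned = first_mailing_responses + followup_responses
response_rate = surveys_returned / surveys_mailed

# Derived for illustration only; not a figure reported in the survey.
residents_per_program = pgy1_residents / surveys_returned

print(f"surveys returned: {surveys_returned}")
print(f"response rate: {response_rate:.0%}")
print(f"mean PGY1 residents per responding program: {residents_per_program:.1f}")
```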
Response rates by program specialty were similar (chi square = 1.59, p = .81). Response rates did not vary as a function of geographic area (chi square = .65, p = .88) or program type (chi square = 1.07, p = .59). Finally, there was no relationship between response rate and program composition (i.e., percentage of international medical graduates, number of PGY1 residents). The mean percentages of IMGs were 64% and 67% for non-respondents and respondents, respectively. Based on these analyses it would appear that any effects of response bias would be minimal.
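As an illustration of how the reported p values relate to the chi-square statistics, the following sketch computes the upper-tail probability of a chi-square distribution using only the Python standard library. The degrees of freedom are an assumption, since the text does not state them: each is taken as the number of categories minus one (five specialties, four regions, three program types), as for a respondent by non-respondent contingency table. The function name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x < 0 or df < 1:
        raise ValueError("x must be non-negative and df a positive integer")
    t = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-t) * sum_{k=0}^{df/2 - 1} t^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= t / k
            total += term
        return math.exp(-t) * total
    # Odd df: erfc(sqrt(t)) + exp(-t) * sum_{k=1}^{(df-1)/2} t^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += t ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(t)) + math.exp(-t) * total


# Statistics as reported in the text; df values are assumed (categories - 1).
reported = [
    ("program specialty", 1.59, 4),
    ("geographic area", 0.65, 3),
    ("program type", 1.07, 2),
]
for label, statistic, df in reported:
    print(f"{label}: chi square = {statistic}, df = {df}, p = {chi2_sf(statistic, df):.2f}")
```

Under these assumed degrees of freedom, the computed probabilities round to the p values given in the text, which is consistent with all five surveyed specialties having been included in the specialty comparison.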
The results of the survey questions regarding skill deficiencies are summarized in Table 1. For each skill domain, results reported in the “D” row are percentages of residents with deficiencies out of the total number of PGY1 residents in each group for the 1999–2000 academic year. The percentages of deficient residents who improved within four months of training are reported in the “I” row.
Program directors indicated that for every skill domain, a greater percentage of IMGs without the CSA exhibited deficiencies compared with IMGs with the CSA. The largest difference (4.0%) was in the domain of interpersonal skills (effect size = 0.17). The smallest difference (1.4%) was based on English proficiency (effect size = 0.06). For all the skill domains listed, significantly fewer USMGs, as a percentage of the resident population, had deficiencies compared with IMGs, either with or without the CSA (p < .05). Despite the differences in percentages of deficient residents, rates of improvement were consistent amongst the three cohorts, with about half to two thirds of residents with deficiencies tending to improve within four months of training.
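The text does not say which effect size measure was used for these differences in percentages. One common choice for comparing two proportions is Cohen's h, sketched below as an assumption rather than as the authors' method, so its value need not match the reported 0.17. The 6.5% figure for IMGs with the CSA appears later in the paper; the 10.5% figure for IMGs without the CSA is inferred here by adding the reported 4.0% difference and is not stated directly.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between arcsine-transformed proportions."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# Interpersonal skills deficiencies.
with_csa = 0.065                  # reported in the paper
without_csa = with_csa + 0.040    # inferred from the reported 4.0% difference

h = cohens_h(without_csa, with_csa)
print(f"Cohen's h (assumed measure): {h:.3f}")
```

Whatever measure the authors used, values of this size fall in the range conventionally described as small.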
Across skill domains, program directors reported that residents were most likely to be deficient in the skills of diagnosis and management, history taking, physical examination, and interpersonal skills. They were least likely to be deficient in ethical behavior, written communication, and English proficiency. The likelihood of improvement was approximately the same for each skill domain. After four months of training, about half to two thirds of deficient residents improved enough to be considered acceptable by program directors.
The recent introduction of the CSA as an ECFMG certification requirement provided a unique opportunity to investigate the consequential impact of demanding demonstrable clinical skills performance prior to entering graduate medical education programs. Since ECFMG certification is valid indefinitely,* there will be both individuals with and those without CSA entering residency training positions for some time. Eventually, all IMGs entering PGY1 positions will have been required to pass the CSA. Until this point, comparisons between individuals certified prior to the introduction of the CSA and those certified afterwards can be made.
The results of this survey provide additional evidence for the validity of the CSA. The comparison of residents with and without the CSA as part of their certification requirements suggests that the introduction of the performance-based standardized patient assessment has had a positive effect on the quality of PGY1 IMG residents. Across all skill domains, the residents with CSA were judged by program directors to be less likely to have entered training with deficiencies. The largest difference was in the area of interpersonal skills, where only 6.5% of those individuals certified by ECFMG with the CSA were judged to be deficient. In contrast, over 10% of the IMGs without the CSA were categorized as deficient. It should be remembered, however, that the CSA/non-CSA cohorts are likely not to be directly comparable in terms of background characteristics, some of which may have been relevant to the residency selection decision. For instance, those individuals certified with the CSA would tend to be younger and to have graduated from medical school more recently. Therefore, the introduction of the CSA may not be the only factor that accounts for differences in the percentages of deficient IMG residents. Nevertheless, given the high-stakes nature of the assessment, it would be expected that those individuals who were required to take the CSA would prepare for the assessment and therefore, on average, become more clinically proficient. This prospect, combined with the data presented in this paper, would support the consequential validity of the CSA.
While there were clear differences between the CSA and non-CSA cohorts in terms of judged clinical deficiencies, it remains that some residents with passing scores on the CSA may still lack adequate clinical skills. Ultimately, if the education and screening of IMGs were ideal, one would expect that program directors would report next to no deficiencies. However, given the variability in medical education programs around the world, the less-than-perfect precision of CSA scores, and the difficulty in measuring some relevant domains (e.g., ethics), there will always be a chance that some individuals will enter graduate medical education programs with skill deficiencies. From a criterion-referenced perspective, this suggests that some individuals certified by the ECFMG, regardless of passing status on the CSA, may not possess adequate clinical skills. Alternatively, program directors probably are not using the ECFMG's definition of readiness to enter graduate medical education in ascertaining who is and who is not deficient. It may well be that their expectations of clinical performance are inflated. The fact that significant proportions of USMG residents were also judged to be deficient supports this hypothesis. Nevertheless, it should be noted that the survey was based on residents who entered graduate programs in 1999. In 1999–2000 the ECFMG conducted a number of standards-validation studies. These studies resulted in new performance standards being implemented, effective February 1, 2000. Given that the new performance standards are somewhat higher than the old ones, it would be expected that fewer IMGs with passing CSA scores would now be judged to be deficient. A follow-up survey of program directors is currently being conducted to investigate this possibility.
The comparison of the percentages of USMG and IMG skill deficiencies also yielded some interesting results. Overall, USMGs were much less likely to be judged by program directors to be deficient. Across all skills categories, fewer than 5% of the USMGs were categorized as deficient. Nevertheless, this would still represent a significant proportion of the USMG PGY1 class, and provides some evidence supporting the need for clinical skills assessment in this cohort. The present survey was, however, targeted at residency programs with a minimum of 25% IMGs. Therefore, the representativeness of the USMG sample is questionable. If the USMG sample surveyed as part of this study were of lower ability than the general population, then the percentages of residents categorized as being deficient would likely be overestimates.
Although some PGY1 residents, regardless of country of medical training, were judged to be deficient in one of several clinical skills domains, a large proportion of these individuals improved enough within four months to be considered acceptable. This finding was encouraging in that, regardless of the screening mechanisms (e.g., certification requirements, program selection criteria), there is always the possibility of accepting for graduate training an individual who lacks the requisite skills. Historically, some percentage of resident attrition can be attributed to biomedical knowledge or clinical deficiencies.10,11 Based on the summary data from this investigation it would appear that, at least for the programs that were surveyed, residency training is effective in helping those individuals with notable clinical deficiencies to improve.
Although the survey results presented here provide some additional evidence for the utility and validity of the CSA, there are several limitations associated with the survey method. First, the program directors were asked for a retrospective rating of their interns' initial proficiency. Here, judgments regarding deficiency may have been swayed by relative improvements in the proficiencies of certain individuals. Second, for most IMG residents, program directors would know whether the CSA had been part of the ECFMG certification process. Although actual CSA scores are not reported on transcripts, the potential for biased judgments based on perceived clinical skills proficiency certainly exists. Finally, the program directors were not asked to provide individual data, only summary categorizations based on the defined cohorts. From a design perspective, the ability to identify specific individuals and link their ratings back to actual CSA performances would certainly be more powerful and informative. Nevertheless, based on the programs surveyed and group-level categorizations, individuals with the CSA as part of their ECFMG certification were less likely to be judged to have clinical skills deficiencies at the beginning of their PGY1 year. Follow-up surveys will be useful to further substantiate the long-term positive impact of requiring demonstrable clinical skills proficiency prior to entering graduate medical education training programs in the United States.
1. ECFMG. Clinical Skills Assessment (CSA) Candidate Orientation Manual. Philadelphia, PA: Educational Commission for Foreign Medical Graduates, 1999.
2. Swanson DB, Case SM, Ripkey D, Melnick DE, Bowles LT, Gary NE. Performance of examinees from foreign schools on the basic science component of United States Medical Licensing Examination. Adv Med Educ. 1997;187–90.
3. Ripkey D, Case SM, Swanson DB, Melnick DE, Bowles LT, Gary NE. Performance of examinees from foreign schools on the clinical science component of the United States Medical Licensing Examination. Adv Med Educ. 1997;175–8.
4. Wainer H, Lukhele R. How reliable are TOEFL scores? Educ Psychol Meas. 1997;57:741–58.
5. Brown JD. The relative importance of persons, items, subtests and languages to TOEFL test variance. Language Testing. 1999;16:239–41.
6. Vu NV, Barrows HS. Use of standardized patients in clinical assessments: recent developments and measurement findings. Educ Res. 1994;23(3):23–30.
7. Norcini JJ, Stillman PL, Sutnick AI, et al. Scoring and standard setting with standardized patients. Eval Health Prof. 1993;16:322–32.
8. Swanson DB, Clauser BE, Case SM. Clinical skills assessment with standardized patients in high-stakes tests: a framework for thinking about score precision, equating, and security. Adv Health Sci Educ. 1999;4:67–106.
9. Ziv A, Boulet JR, Burdick WP, Friedman Ben-David M, Gary NE. The use of national medical care surveys to develop and validate test content for standardized patient examinations. In: Melnick D. (ed). Proceedings of the Eighth Ottawa Conference on Medical Education and Assessment. Philadelphia, PA: National Board of Medical Examiners, 2000;1:99–105.
10. Laufenburg H, Turkal N, Baumgardner D. Resident attrition from family practice residencies: United States versus international medical graduates. Fam Med. 1994;26:614–7.
11. Baldwin D. Casualties of residency training: a national study of loss and attrition. In: Proceedings from the 27th Annual Conference on Research in Medical Education. Washington, DC: Association of American Medical Colleges, 1988:112–7.