The resident selection process could be improved if United States Medical Licensing Examination (USMLE) scores obtained during residency application were found to predict success on the American Board of Anesthesiology (ABA) written examination (part 1) or other measures of medical knowledge competency. A correlation between USMLE scores, or scores on its predecessor, the National Board of Medical Examiners (NBME) examination, and residency standardized examination performance has been demonstrated for specialties including internal medicine (IM),1 dermatology,2 orthopedic surgery,2 and physical medicine,2 among others.3–15 For the purpose of this study, we use Mosby's definition of standardized examinations as any empirically developed examination with established reliability and validity as determined by repeated evaluation of the method and results.16
To our knowledge, no study has directly correlated USMLE results with ABA written examination (part 1) scores. A recent publication17 has shown that the ABA/American Society of Anesthesiologists (ASA) In-Training Examination (ITE), taken at the end of the first clinical anesthesia year, serves as a strong predictor of success on the ABA written (part 1) and oral (part 2) board examinations, but this does not aid in resident selection because the results are not available until Postgraduate Year 2 (PGY-2).
We hypothesized that USMLE scores obtained before residency would correlate significantly with ABA/ASA ITE scores as well as ABA written board examination (part 1) performance. The objectives of this study were to compare USMLE performance during medical school to anesthesiology residency standardized examination performance, as well as review the literature for evidence of equally powerful pre-residency predictors of residency standardized examination performance.
In October 2009, after expedited IRB approval and notification of all graduates and current residents in the program, the records of 76 current and former residents of the Anesthesiology program at Wayne State University School of Medicine were reviewed retrospectively for demographics, USMLE scores, Anesthesia Knowledge Test (AKT) ranking, IM ITE percentile ranking, ABA/ASA ITE percentile score, and first-time ABA written board examination scores (CA-3 ABA/ASA ITE results). The records reviewed spanned the residency's graduating classes of 2002 to 2009 (Table 1). One resident was excluded from the study because of unavailable USMLE scores, and another because of leaving the residency program for nonacademic reasons. Five Doctors of Osteopathy were also excluded because they had taken the Comprehensive Osteopathic Medical Licensing Examination rather than the USMLE. The remaining 69 residents comprise the final study group. The USMLE step 1 score, the USMLE step 2 Clinical Knowledge (CK) score, and the average of both USMLE 3-digit scores were used as pre-residency predictors. The residents' gender, class year of graduation, and location of the residents' school of medicine were also recorded (Table 1). Medical school grades were not studied because they were neither uniformly available nor comparable, given the high percentage (68% of the sample) of international graduates. Age was also not examined.
The USMLE step 1 and 2 CK are standardized examinations taken after the second and third years of medical school, respectively. According to the USMLE web site, “Step 1 assesses whether you understand and can apply important, basic concepts of the sciences to the practice of medicine, with special emphasis on principles and mechanisms underlying health, disease, and modes of therapy,” whereas “Step 2 assesses whether you can apply medical knowledge, skills, and understanding of clinical science essential for the provision of patient care under supervision and includes emphasis on health promotion and disease prevention” (Table 2, Ref. 1). The step 1 examination comprises 336 questions spread over 8 hours in a single day, whereas the step 2 CK has 352 questions spread over 9 hours. Three-digit scores mostly range from 140 to 260 with a standard deviation of approximately 20. Exact minimum and maximum scores are not published. The current minimum passing 3-digit scores for the USMLE step 1 and step 2 CK examinations are 188 and 184, respectively, which currently correspond to first-time pass rates of 94% and 96%, respectively, for United States medical doctor examinees (Table 2, Ref. 2). The minimum passing 3-digit score fluctuates with time, but 3-digit scores from 1 year are approximately comparable with 3-digit scores from other years. For the 2009 match, the average 3-digit USMLE step 1 and step 2 CK scores for matched United States applicants to anesthesiology residencies (class of 2014) were 224 and 230, respectively (Table 2, Ref. 2).
The intra-residency outcome measures studied were the AKT-6 and AKT-18, taken 6 and 18 months, respectively, after the start of anesthesiology training; the PGY-1 IM ITE, taken after the general clinical base year; the Clinical Anesthesia Year 1 (CA-1) and CA-2 ABA/ASA ITEs, taken after the first and second years of anesthesiology training, respectively; and the ABA written board examination, taken in CA-3. The AKTs are standardized tests that attempt to gauge residents' knowledge of anesthesiology. The ABA/ASA ITE gauges similar knowledge, and the same examination also serves as the ABA written board examination. All examinations are graded on percentile scales relative to all residents taking the examination that year.
Not all outcome values were available for all subjects. Two residents from the classes of 2005 and 2007 were not able to take the AKT-6 examination because of maternity leave. The AKT-18 examination was not regularly administered until the class of 2005, and 2 residents from the classes of 2008 and 2009 were on vacation at the time of administration. The PGY-1 IM ITE was not regularly administered until the class of 2004. A resident from the class of 2002 missed his CA-1 ABA/ASA ITE because of a leave of absence. The class of 2009 CA-2 ABA/ASA ITE scores were not available at the time of data analysis, and 1 resident from the class of 2006 missed the examination because of a leave of absence. ABA written board examination scores are no longer released to anesthesiology programs, starting with the class of 2008. Three ABA written board examination scores from the class of 2002, 2 from the class of 2003, and 1 from the class of 2004 are also unavailable because of variable reporting of actual score to the training program. Descriptive statistics of outcome measures are reported in Table 3.
Linear regression of each intra-residency outcome measure was performed against pre-residency predictors (USMLE 1 and 2 CK individually and average USMLE 1 and 2 CK) as well as class year (year of resident graduation). Class year was used as a proxy for improvements in resident teaching from year to year, as well as improvements in resident recruitment over time, not accounted for by USMLE score. The written ABA board examination score was also correlated with AKT and ABA/ASA ITE scores to determine the best intra-residency predictor for ABA written board performance. Additionally, multiple regressions were performed for all outcome measures versus the ABA written board, with the goal of determining whether a combination of intra-residency factors could better predict score on the ABA written examination. Condition Index variance was used to look for collinearity. Pearson partial correlations were performed for each outcome measure as well, adjusting for class year and school of medicine location (as depicted in Table 1). Class year correlations were also adjusted, but for the most significant USMLE predictor instead. Normality was confirmed by histogram and probability plot of residuals. A list of analyses is presented with results in Table 4. Simple 1-way analysis of variance was also performed to look for significant differences in average outcome measures with respect to school of medicine location. All analyses were performed using SPSS version 16.0.1 software (SPSS Inc., Chicago, IL).
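The simple regression and single-covariate partial correlation described above can be sketched as follows. This is a minimal illustration in Python rather than the SPSS procedures actually used; the function names and the small data set are hypothetical and are not the study's records. The partial correlation shown adjusts for one covariate only (class year), whereas the analyses in this study also adjusted for school of medicine location.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def partial_r(x, y, z):
    """First-order partial correlation of x and y, adjusting for one covariate z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical illustrative values, NOT data from this study.
usmle_step1 = [195, 204, 210, 218, 225, 231, 240, 247]
ite_percentile = [22, 35, 30, 48, 55, 61, 70, 83]
class_year = [2002, 2003, 2003, 2005, 2006, 2007, 2008, 2009]

slope, intercept = linear_fit(usmle_step1, ite_percentile)
r = pearson_r(usmle_step1, ite_percentile)
r_adjusted = partial_r(usmle_step1, ite_percentile, class_year)

print(f"slope = {slope:.2f} percentile points per USMLE point")
print(f"unadjusted r = {r:.2f}; r adjusted for class year = {r_adjusted:.2f}")
```

Comparing the unadjusted and adjusted coefficients indicates how much of an apparent USMLE association could instead be attributed to the covariate.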
Linear regressions of each intra-residency outcome measure on each pre-residency predictive measure are presented in Table 4. Regressions on the control variable class year are included in each outcome measure section. Both USMLE steps individually, and the averaged USMLE step 1 and step 2 CK scores, correlated significantly with all outcome measures (P < 0.01). USMLE step 1 had the highest sample correlation coefficients for most of the ABA (r = 0.50–0.60) and AKT examinations (r = 0.48–0.57). USMLE step 2 CK showed the highest sample correlation coefficient for the PGY-1 IM ITE (r = 0.62). The USMLE average also correlated significantly with all outcome measures (sample r = 0.43–0.65), with the highest sample correlation coefficient for the second-year ITE. Although the differences were not statistically significant, the slopes of many regressions were highest for the USMLE average. These regressions also had consistently high correlation coefficients relative to those for the individual USMLE steps. The USMLE step 1 had both the highest sample correlation coefficient and the steepest slope for the ABA written board examination. For the CA-2 ABA/ASA ITE, a 20-point increase in the 3-digit USMLE average predicts a 20-percentile point increase in ITE score. A lower but significant increase of 15 percentile points on the ABA written board examination is predicted by a 20-point increase on the USMLE step 1 (Fig. 1).
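The two predicted increases quoted above imply approximate regression slopes, which the short Python sketch below makes explicit. The slopes are back-calculated from the rounded figures in the text (20 percentile points per 20 USMLE points, and 15 per 20), so they are assumptions that may differ slightly from the fitted values in Table 4; the function and variable names are hypothetical.

```python
def predicted_change(slope, delta_usmle):
    """Predicted change in percentile score for a given change in USMLE 3-digit score."""
    return slope * delta_usmle

# Slopes implied by the rounded figures in the text (assumed, not taken from Table 4).
slope_ca2_ite_vs_usmle_average = 20 / 20   # 1.00 percentile point per USMLE point
slope_aba_written_vs_step1 = 15 / 20       # 0.75 percentile point per USMLE point

# Example: two hypothetical applicants whose scores differ by 10 points.
for label, slope in [
    ("CA-2 ABA/ASA ITE vs USMLE average", slope_ca2_ite_vs_usmle_average),
    ("ABA written vs USMLE step 1", slope_aba_written_vs_step1),
]:
    change = predicted_change(slope, 10)
    print(f"{label}: {change:.1f} percentile points per 10 USMLE points")
```

A linear prediction of this kind describes the average trend only; with correlation coefficients of about 0.5 to 0.65, individual residents can be expected to deviate substantially from it.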
A stronger prediction of ABA written board examination performance was found using intra-residency predictors (Table 5). The correlation between CA-2 ABA/ASA ITE and ABA written board examination scores returned the highest sample r of 0.77, with a corresponding slope of 0.76. Addition of the AKT-6 to the model using multiple regression increased the sample correlation coefficient to 0.81, with CA-2 ITE and AKT-6 slopes of 0.59 and 0.32, respectively. There are no statistically significant differences between the slopes of these comparisons.
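To put these coefficients in perspective, the Python sketch below converts the reported correlation coefficients into the proportion of variance explained (r squared) and applies the two reported multiple-regression slopes to a difference in predictor scores. The intercept of the model is not reported here, so only differences between residents can be predicted, not absolute scores; the function names are hypothetical.

```python
def variance_explained(r):
    """Proportion of outcome variance accounted for by a model with correlation r."""
    return r * r

def predicted_difference(delta_ca2_ite, delta_akt6, b_ite=0.59, b_akt=0.32):
    """Predicted difference in ABA written percentile between two residents,
    using the reported multiple-regression slopes (intercept not needed)."""
    return b_ite * delta_ca2_ite + b_akt * delta_akt6

r_single = 0.77    # CA-2 ABA/ASA ITE alone
r_multiple = 0.81  # CA-2 ABA/ASA ITE plus AKT-6

print(f"CA-2 ITE alone explains about {variance_explained(r_single):.0%} of variance")
print(f"CA-2 ITE plus AKT-6 explains about {variance_explained(r_multiple):.0%} of variance")
print(f"10 percentile points higher on both predictors: "
      f"{predicted_difference(10, 10):.1f} points higher predicted ABA written score")
```

On this reading, adding the AKT-6 raises the variance explained from roughly 59% to roughly 66%, a modest gain over the CA-2 ITE alone.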
To account for improvements in residency teaching and residency selection not related to USMLE score, class year was analyzed as a separate predictor. Significant correlations were found with most outcome measures, but after the Pearson partial correlations were adjusted for the best USMLE correlate, none of these correlations remained significant (Table 4).
Our results show that USMLE step 1 and step 2 CK scores, both individually and averaged, correlated significantly with the first ABA written part 1 scores of CA-3 anesthesiology residents. We conclude that one could use pre-residency USMLE scores in resident selection for anesthesiology training if one wished to improve a program's average examination performance. We found that changing our own residency selection process to include a soft cutoff for ranking for interview at an averaged USMLE score of 215 has increased our program's average CA-2 ABA/ASA ITE score for the 2007 and 2008 classes to 65%, considerably above the 43% average score seen with our 2003 and 2004 classes. The 2007 and 2008 classes had a mean USMLE average score of 219, whereas the 2003 and 2004 classes averaged 194 on those standardized tests. All residents in the 2007 and 2008 classes passed the ABA written part 1 examination, whereas only 50% of our 2003 and 2004 classes passed this examination on their first attempt.
Our results confirm and extend the findings of McClintock and Gravlee,17 showing that intra-residency test results most strongly predict written ABA board examination performance. Multiple regression using CA-2 ABA/ASA ITE and AKT-6 results together gave our highest correlation coefficient of 0.81. However, these authors17 did not consider pre-residency test results in their analyses and did not address residency selection.
The PGY-1 IM ITE was added to our program in 2004 and discontinued in 2008. The goal was to assess the efficacy of the PGY-1 year in instilling knowledge of IM principles in our graduate students. The correlation of USMLE scores with PGY-1 IM ITE scores is similar to the correlation of USMLE scores with ABA/ASA ITE scores. We also found no evidence in our literature review that the USMLE is better at predicting general medicine examination performance than specialty examination performance. We believe this supports the point that the USMLE is not simply a static picture of general medical knowledge; rather, the USMLE, a general medicine examination, predicts the ability to learn specialty-specific medical knowledge and reproduce that knowledge on future examinations.
We also examined the literature to determine whether there were any possible substitutes for USMLE results as predictors of future residency standardized test performance or medical knowledge competency. For purposes of review, we considered ITE performance as well as board examination performance as end-point measures of medical knowledge competency. The Accreditation Council for Graduate Medical Education (ACGME) considers the ABA part 1 (written) and part 2 (oral) of such importance that one of their few current explicit requirements is for 70% of every anesthesiology program's residents to pass both parts of the ABA board (Table 2, Ref. 3).
The relationship between undergraduate college education and medical school performance is applicable to this comparison. A rigorous study of outcomes of 4076 medical students has shown that the Medical College Admission Test predicts medical school grades and USMLE scores with more accuracy than college grades alone.18 This was confirmed in other graduate specialties in a review of meta-analyses by Kuncel and Hezlett,19 published in Science. The study also found that standardized tests do not demonstrate bias and are not damaged by test coaching. It is not surprising that USMLE's correlations with residency performance mirror the Medical College Admission Test's correlation with medical school performance.
Large studies in IM (n = 117, r = 0.57),1 dermatology (n = 204, r = 0.48–0.49), orthopedic surgery (n = 623, r = 0.59–0.68), and physical medicine (n = 252, r = 0.63–0.68) have noted moderate to strong correlations between NBME (precursor to USMLE) and specialty board performance.2 Although the examination has been changed significantly since these studies, there is a strong correlation between NBME and USMLE. It is reasonable to assume that these correlations are still valid.4,20 Since then, 5 smaller studies were found that have correlated USMLE to board performance in radiology (n = 77, r = 0.82),5 orthopedic surgery (n = 64, r = 0.38),6 (n = 46, r = 0.59),21 general surgery (n = 26, r = 0.82),22 and pediatrics (n = 70, r = 0.67).7 Some studies did not find a correlation, but they were either of low power (orthopedic surgery, n = 36)23 or did not perform a linear regression with actual score (radiology,9 orthopedic surgery10).
Other researchers have suggested that medical school grades, subjective clinical evaluations, or Alpha Omega Alpha (AOA) honor society membership could be more reliable for predicting board performance than USMLE results, but there is significant evidence to dispute this. Studies that correlated clinical or preclinical grades found a lower but positive correlation with specialty board performance compared with USMLE.5,22 Other studies show that AOA status,5,22,24 medical school prestige,5 subjective faculty evaluations,12,21,24,25 class rank,26 dean's letter,5 and interview score22 do not correlate with board or ITE performance. One large study showed a small significant increase in board performance with AOA, but USMLE score was not analyzed concurrently.13 Another showed a weak AOA correlation, but it was much weaker than found with USMLE or third-year medical school grades.10 Reference letter evaluation was shown to correlate with board performance in 1 small study, but did not correlate with ITE performance in the same study.22 An earlier larger study found no significant correlation.5 A review article of methods predicting future resident performance concluded “letters of recommendations, clerkships, and prior experiences show poor correlation with future residency success.”14 Although USMLE, and to a lesser extent unstandardized clinical grades, correlate well with residency standardized test performance, other factors such as motivation, self study, and structured didactics likely are equally important.15,27
We realize that standardized tests are not the only measures of resident performance during residency. To that end, we also reviewed the literature for comparisons of USMLE scores with global faculty evaluations of residents, one of the most frequently used evaluation formats. These comparisons yielded inconclusive results.3,5,22,28–31 The uncertain results could be attributable to the low predictive power of the USMLE, or they could reflect the dubious value of unstandardized global faculty evaluations. An emergency medicine literature review of clerkship evaluations noted this and concluded that limited information is gained from clinical grades of outside institutions.14 Another study of global faculty evaluations of residents by radiologists found poor prediction of performance on the American College of Radiology ITE.32 The ACGME agrees that global (faculty) ratings may be highly subjective, biased, unreliable, and not reproducible without training (Table 2, Ref. 4). The Association of American Medical Colleges and the ACGME need to address the lack of standardized, graded measures of medical student and resident performance in areas other than medical knowledge. It is our opinion that standardized testing in these areas, if started in medical school, would help to drive the process of transformation of the nation's health care system as currently under debate in Washington.33
The limitations of this study include its retrospective nature, a moderate sample size, missing data for some outcome measures, and a high percentage of international graduates. We have not specifically examined the effect of prior anesthesiology training on the outcomes, although a number of our international medical graduates have had such training. This may be a confounder. We also rely on literature review for our conclusions related to grades and other non-USMLE predictors of residency performance, which we did not test in this cohort.
The USMLE is a nonbiased, valid, and reliable measure of medical student knowledge. As shown here, it is also a moderate to strong predictor of performance on residency ITE and written board examinations for anesthesiology. Literature review shows it is also likely to be the most powerful predictor of future performance on residency standardized tests for many other specialties. Residency programs can likely increase their average performance on board examinations by increasing the importance of USMLE step 1 and 2 scores in resident selection. However, we do not advocate choosing residents solely based on USMLE. Although untested by us, USMLE is likely not as useful for prediction of performance other than medical knowledge competency. Larger studies would be beneficial to better discriminate other useful predictors of residency success.
The authors acknowledge Kirk A. Easley, MS, for assistance with statistical analysis and thank Franklin Dexter, MD, PhD, and the reviewers for substantive constructive criticism.
1. Sosenko J, Stekel KW, Soto R, Gelbard M. NBME examination part I as a predictor of clinical and ABIM certifying examination performances. J Gen Intern Med 1993;8:86–8
2. Case SM, Swanson DB. Validity of NBME part I and part II scores for selection of residents in orthopaedic surgery, dermatology, and preventive medicine. Acad Med 1993;68:S51–6
3. Berner ES, Brooks CM, Erdmann JB. Use of the USMLE to select residents. Acad Med 1993;68:753–9
4. Becker DF, Swanson DB, Case SM, Nungester RJ. Results of the initial administrations of the NBME comprehensive part I and part II examinations. Acad Med 1992;67:S16–8
5. Boyse TD, Patterson SK, Cohan RH, Korobkin M, Fitzgerald JT, Oh MS, Gross BH, Quint DJ. Does medical school performance predict radiology resident performance? Acad Radiol 2002;9:437–45
6. Klein GR, Austin MS, Randolph S, Sharkey PF, Hilibrand AS. Passing the boards: can USMLE and orthopaedic in-training examination scores predict passage of the ABOS part-I examination? J Bone Joint Surg Am 2004;86-A:1092–5
7. McCaskill QE, Kirk JJ, Barata DM, Wludyka PS, Zenni EA, Chiu TT. USMLE step 1 scores as a significant predictor of future board passage in pediatrics. Ambul Pediatr 2007;7:192–5
8. Rifkin WD, Rifkin A. Correlation between housestaff performance on the United States medical licensing examination and standardized patient encounters. Mt Sinai J Med 2005;72:47–9
9. Gunderman RB, Jackson VP. Are NBME examination scores useful in selecting radiology residency candidates? Acad Radiol 2000;7:603–6
10. Turner NS, Shaughnessy WJ, Berg EJ, Larson DR, Hanssen AD. A quantitative composite scoring tool for orthopaedic residency screening and selection. Clin Orthop Relat Res 2006;449:50–5
11. Carmichael KD, Westmoreland JB, Thomas JA, Patterson RM. Relation of residency selection factors to subsequent orthopaedic in-training examination performance. South Med J 2005;98:528–32
12. Wade TP, Andrus CH, Kaminski DL. Evaluations of surgery resident performance correlate with success in board examinations. Surgery 1993;113:644–8
13. Amos DE, Massagli TL. Medical school achievements as predictors of performance in a physical medicine and rehabilitation residency. Acad Med 1996;71:678–80
14. Balentine J, Gaeta T, Spevack T. Evaluating applicants to emergency medicine residency programs. J Emerg Med 1999;17:131–4
15. Godellas CV, Huang R. Factors affecting performance on the American Board of Surgery in-training examination. Am J Surg 2001;181:294–6
16. Mosby. Mosby's Medical Dictionary. 8th ed. St. Louis: Mosby, 2009
17. McClintock JC, Gravlee GP. Predicting success on the certification examinations of the American Board of Anesthesiology. Anesthesiology 2010;112:212–9
18. Julian ER. Validity of the Medical College Admission Test for predicting medical school performance. Acad Med 2005;80:910–7
19. Kuncel NR, Hezlett SA. Assessment: standardized tests predict graduate students' success. Science 2007;315:1080–1
20. Case SM, Becker DF, Swanson DB. Relationship between scores on NBME basic science subject tests and the first administration of the newly designed NBME part I examination. Acad Med 1992;67:S13–5
21. Thordarson DB, Ebramzadeh E, Sangiorgio SN, Schnall SB, Patzakis MJ. Resident selection: how we are doing and why? Clin Orthop Relat Res 2007;459:255–9
22. Brothers TE, Wetherholt S. Importance of the faculty interview during the resident application process. J Surg Educ 2007;64:378–85
23. Dirschl DR, Campion ER, Gilliam K. Resident selection and predictors of performance: can we be evidence based? Clin Orthop Relat Res 2006;449:44–9
24. Schwartz RW, Donnelly MB, Sloan DA, Johnson SB, Strodel WE. Assessing senior residents' knowledge and performance: an integrated evaluation program. Surgery 1994;116:634–7
25. Hawkins RE, Sumption KF, Gaglione MM, Holmboe ES. The in-training examination in internal medicine: resident perceptions and lack of correlation between resident scores and faculty predictions of resident performance. Am J Med 1999;106:206–10
26. Papp KK, Polk HC Jr, Richardson JD. The relationship between criteria used to select residents and performance during residency. Am J Surg 1997;173:326–9
27. Philip J, Whitten CW, Johnston WE. Independent study and performance on the anesthesiology in-training examination. J Clin Anesth 2006;18:471–3
28. Daly KA, Levine SC, Adams GL. Predictors for resident success in otolaryngology. J Am Coll Surg 2006;202:649–54
29. Borowitz SM, Saulsbury FT, Wilson WG. Information collected during the residency match process does not predict clinical performance. Arch Pediatr Adolesc Med 2000;154:256–60
30. Bell JG, Kanellitsas I, Shaffer L. Selection of obstetrics and gynecology residents on the basis of medical school performance. Am J Obstet Gynecol 2002;186:1091–4
31. Smith SR. Correlations between graduates' performances as first-year residents and their performances as medical students. Acad Med 1993;68:633–4
32. Wise S, Stagg PL, Szucs R, Gay S, Mauger D, Hartman D. Assessment of resident knowledge: subjective assessment versus performance on the ACR in-training examination. Acad Radiol 1999;6:66–71
33. Marsh HM. Transforming Healthcare. Detroit Medical News, Vol. XCIX, No. 5:6–9, 2009
RCG and HMM helped with study design, conduct of study, data collection, data analysis, and manuscript preparation; KR and EJC helped with conduct of study and manuscript preparation.