Throughout the course of medical education and training, aspiring physicians’ application of knowledge is tested multiple times. The Medical College Admission Test (MCAT) is required for admission to all U.S. MD-granting medical schools. Undergraduate grade point average (GPA), both overall and in the sciences, is also crucial for medical school admission. Following medical school matriculation, students must pass 3 successive steps of the United States Medical Licensing Examination (USMLE) to obtain a full medical license. Step 1 is often completed before beginning the third year of medical school, and the 2 parts of Step 2 are often completed in the fourth year. For international medical graduates (IMGs), passing Step 1 and both parts of Step 2 are requirements for certification by the Educational Commission for Foreign Medical Graduates (ECFMG), which is necessary for entrance into Accreditation Council for Graduate Medical Education (ACGME)-accredited residency programs. Step 3 is usually taken early in residency but can be completed any time before independent practice. These scores, except Step 3 scores, form the core of the objective standardized assessments available to residency directors when making decisions regarding whom to accept into their programs. Despite these examinations being required components of the medical education landscape for decades, it remains unknown which are the best predictors of board certification examination performance and eventual board certification. 
Family medicine residency program directors are especially concerned about their potential residents’ ability to pass the American Board of Family Medicine (ABFM) Family Medicine Certification Examination as, during the time of our study (2008–2012), the ACGME required that family medicine residencies have a rolling 5-year 90% ABFM certification examination pass rate and a rolling 5-year 95% take rate.1 Having a low certification examination pass rate is one of the most common citations received by family medicine residencies2 and may be seen as a marker of low-quality training.
Most prior studies of the relationships among standardized medical education assessments have examined the association between USMLE Step examinations and board certification examinations. (Throughout this paper, the term medical education assessments, or simply assessments, includes undergraduate GPA, the MCAT exam, the USMLE Step examinations, and the ABFM in-training examinations [ITEs].) Many also include scores from ITEs, which are offered by either a relevant specialty society or a certifying board and are taken each year during residency. Most of these studies were based on relatively few participants from single institutions and did not include all medical education assessments.
We identified one study that included the relationship between the MCAT and board certification in any specialty. Among 1,155 graduates of the Uniformed Services University of the Health Sciences medical school from 1995 to 2002, the investigators found weak correlations (r = −0.06 to 0.12) between premedical and medical school evaluations and board certification results.3 Adjusted analyses were performed, but the authors stated only that “the correlation between [the] MCAT and Step 1 and Step 2 were significant and meaningful.”3 There was no significant difference in MCAT scores between groups that did and did not obtain board certification.
Step 1 and Step 2 Clinical Knowledge (CK) scores have been shown to correlate with ITE scores and are often used by program directors as data points in selecting residents. In a small single-institution study, Step 2 CK scores had a stronger correlation (r2 = 0.70 to 0.79) with internal medicine ITE scores than Step 1 and Step 3 scores (r2 = 0.37 to 0.55).4 Lower correlations (r2 = 0.16 to 0.44) were found between Step 1 and Step 2 CK scores and ITE scores in a single-institution study of emergency medicine residents.5 Another study found that Step 1 scores were correlated with dermatology ITE performance (r2 = 0.47 to 0.54).6 In adjusted analyses from a single institution, Step 1 (β = 0.19), Step 2 CK (β = 0.23), and Step 3 (β = 0.19) scores were all independently predictive of internal medicine ITE scores.7 A survey of 86 physical medicine and rehabilitation residents found that performance on the ITE was more strongly associated (r2 = 0.58) with board certification examination score than passing each Step examination on the first attempt (r2 = 0.30), but no adjusted analyses were performed.8
USMLE scores have also been shown to predict board certification examination scores in multiple specialties. For example, in anesthesiology, USMLE scores predicted certification examination performance in recent graduates from 2002 to 2007, with correlations ranging from 0.53 to 0.59 for the 3 steps.9 Similarly, at 4 residency programs, USMLE and ITE scores were both correlated (r2 = 0.50 and 0.53) with orthopedic certification examination performance.10 A study based on a single orthopedic program revealed a stronger correlation between 3 successive years of ITE scores and certification examination performance (r2 = 0.49 to 0.69) than between Step 1 and certification examination performance (r2 = 0.38).11 None of these studies performed adjusted analyses to determine which examination was a better predictor of board certification examination performance. In contrast, a single-institution study of pediatric board certification examination performance found that ITE scores added little information beyond what could be predicted from Step 1 scores in an adjusted analysis.12 However, another single-institution study found that, among internal medicine residents, both USMLE and ITE performances were independently correlated with certification examination performance.13 A single-institution study of general surgery residents found that higher Step 2 CK scores were predictive of passing board certification examinations but that higher Step 1 and ITE scores were not.14 A multi-institutional study found that lower ITE and Step 1 scores independently predicted failing general surgery certification examination scores.15 Finally, a national study found that higher Step 1 and Step 2 CK scores were both correlated with higher orthopedic board certification examination scores and with a greater likelihood of passing.16
Studies using national cohorts and multiple years of data have demonstrated that ITE scores taken later in residency are better predictors of certification examination scores than ITE scores from early in residency.17,18 One study found that ITE scores were moderately correlated with obstetrics and gynecology board certification performance.19 In adjusted analyses, infectious disease ITE scores were more strongly predictive of infectious disease certification performance than prior internal medicine certification examination, Step 1, or Step 3 scores.20
In family medicine, 2 prior studies have demonstrated the predictive validity of the ABFM ITE for ABFM certification examination performance. Using data from 1986 and 1987, the correlation between ABFM ITE and ABFM certification examination scores and subscores ranged from 0.67 to 0.75.21 A more recent analysis using data from 2010 to 2013 found a correlation of 0.71 between the third-year ITE and certification examination scores and correlations of 0.67 to 0.75 between sequential ITEs (e.g., between the second- and third-year ITEs).22
A major weakness of the extant literature is the absence of a comprehensive evaluation of the independent relationships of these standardized assessments to board certification examination performance and eventual board certification. Moreover, to our knowledge, no studies have incorporated Step 2 Clinical Skills (CS) scores. Such knowledge may assist program directors in evaluating which residents to accept into their programs and in identifying residents who may struggle to acquire the requisite medical knowledge to obtain board certification. Therefore, the objective of our study was to evaluate the associations of all required standardized assessments in medical education with ABFM certification examination scores and eventual ABFM certification. We hypothesized that examinations more proximal to the certification examination would be more strongly associated with certification examination scores and eventual certification.
We identified all physicians graduating from U.S. MD-granting family medicine residency programs from 2008 to 2012 using ABFM administrative data. These physicians were then matched via Data Commons to physician identifiers in the Association of American Medical Colleges (AAMC), National Board of Medical Examiners (NBME), Federation of State Medical Boards, and ECFMG databases. After this matching was complete, a unique study ID was created for each physician and sent (along with that organization’s physician identifiers) to the AAMC, NBME, and ABFM, each of which supplied their respective data to one of the authors, using only the unique study ID. We then linked all the data together using the physician’s unique study ID, creating a deidentified dataset for analysis.
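The linkage strategy described above can be illustrated with a small sketch. All identifiers, field names, and values below are invented stand-ins (they are not actual Data Commons identifiers or study data); the sketch only shows how rekeying each organization's records by a shared study ID yields a merged, deidentified dataset:

```python
# Hypothetical crosswalks mapping each organization's internal physician
# identifier to the shared deidentified study ID (all values invented).
crosswalk_abfm = {"ABFM-001": "S1", "ABFM-002": "S2"}
crosswalk_aamc = {"AAMC-9": "S1", "AAMC-8": "S2"}

# Each organization supplies its data keyed by its own identifier.
abfm_records = {"ABFM-001": {"ite_y3": 480}, "ABFM-002": {"ite_y3": 510}}
aamc_records = {"AAMC-9": {"mcat_total": 31}, "AAMC-8": {"mcat_total": 28}}

def deidentify(records, crosswalk):
    """Rekey an organization's records by study ID, dropping its identifiers."""
    return {crosswalk[org_id]: fields for org_id, fields in records.items()}

# Merge every organization's deidentified table on the study ID alone,
# so no organization-specific identifier survives into the analysis dataset.
merged = {}
for table in (deidentify(abfm_records, crosswalk_abfm),
              deidentify(aamc_records, crosswalk_aamc)):
    for study_id, fields in table.items():
        merged.setdefault(study_id, {}).update(fields)

print(merged["S1"])  # {'ite_y3': 480, 'mcat_total': 31}
```

The design choice here mirrors the study's privacy safeguard: each organization sees only its own identifiers plus the study ID, and the analysts see only the study ID.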
ABFM data for each physician included certification examination score and result (i.e., passed or failed), ITE scores, certification status (as of December 31, 2014), degree type, IMG or U.S. medical graduate (USMG) status, and age. From 2008 to 2012, the certification examination was on the same scale, and scores were comparable across each year. That is, scores were reported from 200 to 800, with 390 being the cutoff for passing. ABFM board certification is granted when a physician successfully meets 3 criteria: (1) graduation from an ACGME-accredited residency program, (2) obtaining a full and unrestricted medical license, and (3) passing the ABFM certification examination. We chose December 31, 2014, as a cutoff date for certification to allow at least 2 years post residency for each cohort to attempt the ABFM certification examination. The ABFM ITE is administered in the fall of each year of residency; nearly all programs, and all residents within those programs, participate. The ABFM ITE scores are on the same scale as the certification examination and range from 200 to 800. A dummy variable for residency program was included to enable adjustment for program-level effects.
Data provided by the AAMC for each physician included MCAT section scores (see below), undergraduate GPA, gender, U.S. citizenship status (upon entering medical school), and country of medical school. During our study period, the MCAT had 3 multiple-choice sections (biological sciences, physical sciences, and verbal reasoning) that include both passage-based and stand-alone questions. Each section reports results on a scale that ranges from 1 to 15, with the total MCAT score reflecting the sum of these 3 section scores and ranging from 3 to 45. Not considered in these analyses are scores from a writing sample, which was also part of the MCAT during this period. The MCAT is a standardized test that assesses foundational knowledge of science concepts and reasoning skills needed for entry into medical school. The version of the MCAT analyzed in this study measured knowledge of undergraduate introductory-level biology, chemistry, and physics concepts along with information-processing skills.
NBME data for each physician included USMLE scores. The USMLE Step sequence includes 4 separate examinations. Step 1 is a single-day multiple-choice-based test focusing on the basic sciences with scores reported on a 3-digit scale (range: 48–269); passing scores ranged from 185 to 188 during our study. Step 2 CK follows the same general format as Step 1 and assesses the clinical knowledge necessary for supervised patient care. Step 2 CK scores are reported on a 3-digit scale (range: 58–286), with passing scores ranging between 184 and 196 during our study. Step 2 CS is a single-day standardized patient–based examination designed to assess the examinee’s ability to gather information from patients, perform physical examinations, and communicate findings to patients and colleagues. Three separate pass/fail decisions are made based on scores from this examination: (1) integrated clinical encounter (this score assesses the examinee’s ability to perform a focused physical examination, gather information about the patient’s history, and integrate the resulting information into a structured patient note), (2) communication and interpersonal skills, and (3) spoken English proficiency. The Step 2 CS scores reported in this study are derived from the research scale (0–100), which is not shared with the examinee, who only receives a pass or fail decision. During our study period, 56 was the passing cutoff for integrated clinical encounter, 63 for communication and interpersonal skills, and 60 for spoken English proficiency. Finally, Step 3 is a 2-day assessment of the examinee’s ability to apply medical knowledge and understanding of the biomedical and clinical sciences essential for unsupervised practice, which includes multiple-choice items and computer case simulations. Step 3 scores are reported on a 3-digit scale (range: 118–268); during our study, the passing score ranged between 187 and 190.
When multiple scores were available for the same examination for the same person, we used the results from the initial attempt. This strategy allowed us to focus the study on the outcomes of training and education instead of the outcomes of review or study courses taken to raise initially low scores.

First, we used descriptive statistics to characterize our study cohort. Next, we tested for adjusted associations between assessment scores (i.e., GPA, MCAT, USMLE, and ITE scores) and ABFM certification examination scores using generalized linear regression models. As we hypothesized that assessment results more proximal to the certification examination would be more strongly predictive of board certification examination performance and eventual board certification, we ran successive nested models that started with MCAT score and undergraduate GPA (model 1), then added all USMLE scores (model 2), then all ITE scores (model 3). Each model controlled for age and gender, but due to missing data for study cohorts from the various data sources, we were unable to control for other personal characteristics, such as degree type, race and ethnicity, and citizenship status. Each model also included a dummy variable for residency to correct for the clustering of residents by residency program. Finally, we examined the associations between assessment scores and board certification status (as of December 31, 2014) using generalized logistic regression models that followed the same strategy as described above but that modeled failure to obtain board certification as a dichotomous outcome. As many IMGs did not attend a medical school that required the MCAT, we ran separate models for IMGs and USMGs (i.e., we ran all models for USMGs and models 2 and 3 for IMGs). All analyses were conducted using SAS 9.4 (SAS Institute Inc., Cary, North Carolina).
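The nested-model strategy can be sketched as follows. This is a minimal illustration using synthetic data and ordinary least squares; the actual analysis used generalized linear and logistic models in SAS and additionally adjusted for age, gender, and residency-program dummies, none of which are reproduced here. All variable names and effect sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for assessment scores (hypothetical scales and effects).
mcat = rng.normal(30, 4, n)
gpa = rng.normal(3.5, 0.3, n)
step1 = 0.5 * mcat + rng.normal(220, 15, n)        # later exam partly reflects earlier one
ite3 = 0.3 * step1 + rng.normal(400, 40, n)        # third-year ITE, closest to certification
cert = 0.2 * mcat + 0.3 * step1 + 0.5 * ite3 + rng.normal(0, 20, n)

def fit_ols(y, predictors):
    """Ordinary least squares with an intercept; returns coefficients and in-sample R^2."""
    X = np.column_stack([np.ones(len(y))] + predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, 1 - resid.var() / y.var()

# Model 1: premedical predictors only (MCAT, GPA).
_, r2_m1 = fit_ols(cert, [mcat, gpa])
# Model 2: add licensing examination scores.
_, r2_m2 = fit_ols(cert, [mcat, gpa, step1])
# Model 3: add the in-training examination.
_, r2_m3 = fit_ols(cert, [mcat, gpa, step1, ite3])

# Each nested model explains at least as much variance as the one before it;
# comparing coefficients across models shows how proximal assessments
# attenuate the associations of more remote ones.
print(r2_m1 <= r2_m2 <= r2_m3)  # True
```

The point of the nesting is not the R² itself but watching how the coefficients on MCAT and GPA shrink once the later, more proximal scores enter the model, which is the attenuation pattern reported in the Results.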
The research protocol for this study was approved by the American Institutes for Research Institutional Review Board.
We identified 15,902 graduates of 429 U.S. MD-granting family medicine residency programs between 2008 and 2012 from ABFM data, of whom 92.1% (14,648/15,902) obtained board certification (Table 1). Of the overall cohort, 39.9% (6,347/15,902) were IMGs, with 35.7% (2,268/6,347) of these being U.S. citizens upon entering medical school. IMGs were older and had a lower undergraduate GPA, lower MCAT scores, lower USMLE scores, and lower ABFM ITE and certification examination scores compared with USMGs. However, a slightly larger proportion of IMGs obtained board certification as compared with USMGs (93.2% [5,916/6,347] vs 91.4% [8,731/9,555]).
In linear regression models with USMGs, when only MCAT scores and GPA were included, MCAT scores were more strongly associated with increased ABFM certification examination scores (β = 6.17 to 8.91) than GPA (β = 0.60; Table 2). Including USMLE scores attenuated the strength of association of MCAT scores with ABFM board scores, leaving only the MCAT physical sciences score with a significant association. As USMLE scores are reported in 3 digits, the associations are small but suggest a stronger association with more recent assessments, as the β coefficient increased from 0.63 for Step 1 to 1.78 for Step 3. Step 2 CS communication and interpersonal skills, spoken English proficiency, and integrated clinical encounter scores were not predictive of ABFM board scores. In models including ITE scores, the strength of associations of more remote assessments generally decreased. For IMGs, a similar pattern held, except that Step 1 had a stronger association with ABFM board score than Step 2 CK in both models. Interestingly, better scores on the Step 2 CS spoken English proficiency component were negatively associated with ABFM board scores for both USMGs and IMGs in both models that included USMLE scores.
In logistic regression models predicting failure to obtain ABFM board certification among USMGs, when only MCAT score and GPA were included, a higher score was generally associated with lower odds of noncertification (Table 3). When USMLE scores were added, only physical sciences remained negatively associated while verbal reasoning became positively associated with failure to obtain certification. Increases in scores on Step 1, Step 2 CK, Step 2 CS integrated clinical encounter, and Step 3 were associated with lower odds of noncertification. When adding ITE scores, only MCAT verbal reasoning score retained significance and Step 1 lost significance; only the ITE score from the third year of residency had an association with noncertification. For IMGs, only Step 2 CK, Step 2 CS integrated clinical encounter, and Step 3 scores were associated with eventual noncertification. Adding ITE scores attenuated these associations, and ITEs from all 3 years of residency were significantly associated with lower odds of noncertification.
USMLE Step 2 CK was predictive of certification examination scores and certification status in all models in which it was included.
Using data from over 15,000 U.S. MD-granting family medicine residents from 5 graduating cohorts (2008–2012) as well as a unique set of data that linked together multiple medical education assessments, we found that more recent assessments generally had stronger associations with ABFM board certification examination scores and eventual board certification than more remote ones. These findings held for both IMGs and USMGs.
Except for parts of Step 2 CS (spoken English proficiency, communication and interpersonal skills) and the 3 MCAT sections, all the other assessments tested application of medical knowledge. While foundational to being a competent physician, medical knowledge is only 1 of 6 core competencies in medical education. It is possible that skills such as systems-based practice, interpersonal and communication skills, practice-based learning and improvement, and professionalism may be more predictive of success in medicine than medical knowledge. It is therefore not surprising that the few required, standardized assessments focusing on these other competencies generally showed little significant correlation with scores on a board certification examination that tests the application of medical knowledge. This idea is supported by a recent study of family medicine milestones that found low correlations between ITE scores and milestone assessments in the nonmedical knowledge competencies.23 However, our finding that scores on the integrated clinical encounter component of Step 2 CS, which is designed to measure data gathering and data interpretation skills, were associated with attaining board certification provides some evidence to support the validity of that component. Unfortunately, while potentially useful for selection, program directors do not have access to Step 2 CS scores. Further, passing the ABFM certification examination is not sufficient to obtain certification. A physician must also obtain a full and unrestricted medical license and have graduated from an ACGME-accredited residency program, where the program director attests to the certifying board that the resident has met all core competencies.
While our analysis may aid program directors in resident selection, others have cautioned against using USMLE scores for this purpose.24 Given that our data suggest that assessments of nonmedical knowledge generally have a low correlation with board certification examination performance, one could question whether the board certification examination completely measures physician competency. It should be noted, however, that application of knowledge is fundamental to the practice of medicine and that board certification has been associated with better patient outcomes.25–28 Moreover, other competencies (e.g., interpersonal and communication skills) are assessed in residency programs and are currently being highlighted as part of ongoing changes to board certification examinations.29 The USMLE Step examinations were not designed to predict the likelihood of success in residency but rather to evaluate whether the physician has mastered the minimum medical knowledge and skill necessary for unsupervised practice. While medical knowledge and skill are foundational to the practice of medicine, these alone cannot predict success in residency or in practice.
We used a unique platform, Data Commons, which can link data from multiple organizations, to conduct our research. This provided us with a way to link data from the premedical experience, medical school experience, residency training, and potentially practice outcomes of cohorts of physicians. To help residencies improve curricula, such data could be linked to additional practice outcomes to evaluate training outcomes.30,31 The ability to link databases from multiple organizations, especially over time, is necessary for longitudinal investigations of factors associated with quality medical education and practice.
Our study is not without limitations. First, we only assessed the relationships between assessments in one specialty and during a single time period; these may not generalize to other specialties or to current graduating cohorts. Second, the MCAT has changed its test blueprint to include more behavioral health, social science, and scientific reasoning concepts. While the MCAT still tests biological science, physical science, and verbal reasoning concepts, this new blueprint may restrict the application of our findings to current residents. Third, the ability to earn a high GPA likely varies greatly by institution and major, which may confound the results for this variable. Finally, we lacked data on physicians’ upbringing (e.g., race and ethnicity, family socioeconomic status), which may impact examination performance.
Using a large, robust dataset on over 15,000 U.S. MD-granting family medicine residency graduates over 5 years (2008–2012), we found that more proximal assessments of medical knowledge better predict ABFM board certification scores and eventual certification than earlier assessments. Thus, relying solely on medical school admissions measures (GPA and MCAT) and licensure (USMLE) scores for resident selection may not adequately predict ultimate board certification. Further, adding to the validity argument supporting the certification examination, assessments of medical knowledge did not correlate with assessments focused on other core competencies, suggesting that other types of assessments are needed to obtain a full picture of physician competency.
1. Accreditation Council for Graduate Medical Education. ACGME Program Requirements for Graduate Medical Education in Family Medicine. Chicago, IL: Accreditation Council for Graduate Medical Education; 2014.
2. Carek PJ, Lieh-Lai M, Anthony E. Talk presented at: Program Directors Workshop; March 29, 2015; Kansas City, MO. http://www.acgme.org/Portals/0/PDFs/120_PDW_RPS_2015_FINAL.pdf. Accessed February 25, 2020.
3. Durning SJ, Dong T, Hemmer PA, et al. Are commonly used premedical school or medical school measures associated with board certification? Mil Med. 2015;180(suppl 4):18–23.
4. Perez JA Jr, Greer S. Correlation of United States Medical Licensing Examination and Internal Medicine In-Training Examination performance. Adv Health Sci Educ Theory Pract. 2009;14:753–758.
5. Thundiyil JG, Modica RF, Silvestri S, Papa L. Do United States Medical Licensing Examination (USMLE) scores predict in-training test performance for emergency medicine residents? J Emerg Med. 2010;38:65–69.
6. Fening K, Vander Horst A, Zirwas M. Correlation of USMLE Step 1 scores with performance on dermatology in-training examinations. J Am Acad Dermatol. 2011;64:102–106.
7. McDonald FS, Zeger SL, Kolars JC. Associations between United States Medical Licensing Examination (USMLE) and Internal Medicine In-Training Examination (IM-ITE) scores. J Gen Intern Med. 2008;23:1016–1019.
8. Fish DE, Radfar-Baublitz L, Choi H, Felsenthal G. Correlation of standardized testing results with success on the 2001 American Board of Physical Medicine and Rehabilitation Part 1 Board Certificate Examination. Am J Phys Med Rehabil. 2003;82:686–691.
9. Dillon GF, Swanson DB, McClintock JC, Gravlee GP. The relationship between the American Board of Anesthesiology Part 1 Certification Examination and the United States Medical Licensing Examination. J Grad Med Educ. 2013;5:276–283.
10. Dougherty PJ, Walter N, Schilling P, Najibi S, Herkowitz H. Do scores of the USMLE Step 1 and OITE correlate with the ABOS Part I certifying examination?: A multicenter study. Clin Orthop Relat Res. 2010;468:2797–2802.
11. Klein GR, Austin MS, Randolph S, Sharkey PF, Hilibrand AS. Passing the Boards: Can USMLE and orthopaedic in-training examination scores predict passage of the ABOS Part-I examination? J Bone Joint Surg Am. 2004;86:1092–1095.
12. McCaskill QE, Kirk JJ, Barata DM, Wludyka PS, Zenni EA, Chiu TT. USMLE Step 1 scores as a significant predictor of future board passage in pediatrics. Ambul Pediatr. 2007;7:192–195.
13. Kay C, Jackson JL, Frank M. The relationship between internal medicine residency graduate performance on the ABIM certifying examination, yearly in-service training examinations, and the USMLE Step 1 examination. Acad Med. 2015;90:100–104.
14. Maker VK, Zahedi MM, Villines D, Maker AV. Can we predict which residents are going to pass/fail the oral boards? J Surg Educ. 2012;69:705–713.
15. de Virgilio C, Yaghoubian A, Kaji A, et al. Predicting performance on the American Board of Surgery qualifying and certifying examinations: A multi-institutional study. Arch Surg. 2010;145:852–856.
16. Swanson DB, Sawhill A, Holtzman KZ, et al. Relationship between performance on Part I of the American Board of Orthopaedic Surgery Certifying Examination and scores on USMLE Steps 1 and 2. Acad Med. 2009;84(10 suppl):S21–S24.
17. Jones AT, Biester TW, Buyske J, Lewis FR, Malangoni MA. Using the American Board of Surgery In-Training Examination to predict board certification: A cautionary study. J Surg Educ. 2014;71:e144–e148.
18. Althouse LA, McGuinness GA. The in-training examination: An analysis of its predictive value on performance on the general pediatrics certification examination. J Pediatr. 2008;153:425–428.
19. Withiam-Leitch M, Olawaiye A. Resident performance on the in-training and board examinations in obstetrics and gynecology: Implications for the ACGME Outcome Project. Teach Learn Med. 2008;20:136–142.
20. Grabovsky I, Hess BJ, Haist SA, et al. The relationship between performance on the Infectious Diseases In-Training and Certification Examinations. Clin Infect Dis. 2015;60:677–683.
21. Leigh TM, Johnson TP, Pisacano NJ. Predictive validity of the American Board of Family Practice In-Training Examination. Acad Med. 1990;65:454–457.
22. O’Neill TR, Li Z, Peabody MR, Lybarger M, Royal K, Puffer JC. The predictive validity of the ABFM’s In-Training Examination. Fam Med. 2015;47:349–356.
23. Mainous AG 3rd, Fang B, Peterson LE. Competency assessment in family medicine residency: Observations, knowledge-based examinations, and advancement. J Grad Med Educ. 2017;9:730–734.
24. Prober CG, Kolars JC, First LR, Melnick DE. A plea to reassess the role of United States Medical Licensing Examination Step 1 scores in residency selection. Acad Med. 2016;91:12–15.
25. Sharp LK, Bashook PG, Lipsky MS, Horowitz SD, Miller SH. Specialty board certification and clinical outcomes: The missing link. Acad Med. 2002;77:534–542.
26. Silber JH, Kennedy SK, Even-Shoshan O, et al. Anesthesiologist board certification and patient outcomes. Anesthesiology. 2002;96:1044–1052.
27. Lipner RS, Hess BJ, Phillips RL Jr. Specialty board certification in the United States: Issues and evidence. J Contin Educ Health Prof. 2013;33(suppl 1):S20–S35.
28. Reid RO, Friedberg MW, Adams JL, McGlynn EA, Mehrotra A. Associations between physician characteristics and quality of care. Arch Intern Med. 2010;170:1442–1449.
29. Tanaka P, Adriano A, Ngai L, et al. Development of an objective structured clinical examination using the American Board of Anesthesiology content outline for the objective structured clinical examination component of the APPLIED Certification Examination. A A Pract. 2018;11:193–197.
30. Triola MM, Hawkins RE, Skochelak SE. The time is now: Using graduates’ practice data to drive medical education reform. Acad Med. 2018;93:826–828.
31. Peterson L, Carek P, Holmboe E, Puffer J, Warm E, Phillips R. Medical specialty boards can help measure graduate medical education outcomes. Acad Med. 2014;89:840–842.