Research Reports

Correlations Between the USMLE Step Examinations, American College of Physicians In-Training Examination, and ABIM Internal Medicine Certification Examination

McDonald, Furman S. MD, MPH; Jurich, Daniel PhD; Duhigg, Lauren M. MPH; Paniagua, Miguel MD; Chick, Davoren MD; Wells, Margaret; Williams, Amber MSEd; Alguire, Patrick MD

Author Information
doi: 10.1097/ACM.0000000000003382

Medical knowledge, while insufficient to fully characterize a physician, is essential for appropriate patient care. Medical knowledge is assessed along the continuum of medical education using nationally standardized tests: before medical school with the Medical College Admission Test,1 during medical school with the United States Medical Licensing Examination (USMLE) Step examinations,2 during graduate medical education (GME) with in-training examinations, and after training with board certification examinations. For internal medicine (IM), the American College of Physicians (ACP) administers the ACP Internal Medicine In-Training Examination (IM-ITE)3 to help residents assess knowledge gaps and to help program directors (PDs) evaluate residency curricula. The IM-ITE is widely used, with over 74% of U.S. IM residents taking the examination. For many years, PDs have used the IM-ITE to provide residents with objective feedback on their medical knowledge for personal learning and to identify areas for improvement within residency programs to maximize the chances of all residents passing the American Board of Internal Medicine (ABIM) Internal Medicine Certification Exam (IM-CE),4 which has high stakes for both residents and training programs. For example, although ABIM certification is voluntary, many employers seek board-certified physicians, and income for board-certified physicians is, on average, 18% higher than for noncertified physicians.5 Additionally, residency program accreditation is affected by the percentage of their residents who pass the IM-CE.6 More importantly, increasing medical knowledge as assessed by the IM-CE has been associated with better patient-relevant outcomes,7 including improved process of care measures,8–11 decreased cost without decreased quality of care,12 decreased medical licensing board disciplinary actions,13–15 and decreased mortality.16,17

In several IM subspecialties, correlations of subspecialty in-training examinations with ABIM subspecialty certification examinations have been assessed.18–22 This connection with the IM-CE is so important that the IM-ITE “… is modeled after the American Board of Internal Medicine’s certification exam,”3 and PDs have long sought to better understand how IM-ITE performance might predict IM-CE performance. However, most correlation studies have been small and limited to one or a few training programs.23–30 Furthermore, many studies were completed decades ago, while the USMLEs, IM-ITE, and IM-CE have continued to evolve; for example, all of these examinations are now computerized and medical knowledge has advanced considerably.

While the USMLEs and IM-ITE have been studied widely in relation to other training variables31–37 and the IM-CE is associated with physician5 and patient-relevant outcomes,7–17 quantifying the correlations between these assessments has not been done with comprehensive, generalizable, national data. Thus, we sought to assess the correlations between USMLE performance, IM-ITE performance, IM-CE performance, and other medical knowledge and demographic variables, using comprehensive, generalizable, national data.

Method

Participants

Our study included postgraduate year (PGY)-1, PGY-2, and PGY-3 residents, from all Accreditation Council for Graduate Medical Education (ACGME)–accredited IM residency programs, who completed the IM-ITE in the fall of 2014 or the fall of 2015. We chose these cohorts because the IM-ITE transitioned to computer-based testing in 2014 and because most of these residents had time to complete the IM-CE (between 2015 and 2018) before our analyses. Because residents taking the IM-ITE in 2014 might also have taken it in 2015, when they were likely at an advanced PGY level, we treated each PGY level as its own independent analysis. Although percent correct scores are not equated across annual administrations, and thus are potentially influenced by changes in form difficulty across the 2 years, the IM-ITE forms are carefully constructed to be equivalent in content and difficulty across years. Supporting the aggregation of data, the logistic relationships between IM-ITE percent correct scores and IM-CE pass/fail status were nearly equivalent for each administration year independently, as shown in Supplemental Digital Appendix 1 (at https://links.lww.com/ACADMED/A892). We excluded residents without complete data for each variable in the study (see below).

Figure 1 presents an overview of the participants and missing data counts. The final sample contained 9,676 PGY-1, 11,424 PGY-2, and 10,239 PGY-3 residents. The PGY-1, PGY-2, and PGY-3 cohorts had IM-CE first-attempt pass rates of 94%, 92%, and 91%, respectively.

Figure 1:
Selection of study participants from among internal medicine postgraduate year (PGY)-1, PGY-2, or PGY-3 residents, from all Accreditation Council for Graduate Medical Education–accredited internal medicine residency programs, who took the American College of Physicians Internal Medicine In-Training Examination (IM-ITE) in 2014 or 2015. Abbreviations: ABIM, American Board of Internal Medicine; IM-CE, Internal Medicine Certification Exam; USMLE, United States Medical Licensing Examination; CK, Clinical Knowledge; MK, medical knowledge.

Data sources

We collected, deidentified, and merged variables from 3 organizations: the National Board of Medical Examiners (USMLE Step 1, Step 2 Clinical Knowledge [CK], and Step 3 scores), the ACP (IM-ITE scores, gender, and medical school location), and the ABIM (medical knowledge milestone ratings, date of birth, and IM-CE scores and pass/fail status).

The USMLE includes 3 steps assessing the ability to apply the knowledge, concepts, and principles that constitute the basis of safe and effective patient care.2 This study included these 3 steps—Step 1, Step 2 CK, and Step 3—all of which provided numeric scores. Examinees typically begin the USMLE examination process in the second or third year of medical school and finish the series during residency. Each examination contains approximately 300 multiple-choice questions (MCQs) administered via a secure computer-based system. Reliability coefficients consistently exceed 0.87 across administrations. Total scores range from 0 to 300, standardized to a mean of 200 and standard deviation (SD) of 20, based on the respective reference group for each examination.2

The IM-ITE is developed by an expert physician committee using the IM-CE blueprint to inform its content. The examination contains 300 single best answer MCQs, with reliability coefficients typically exceeding 0.89 across yearly administrations. Feedback reports are sent to PDs and residents. We used percent correct scores (range: 0–100) for all analyses of IM-ITE scores.

The IM-CE was developed to assess knowledge, diagnostic reasoning, and clinical judgment expected of internists leaving training ready for the unsupervised practice of medicine. This 240-MCQ examination is administered via a secure computer-based format to candidates who have successfully completed all 3 years of IM residency. Overall scores are equated and reported on a standard scale with a mean of 500, SD of 100, and range of 200–800.38 Reliability coefficients consistently meet or exceed 0.89 across administrations.

The ACGME and ABIM IM milestones standardize feedback for residents participating in ACGME-accredited IM residencies, enabling assessment of resident progress toward competency using criterion-based milestones along a developmental continuum.39 Within the milestones framework, scores are assigned on a scale of 1–5 with 9 possible ratings (1, 1.5, 2, … 5). Numeric milestone scores are typically interpreted in 5 categories: 1 = critical deficiencies, 2 = early learner, 3 = demonstrating improvement, 4 = ready for unsupervised practice, and 5 = aspirational. As in Hauer and colleagues,40 we used the average of each resident’s 2 medical knowledge milestone ratings—clinical knowledge and knowledge of diagnostic tests and procedures—to assess associations between milestone ratings of medical knowledge and other variables (see below).

Statistical analyses

We evaluated the strength of associations between USMLE scores, IM-ITE percent correct scores, and continuous IM-CE scores relative to other predictor variables via multiple linear regression. For the full model, we regressed first-attempt IM-CE scores on the following predictor variables: IM-ITE scores; USMLE Step 1, Step 2 CK, and Step 3 scores; averaged medical knowledge milestone ratings; age at IM-ITE; gender; and medical school location (United States or Canada vs international). We retained all variables in the models to evaluate their relative predictive utility. We characterized overall model utility with the coefficient of determination (R2). We used the Pratt index,41,42 which indicates the proportion of the model R2 value uniquely attributable to each individual predictor, to evaluate the relative contribution of each predictor.
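
As an illustration of this approach, the sketch below fits the full linear model and computes the Pratt index for one predictor in base R. The data frame and variable names (dat, imce_score, imite_pct, and so on) are hypothetical placeholders for illustration, not the study's actual data set.

  # Full linear model of first-attempt IM-CE score on the predictors described above
  fit <- lm(imce_score ~ imite_pct + step1 + step2ck + step3 +
              mk_milestone + age + gender + med_school_loc, data = dat)
  r2 <- summary(fit)$r.squared   # overall coefficient of determination

  # Pratt index for IM-ITE: standardized coefficient times the zero-order
  # correlation with the outcome, expressed as a share of the model R^2
  beta_std <- coef(fit)["imite_pct"] * sd(dat$imite_pct) / sd(dat$imce_score)
  r_zero   <- cor(dat$imite_pct, dat$imce_score)
  pratt_imite <- beta_std * r_zero / r2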

We assessed associations of predictor variables with categorical IM-CE pass/fail status using multiple logistic regression, retaining all variables in the model. We evaluated the overall model using Nagelkerke R2 and classification accuracy.
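
A minimal sketch of how a Nagelkerke R2 and classification accuracy can be computed for such a model in base R is shown below; the data frame and variable names are again assumed, and the 0.5 predicted-probability cutoff is an assumption rather than a detail reported in the study.

  # Full logistic model of first-attempt IM-CE pass/fail status (1 = pass, 0 = fail)
  fit_full <- glm(passed ~ imite_pct + step1 + step2ck + step3 +
                    mk_milestone + age + gender + med_school_loc,
                  family = binomial, data = dat)
  fit_null <- glm(passed ~ 1, family = binomial, data = dat)

  # Nagelkerke R^2: Cox & Snell R^2 rescaled to a 0-1 range
  n <- nrow(dat)
  cox_snell  <- 1 - exp((2 / n) * (as.numeric(logLik(fit_null)) - as.numeric(logLik(fit_full))))
  nagelkerke <- cox_snell / (1 - exp((2 / n) * as.numeric(logLik(fit_null))))

  # Classification accuracy at the assumed 0.5 cutoff
  predicted <- as.integer(predict(fit_full, type = "response") >= 0.5)
  accuracy  <- mean(predicted == dat$passed)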

For both regressions, we rescaled the unstandardized regression coefficients of the continuous predictor variables to reflect a 0.5 SD change in each predictor, allowing comparison of results across variables measured on different scales.
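
For example, under the assumed variable names used in the sketches above, this rescaling amounts to multiplying each unstandardized coefficient by 0.5 SD of its predictor and, for the logistic model, exponentiating the rescaled log-odds coefficient to obtain an odds ratio:

  # Linear model: expected change in IM-CE score per 0.5 SD increase in IM-ITE percent correct
  b_half_sd <- coef(fit)["imite_pct"] * 0.5 * sd(dat$imite_pct)

  # Logistic model: odds ratio for passing the IM-CE per 0.5 SD increase in IM-ITE percent correct
  or_half_sd <- exp(coef(fit_full)["imite_pct"] * 0.5 * sd(dat$imite_pct))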

We also conducted a logistic regression of IM-CE pass/fail status on IM-ITE score alone, focusing on predictive utility to facilitate valid formative interpretations of IM-ITE scores. To reduce the inflated accuracy obtained when evaluating model performance on the same sample used to estimate the model, we randomly split the data evenly into training and testing samples for each PGY. The training sample served to estimate logistic regression parameters. We applied these parameters to the testing sample to evaluate classification and predictive accuracy. We evaluated results in the testing sample via 4 common classification statistics: specificity, sensitivity, positive predictive value, and negative predictive value.43 We quantified the proportion of residents whose pass/fail status was predicted correctly by the regression model with an overall classification accuracy index.
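
The split-sample evaluation described above might look like the following in base R; the even split, the seed, the 0.5 cutoff, and the variable names are all illustrative assumptions rather than details taken from the study.

  set.seed(2020)  # assumed seed, for reproducibility of the illustration only
  idx   <- sample(seq_len(nrow(dat)), size = floor(nrow(dat) / 2))
  train <- dat[idx, ]
  test  <- dat[-idx, ]

  # Estimate the single-predictor model on the training sample
  fit_ite <- glm(passed ~ imite_pct, family = binomial, data = train)

  # Apply the estimated parameters to the testing sample
  p_hat <- predict(fit_ite, newdata = test, type = "response")
  pred  <- as.integer(p_hat >= 0.5)

  tp <- sum(pred == 1 & test$passed == 1); tn <- sum(pred == 0 & test$passed == 0)
  fp <- sum(pred == 1 & test$passed == 0); fn <- sum(pred == 0 & test$passed == 1)

  sensitivity <- tp / (tp + fn)       # correctly predicted among residents who passed
  specificity <- tn / (tn + fp)       # correctly predicted among residents who did not pass
  ppv <- tp / (tp + fp)               # positive predictive value
  npv <- tn / (tn + fn)               # negative predictive value
  accuracy <- (tp + tn) / nrow(test)  # overall classification accuracy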

We conducted all analyses in R version 3.5.0 (The R Foundation, Vienna, Austria).

The institutional review board of the American Institutes for Research approved this study.

Results

Table 1 contains descriptive statistics for variables examined by each PGY. As expected, the IM-ITE scores and medical knowledge milestones ratings increased across PGYs, with the largest increases coming between PGY-1 and PGY-2 for the former and between PGY-2 and PGY-3 for the latter. The PGY-1 and PGY-2 cohorts had slightly higher USMLE scores and performed better on the IM-CE than the PGY-3 cohort.

Table 1:
Descriptive Statistics for Study Variables Based on Population of Residents, From All ACGME-Accredited IM Residency Programs, Who Took the IM-ITE in 2014 or 2015

Table 2 presents results from the multiple linear regression model. Because trends were consistent across the 3 PGYs, we present only PGY-3 results. All variables were statistically significant predictors of IM-CE score. The PGY-3 model explained the most variation (51%, i.e., adjusted R2 = 0.51) in IM-CE scores, as compared with the PGY-2 (48%) and PGY-1 (45%) models. (Results from PGY-1 and PGY-2 models are given in Supplemental Digital Appendix 2 at https://links.lww.com/ACADMED/A892.) Across all PGYs, IM-ITE score was the variable with the strongest relationship to IM-CE scores. For example, for PGY-3, over half the explained variance (53%) in IM-CE scores was attributable to IM-ITE scores. The rescaled coefficients estimated that a 0.5 SD increase in IM-ITE percent correct score (about 4 percentage points) was associated with a 17.00 (95% confidence interval [CI]: 16.23–17.77) point increase in IM-CE score. USMLE Step scores were the next strongest predictors in terms of relative contribution, with Step 2 CK scores showing slightly stronger associations with IM-CE scores than the other 2 Step examinations and all 3 combined explaining 42% of the explained variance. Averaged medical knowledge milestone ratings accounted for only 3% of explained variation in IM-CE scores. Gender and medical school location explained almost no variation in IM-CE scores after accounting for standardized assessments of medical knowledge.

Table 2:
Multiple Linear Regression Coefficients of Predictor Variables With IM-CE Scores for 10,239 PGY-3 Residents, From All ACGME-Accredited IM Residency Programs, Who Took the IM-ITE in 2014 or 2015a

Table 3 demonstrates the multiple logistic regression for PGY-3 (see Supplemental Digital Appendix 3 at https://links.lww.com/ACADMED/A892 for PGY-1 and PGY-2, which showed trends similar to those for PGY-3). All variables were statistically significant predictors of passing the IM-CE, with IM-ITE scores having the strongest association. The odds of passing the IM-CE increased by a factor of 1.80 (95% CI: 1.71–1.91) for each 0.5 SD increase in IM-ITE score. Among continuous variables, Step 2 CK had the second strongest association with passing the IM-CE, with the odds of passing increasing by a factor of 1.23 (95% CI: 1.16–1.30) for each 0.5 SD increase in Step 2 CK score. The full model yielded a Nagelkerke R2 of 0.40 with overall classification accuracy of 92%.

Table 3:
Multiple Logistic Regression Coefficients of Predictor Variables With IM-CE Pass/Fail Status for 10,239 PGY-3 Residents, From All ACGME-Accredited IM Residency Programs, Who Took the IM-ITE in 2014 or 2015a

Figure 2 displays the predicted probability of passing the IM-CE based solely on IM-ITE score for each PGY. Given increasing mean IM-ITE scores in each sequential year, residents must score higher on the IM-ITE with each subsequent administration to maintain the same estimated probability of passing the IM-CE as they progress through residency. For example, PGY-1 residents scoring approximately 39% on the IM-ITE have an estimated 50% probability of passing the IM-CE. In contrast, PGY-3 residents scoring 39% on the IM-ITE have an estimated 3% probability of passing the IM-CE and would need to score approximately 55% on the IM-ITE to have an estimated 50% probability of passing the IM-CE.

Figure 2:
Predicted probability of passing the American Board of Internal Medicine Internal Medicine Certification Examination (IM-CE) given a certain American College of Physicians Internal Medicine In-Training Examination (IM-ITE) score by postgraduate year (PGY), from a study of residents, from all Accreditation Council for Graduate Medical Education–accredited internal medicine residency programs, who took the IM-ITE in 2014 or 2015. Shaded gray areas represent the 95% confidence interval for predicted values.
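
To make the reading of Figure 2 concrete, the PGY-3 curve can be roughly reconstructed from the 2 points quoted above (an estimated 3% pass probability at an IM-ITE score of 39% and a 50% pass probability at about 55%). The intercept and slope below are back-calculated from those 2 points for illustration only and are not the published model estimates.

  # Approximate PGY-3 logistic curve reconstructed from the 2 quoted points (illustration only)
  b1 <- (qlogis(0.50) - qlogis(0.03)) / (55 - 39)  # assumed slope per IM-ITE percentage point
  b0 <- qlogis(0.50) - b1 * 55                     # assumed intercept
  p_pass_at_60 <- plogis(b0 + b1 * 60)             # approximate pass probability at an IM-ITE score of 60%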

Statistics evaluating model quality, such as R2 and classification accuracy, decreased only trivially when applying the training model to the testing sample. This result indicates high stability among samples and that the models produced by the training sample generalize to the testing sample. Classification statistics for passing the IM-CE at different IM-ITE score thresholds for each PGY are presented in Supplemental Digital Appendix 4 (at https://links.lww.com/ACADMED/A892). Receiver operating characteristic curves for the single-predictor (IM-ITE) model, obtained by applying the parameters estimated in the training sample to the testing sample, are presented in Supplemental Digital Appendix 5 (at https://links.lww.com/ACADMED/A892); the areas under the curve were 80%, 85%, and 86% for PGY-1, PGY-2, and PGY-3 residents, respectively.
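
For reference, an area under the curve of this kind can be computed in base R without additional packages via the rank-sum (Mann-Whitney) identity; the sketch below continues the assumed objects (p_hat, test) from the Method illustration and is not the study's actual computation.

  # AUC via the rank-sum identity: ranks of predicted probabilities in the testing sample
  r  <- rank(p_hat)
  n1 <- sum(test$passed == 1)   # number of residents who passed
  n0 <- sum(test$passed == 0)   # number of residents who did not pass
  auc <- (sum(r[test$passed == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)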

Discussion

We present the largest study of IM residents examining correlations between USMLE, IM-ITE, and IM-CE scores and other medical knowledge and demographic variables. Individuals demonstrating more medical knowledge via these assessments were more likely to score higher on and pass the IM-CE. This is not surprising, as prior performance has often been shown to predict future performance on medical knowledge assessments.34,35,37,40,44,45 However, our study provides highly reliable quantitative evidence of associations among 4 of the most important assessments of medical knowledge from medical school through residency and beyond (i.e., USMLEs, IM-ITE, milestones, and IM-CE). These findings should support residents and PDs in their efforts to more precisely identify and evaluate knowledge gaps for both personal learning and program improvement.

Related to IM-CE performance, IM-ITE score was the strongest predictor in both absolute magnitude and relative importance, as demonstrated by both the multiple linear and the multiple logistic regression models. IM-ITE scores increased with each successive PGY, and the PGY-3 score was the most predictive of future IM-CE performance. However, because IM-ITE performance during each PGY is predictive of future IM-CE performance, these data give PDs powerful evidence to demonstrate to underperforming residents that proactive identification and remediation of knowledge gaps is important to their likelihood of success on the IM-CE. More importantly, by extension, this is relevant to their patients, given the robust evidence associating IM-CE performance with improved patient-relevant outcomes.7–17 To promote improvement, the IM-ITE provides detailed feedback reports outlining educational objectives related to the questions missed on the examination, which could be used as a basis for individualized learning plans. Learning progress can thus be tracked through residency with subsequent IM-ITE administrations.

Averaged medical knowledge milestone ratings, while significantly associated with IM-CE performance in this study (and as shown previously40,44), have a limited range of values. So, for example, as most PGY-3 residents are rated level 4 (ready for unsupervised practice), little can be inferred about future IM-CE performance from this rating. However, should a PGY-3 resident be rated above or below level 4, he/she is likely to perform better or worse, respectively, than most of his/her peers on the IM-CE.

While no individual USMLE Step score was as strongly predictive of IM-CE score as IM-ITE score, the combined relative contribution of all 3 USMLE Step scores (42%) was of a magnitude similar to that of IM-ITE score (53%), suggesting that USMLE use in residency recruitment decisions is not completely unjustified insofar as PDs seek to identify candidates who have more medical knowledge and are capable of knowledge gains.46–50 Further, the predictive ability of the USMLE Steps may be underestimated in our study because programs may not have selected applicants who scored poorly on the USMLE, thus restricting the lower range of USMLE scores in this study.

Gender and medical school location were significantly associated with IM-CE performance but only minimally contributed to the overall predictive ability when accounting for other standardized assessments of medical knowledge.

Prior IM-ITE studies have varied in the use of score (i.e., percent correct) versus percentile for assessment of associations with IM-CE performance.37 For instance, a widely quoted study implied that a second-year resident whose IM-ITE performance was in the 35th percentile had an 89% chance of passing the IM-CE.24 However, this framing is concerning because it compares a within-group relative measure (IM-ITE percentile) to an assessment with an absolute passing standard (IM-CE).38 Because a given percentile corresponds to different absolute levels of knowledge in different cohorts, such a result would not be generalizable to current IM residents and would not be applicable from one year to the next, even without changes in testing methodology and medical knowledge in ensuing years.

We recommend PDs focus on IM-ITE percent correct score for each year of training to predict future IM-CE performance, rather than on IM-ITE percentile. Figure 2 demonstrates the association of IM-ITE percent correct score with probability of passing the IM-CE so PDs can guide residents with a high degree of accuracy and predictive validity. While the IM-ITE and IM-CE are created independently and items on one are not equated to the other, the stability of the relationship between IM-ITE scores, not percentiles, and passing the IM-CE has been demonstrated across years of testing (see Supplemental Digital Appendix 1 at https://links.lww.com/ACADMED/A892), making this the most reliable approach for PDs to use when counseling residents on their IM-CE performance probabilities.

Our study has limitations. Because we excluded residents who did not have all variables available for analysis, we particularly excluded many residents trained in osteopathic medical schools, who took the Comprehensive Osteopathic Medical Licensing Examination of the United States (COMLEX-USA)51 rather than the USMLE Step examinations. This resulted in excluding 1,694 DO residents from our analysis, equivalent to about 5% of the total included study population. Given the advent of the Single GME Accreditation System,52 which began in 2014 with an agreement between the ACGME and American Osteopathic Association to unify GME accreditation after a transition period from July 2015 to June 2020, more residents with osteopathic training will soon enter the pipeline for assessment using the IM-ITE and IM-CE. Given the relatively small numbers of osteopathic IM residents who had participated in this transition and taken the IM-CE during the years of our study (24 in 2016, 84 in 2017, and 333 in 2018), our overall results were unlikely to have been affected by their exclusion. However, it will be appropriate to analyze this important and growing group of residents in ACGME-accredited IM programs in the future, after the Single GME Accreditation System transition is completed.

Another limitation, or perhaps caution, relates to the interpretation of the model for IM-CE pass/fail predictions: While our model has very high classification accuracy (92%) for passing the IM-CE on the first attempt, the first-taker pass rate for all residents was 91%. Thus, a model without adjustment for any predictor variables, one simply assuming that all residents would pass the IM-CE on the first attempt, would yield a classification accuracy of 91%.

Finally, there are undoubtedly other components of the acquisition and demonstration of medical knowledge that were not included in our study. For instance, our model explained 51% of observed variance in IM-CE scores for PGY-3s, which is remarkably high for medical education studies. Within our model, IM-ITE scores accounted for 53% of the explained, not the observed, variance. So, 49% of the total observed variance in IM-CE scores was not explained by our model, and IM-ITE scores thus account for only 27% of the total observed variance (0.51 × 0.53). Given this, the predictive ability of IM-ITE scores for IM-CE scores is far from perfect. PDs and residents cannot rely solely on IM-ITE scores to assess readiness to pass the IM-CE; that is, some residents predicted to fail the IM-CE based on their IM-ITE score will pass and some residents predicted to pass it will fail. This is one reason that the IM-ITE should not be used for high-stakes decisions and that the ACP has recommended against its use for resident performance evaluation.3

However, if used as intended, the IM-ITE can be a powerful tool for low-stakes decisions, supporting overall knowledge acquisition in residency when combined with other evidence-based methods. For instance, a systematic, frequent (e.g., daily), modest (e.g., as little as 20 minutes per day), intentional approach to medical knowledge acquisition has been shown to be associated with significant gains in medical knowledge over time.37 Thus, residents and PDs may well use the IM-ITE for global knowledge assessment, which, combined with program-specific assessments, could identify knowledge gaps and inform individualized learning plans to help close those gaps through intentional study.

In conclusion, our study demonstrates IM-ITE scores are the strongest known predictor of IM-CE performance and, by extension, may link the USMLEs and IM-ITE with the patient-relevant outcomes that are associated with the IM-CE.7–17 We believe our study gives residents and PDs the best information currently available to apply the USMLEs and IM-ITE as tools for developing evidence-based individualized learning plans. This is important because with appropriate guidance, physicians may gain medical knowledge throughout the course of their IM residency that will be of benefit to their future careers and, more importantly, to their future patients.

References

1. Association of American Medical Colleges. About the MCAT exam. https://students-residents.aamc.org/applying-medical-school/taking-mcat-exam/about-mcat-exam. Accessed March 4, 2020.
2. Federation of State Medical Boards, National Board of Medical Examiners. United States Medical Licensing Examination 2019 Bulletin of Information. https://www.usmle.org/pdfs/bulletin/2019bulletin.pdf. Published 2018. Accessed March 4, 2020.
3. American College of Physicians. IM-ITE. https://www.acponline.org/featured-products/medical-educator-resources/im-ite. Accessed March 4, 2020.
4. American Board of Internal Medicine. Internal Medicine Certification exam. https://www.abim.org/certification/exam-information/internal-medicine/exam-content.aspx. Accessed March 4, 2020.
5. Gray B, Reschovsky J, Holmboe E, Lipner R. Do early career indicators of clinical skill predict subsequent career outcomes and practice characteristics for general internists? Health Serv Res. 2013;48:1096–1115.
6. Willett LL, Halvorsen AJ, Adams M, et al. Factors associated with declining residency program pass rates on the ABIM certification examination. Am J Med. 2016;129:759–765.
7. Lipner RS, Hess BJ, Phillips RL Jr. Specialty board certification in the United States: Issues and evidence. J Contin Educ Health Prof. 2013;33(suppl 1):S20–S35.
8. Pham HH, Schrag D, Hargraves JL, Bach PB. Delivery of preventive services to older adults by primary care physicians. JAMA. 2005;294:473–481.
9. Turchin A, Shubina M, Chodos AH, Einbinder JS, Pendergrass ML. Effect of board certification on antihypertensive treatment intensification in patients with diabetes mellitus. Circulation. 2008;117:623–628.
10. Holmboe ES, Weng W, Arnold GK, et al. The comprehensive care project: Measuring physician performance in ambulatory practice. Health Serv Res. 2010;45(6 Pt 2):1912–1933.
11. Reid RO, Friedberg MW, Adams JL, McGlynn EA, Mehrotra A. Associations between physician characteristics and quality of care. Arch Intern Med. 2010;170:1442–1449.
12. Sirovich BE, Lipner RS, Johnston M, Holmboe ES. The association between residency training and internists’ ability to practice conservatively. JAMA Intern Med. 2014;174:1640–1648.
13. Lipner RS, Young A, Chaudhry HJ, Duhigg LM, Papadakis MA. Specialty certification status, performance ratings, and disciplinary actions of internal medicine residents. Acad Med. 2016;91:376–381.
14. Papadakis MA, Arnold GK, Blank LL, Holmboe ES, Lipner RS. Performance during internal medicine residency training and subsequent disciplinary action by state licensing boards. Ann Intern Med. 2008;148:869–876.
15. Khaliq AA, Dimassi H, Huang CY, Narine L, Smego RA Jr. Disciplinary action against physicians: Who is likely to get disciplined? Am J Med. 2005;118:773–777.
16. Norcini JJ, Lipner RS, Kimball HR. Certifying examination performance and patient outcomes following acute myocardial infarction. Med Educ. 2002;36:853–859.
17. Norcini JJ, Kimball HR, Lipner RS. Certification and specialization: Do they matter in the outcome of acute myocardial infarction? Acad Med. 2000;75:1193–1198.
18. Jurich D, Duhigg LM, Plumb TJ, et al. Performance on the nephrology in-training examination and ABIM nephrology certification examination outcomes. Clin J Am Soc Nephrol. 2018;13:710–717.
19. Indik J, Duhigg L, McDonald F, et al. ACC In-Training Examination predicts outcomes on the ABIM certification examination. J Am Coll Cardiol. 2017;69:2862–2868.
20. Lohr KM, Clauser A, Hess BJ, et al. Relationship between performance on the rheumatology in-training and certification examinations. Arthritis Rheumatol. 2015;67:3082–3090.
21. Grabovsky I, Hess BJ, Haist SA, et al. The relationship between performance on the infectious diseases in-training and certification examinations. Clin Infect Dis. 2015;60:677–683.
22. Collichio FA, Hess BJ, Muchmore EA, et al. Medical knowledge assessment by hematology and medical oncology in-training examinations are better than program director assessments at predicting subspecialty certification examination performance. J Cancer Educ. 2017;32:647–654.
23. Cantwell JD. The Mendoza Line and in-training examination scores. Ann Intern Med. 1993;119:541.
24. Grossman RS, Murata GH, Fincher RM, et al. Predicting performance on the American Board of Internal Medicine Certifying Examination: The effects of resident preparation and other factors. Crime study group. Acad Med. 1996;71(10 suppl):S74–S76.
25. Grossman RS, Fincher RM, Layne RD, Seelig CB, Berkowitz LR, Levine MA. Validity of the in-training examination for predicting American Board of Internal Medicine certifying examination scores. J Gen Intern Med. 1992;7:63–67.
26. Rollins LK, Martindale JR, Edmond M, Manser T, Scheld WM. Predicting pass rates on the American Board of Internal Medicine certifying examination. J Gen Intern Med. 1998;13:414–416.
27. Waxman H, Braunstein G, Dantzker D, et al. Performance on the internal medicine second-year residency in-training examination predicts the outcome of the ABIM certifying examination. J Gen Intern Med. 1994;9:692–694.
28. Brateanu A, Yu C, Kattan MW, Olender J, Nielsen C. A nomogram to predict the probability of passing the American Board of Internal Medicine examination. Med Educ Online. 2012;17:18810.
29. Babbott SF, Beasley BW, Hinchey KT, Blotzer JW, Holmboe ES. The predictive validity of the internal medicine in-training examination. Am J Med. 2007;120:735–740.
30. Kay C, Jackson JL, Frank M. The relationship between internal medicine residency graduate performance on the ABIM certifying examination, yearly in-service training examinations, and the USMLE Step 1 examination. Acad Med. 2015;90:100–104.
31. Desai SV, Asch DA, Bellini LM, et al.; iCOMPARE Research Group. Education outcomes in a duty-hour flexibility trial in internal medicine. N Engl J Med. 2018;378:1494–1508.
32. Mizuno A, Tsugawa Y, Shimizu T, et al. The impact of the hospital volume on the performance of residents on the General Medicine In-Training Examination: A multicenter study in Japan. Intern Med. 2016;55:1553–1558.
33. Nishizaki Y, Mizuno A, Shinozaki T, et al. Educational environment and the improvement in the General Medicine In-training Examination score. J Gen Fam Med. 2017;18:312–314.
34. McCoy CP, Stenerson MB, Halvorsen AJ, Homme JH, McDonald FS. Association of volume of patient encounters with residents’ in-training examination performance. J Gen Intern Med. 2013;28:1035–1041.
35. McDonald FS, Zeger SL, Kolars JC. Associations between United States Medical Licensing Examination (USMLE) and Internal Medicine In-Training Examination (IM-ITE) scores. J Gen Intern Med. 2008;23:1016–1019.
36. McDonald FS, Zeger SL, Kolars JC. Associations of conference attendance with internal medicine in-training examination scores. Mayo Clin Proc. 2008;83:449–453.
37. McDonald FS, Zeger SL, Kolars JC. Factors associated with medical knowledge acquisition during internal medicine residency. J Gen Intern Med. 2007;22:962–968.
38. American Board of Internal Medicine. About ABIM exams. https://www.abim.org/about/exam-information.aspx. Accessed March 4, 2020.
39. Accreditation Council for Graduate Medical Education, American Board of Internal Medicine. The Internal Medicine Milestone Project. https://acgme.org/acgmeweb/Portals/0/PDFs/Milestones/InternalMedicineMilestones.pdf. Published July 2015. Accessed March 4, 2020.
40. Hauer KE, Vandergrift J, Hess B, et al. Correlations between ratings on the resident annual evaluation summary and the internal medicine milestones and association with ABIM certification examination scores among US internal medicine residents, 2013-2014. JAMA. 2016;316:2253–2262.
41. Nimon KF, Oswald FL. Understanding the results of multiple linear regression: Beyond standardized regression coefficients. Organ Res Methods. 2013;16:650–674.
42. Thomas D, Hughes E, Zumbo B. On variable importance in linear regression. Soc Indic Res. 1998;45:253–275.
43. Zhou X-H, Obuchowski NA, McClish DK. Statistical Methods in Diagnostic Medicine. 1st ed. New York, NY: John Wiley & Sons; 2002.
44. Hauer KE, Vandergrift J, Lipner RS, Holmboe ES, Hood S, McDonald FS. National internal medicine milestone ratings: Validity evidence from longitudinal three-year follow-up. Acad Med. 2018;93:1189–1204.
45. Hauer KE, Clauser J, Lipner RS, et al. The internal medicine reporting milestones: Cross-sectional description of initial implementation in U.S. residency programs. Ann Intern Med. 2016;165:356–362.
46. Marcus-Blank B, Dahlke JA, Braman JP, et al. Predicting performance of first-year residents: Correlations between structured interview, licensure exam, and competency scores in a multi-institutional study. Acad Med. 2019;94:378–387.
47. Kanna B, Gu Y, Akhuetie J, Dimitrov V. Predicting performance using background characteristics of international medical graduates in an inner-city university-affiliated internal medicine residency training program. BMC Med Educ. 2009;9:42.
48. Liang M, Curtin LS, Signer MM, Savoia MC. Unmatched U.S. allopathic seniors in the 2015 main residency match: A study of applicant behavior, interview selection, and Match outcome. Acad Med. 2017;92:991–997.
49. Kenny S, McInnes M, Singh V. Associations between residency selection strategies and doctor performance: A meta-analysis. Med Educ. 2013;47:790–800.
50. Go PH, Klaassen Z, Chamberlain RS. Residency selection: Do the perceptions of US programme directors and applicants match? Med Educ. 2012;46:491–500.
51. National Board of Osteopathic Medical Examiners. COMLEX-USA. https://www.nbome.org/exams-assessments/comlex-usa. Accessed March 4, 2020.
52. Accreditation Council for Graduate Medical Education. Single GME accreditation system. https://acgme.org/What-We-Do/Accreditation/Single-GME-Accreditation-System. Accessed March 4, 2020.

Supplemental Digital Content

Copyright © 2020 by the Association of American Medical Colleges