Table 3 presents the bivariate correlations of the explanatory variables and USMLE scores. Correlation analysis between the composite (average) subject exam scores across all 6 clerkships and the USMLE Step 1 and 2 CK exams was also performed; the correlations with the USMLE Step 1 and Step 2 CK exams were quite strong (0.69 [P <.001] and 0.77 [P <.001], respectively). As shown, USMLE Step 1 scores had moderate-to-high positive correlations with all subject exam scores and with the cumulative GPA at the end of the second year, ranging from 0.46 (95% CI: 0.39, 0.53, P <.01) to 0.74 (95% CI: 0.70, 0.78, P <.01). USMLE Step 2 CK scores were also positively correlated with all explanatory variables, with correlations ranging from 0.51 (95% CI: 0.44, 0.57, P <.01) to 0.68 (95% CI: 0.63, 0.73, P <.01). In addition, NBME Clinical Subject Exam scores for the different clerkships were positively correlated with each other and with the cumulative second-year GPA and with the third-year GPA.
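As an illustration of how a correlation coefficient and its 95% confidence interval of the kind reported above can be obtained, the following is a minimal sketch using the Fisher z-transformation. It is not necessarily the method the authors used; the function names and the sample size of 500 are assumptions made for the example, since the sample size is not stated in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for r via the Fisher z-transformation.

    z_crit = 1.96 corresponds to a two-sided 95% interval.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical usage: r = 0.74 is taken from the text; n = 500 is an assumption.
low, high = fisher_ci(0.74, 500)
print(f"r = 0.74, approximate 95% CI: ({low:.2f}, {high:.2f})")
```

Under the assumed sample size, this yields an interval of about 0.70 to 0.78; a different sample size would widen or narrow it.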
The results of the stepwise linear regression modeling are provided in Chart 1. When entered into the regression model first, NBME Subject Exam scores in the primary care clerkships explained 44% of the variance in USMLE Step 1 scores. The subject exam scores for surgery, psychiatry, and obstetrics and gynecology explained an additional 5% of the variance. The second-year cumulative GPA was significantly associated with the Step 1 score after controlling for all NBME Subject Exam scores, adding 13% of explained variance to the model. The adjusted R2 of the final model was 0.62, indicating that the explanatory variables together were associated with 62% of the variance of the Step 1 score.
For the USMLE Step 2 CK score, 55% of the variance was explained by the subject scores in the primary care clerkships. The scores for the other three clerkships, when entered together as a block in the second step of the regression model, accounted for an additional 6% of variance. The cumulative second-year GPA explained only 1% additional variance beyond NBME scores, although there was still a significant association with the Step 2 CK score. For the regression model of the Step 2 CK score, the third-year GPA accounted for only 1% additional variance beyond NBME scores. The adjusted R2 of the final model was 0.61, indicating the explanatory variables together were associated with 61% of the variance of the Step 2 CK score.
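The block-wise entry described above (primary care subject exams first, then the remaining clerkships, then GPA) can be sketched as follows. This is an illustrative sketch on synthetic data, not the authors' analysis: the data-generating model, the variable names, and the helper functions are all assumptions, and the printed values will not match those in Chart 1.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of an ordinary least squares fit of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in predictors]
    k = len(cols)
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def blockwise_r2(blocks, y):
    """Enter blocks of predictors in order; return (name, cumulative R^2, increment)."""
    entered, results, previous = [], [], 0.0
    for name, columns in blocks:
        entered.extend(columns)
        r2 = r_squared(entered, y)
        results.append((name, r2, r2 - previous))
        previous = r2
    return results


# Synthetic data (an assumption): every score is a latent "ability" plus noise.
random.seed(0)
n_students = 300
ability = [random.gauss(0, 1) for _ in range(n_students)]


def noisy(scale):
    return [a + random.gauss(0, scale) for a in ability]


blocks = [
    ("primary care subject exams", [noisy(1.0) for _ in range(3)]),
    ("other subject exams", [noisy(1.0) for _ in range(3)]),
    ("second-year GPA", [noisy(0.8)]),
]
step_score = noisy(0.7)

for name, r2, delta in blockwise_r2(blocks, step_score):
    print(f"{name}: cumulative R^2 = {r2:.2f}, added = {delta:.2f}")

n_predictors = sum(len(columns) for _, columns in blocks)
final_r2 = blockwise_r2(blocks, step_score)[-1][1]
print(f"adjusted R^2 = {adjusted_r_squared(final_r2, n_students, n_predictors):.2f}")
```

Because each block is credited only with the variance not already explained by earlier blocks, the increments depend on the order of entry, and they sum to the final (unadjusted) R2.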
Discussion and Conclusions
Our results suggest that for our study sample, NBME Clinical Subject Examination scores in core clerkships were moderately-to-highly correlated with scores on the USMLE Step 1 and 2 CK exams. Our results also indicate that NBME Clinical Subject Exam scores in multiple clerkships correlated with one another. Considering that passing the USMLE is a necessary component for physicians to obtain licensure, and that a majority of medical schools across the country utilize the subject exams, we believe there is value in demonstrating a relationship between the USMLE exams and the subject exams. Indeed, our data would suggest the subject exams could be considered as “surrogate” exams for the USMLE Step 2 CK exam. The relatively high degree of correlation between the subject exams and the USMLE exams provides a degree of validation for using the subject exams to measure students’ medical knowledge. Furthermore, the relationship between multiple subject exam scores and measures of knowledge such as the mean GPA or scores on the USMLE Step examinations is valuable, since performance on a single subject examination may not be representative of a student’s overall knowledge base. This suggests that performance on subject exams across the breadth of clerkships would likely be more indicative of overall knowledge.
The limited published data regarding correlations between subject exams and USMLE exams are summarized in Table 4. As shown, correlations between subject exams and USMLE scores range from weak (r = 0.18) to strong (r = 0.66), depending on the discipline and which USMLE exam is considered. One might expect larger correlations between subject exam scores and the clinically oriented USMLE Step 2 CK exam.
These findings are consistent with the theory of context specificity; this is particularly true regarding the general range of correlations found between scores on individual subject exams and on the USMLE Steps, as well as the larger correlation indicated when we considered multiple subject examination scores using our composite variable (additional “contexts”). In general, based on the correlations shown in Table 4 and our own data, scores on subject exams correlated more highly with scores on the USMLE Step 2 CK exam than with scores on Step 1, although the magnitude of the differences in correlations was small.
However, the moderate correlations between subject exams and Step 1 scores may be more indicative of students’ academic aptitudes; those who perform well on Step 1 may also generally perform well on subject exams and on the Step 2 CK exam as a reflection of their own knowledge base and test-taking skills. This premise may be supported by the moderate-to-strong correlations between the second-year cumulative GPA and both the Step 1 and Step 2 CK exam scores (0.74 and 0.68 respectively), and between Step 1 and Step 2 CK scores reported by us (0.69) and others (0.67 to 0.78).3–5 This concept is further supported by the relatively strong correlation between the second-year cumulative GPA and the third-year clinical GPA (r = 0.66). It is not surprising that students who perform well in the earlier years of undergraduate training would continue to perform well in clinical rotations and on objective assessments during the clinical years. Additionally, there may be a component of common-method variance; that is, spurious variance attributable to the measurement method (i.e., a knowledge test) rather than to the constructs the measures represent (i.e., medical knowledge and skill).10
We also demonstrated moderate-to-high correlations between all subject exam scores and the third-year GPA, and between the third-year GPA and scores on both USMLE exams, particularly Step 2 CK. As hypothesized, the third-year GPA was more highly correlated with the Step 2 CK score than with the Step 1 score.
The regression analyses clarified how much each set of variables contributed. Given the moderate-to-high correlations between subject exam scores and USMLE scores, one might anticipate that variance in USMLE scores would be mostly accounted for by variance in subject exam scores. Indeed, we found that the subject exam scores and GPA were associated with a sizable proportion of the variance in USMLE Step 1 and Step 2 CK scores (62% and 61%, respectively). As hypothesized, primary care subject exam scores accounted for most of the explained variance in both Step 1 and Step 2 CK scores when entered into the regression model first; the further addition of the non-primary care scores (obstetrics and gynecology, surgery, and psychiatry) accounted for only an additional 5% of Step 1 and 6% of Step 2 CK variance. One possible explanation for why primary care subject exams explain a large proportion of the Step 2 CK variance is that primary care issues, while emphasized in primary care clerkships, are also addressed in non-primary care clerkships. As a result, primary care topics are covered to some degree across the entire clerkship year, and may be a significant component of the USMLE exam even if questions are specifically identified to address an obstetrics and gynecology or surgery topic, for example. Furthermore, many of the clinical scenarios considered during the basic science years are topics encountered in primary care, further reinforcing this content for students.
The second-year cumulative GPA also differed in its contribution to the variance in the Step exams, accounting for 13% additional variance for Step 1 but only 1% for Step 2 CK. Although the cumulative second-year GPA correlated with both Step exams, correlation with Step 1 was greater. Since the second-year GPA is reflective of a predominantly basic science curriculum, it might be expected to have a greater contribution to variance on Step 1, since Step 1 is a basic science-oriented exam.
The minimal contribution of the third-year GPA to the variance in Step 2 CK scores was, however, unexpected. We anticipated that the third-year GPA, reflective of the clinical clerkship year, would account for a sizeable proportion of variance in the Step 2 CK examination. It is possible that since the third-year GPAs are derived from individual clerkship final grades, which themselves are dependent on subject exam scores, the GPA in and of itself adds little to the effect of the subject exam scores.
Although the subject exams explained much of the variance in the USMLE exams, it is important to not over-interpret their contributions, as nearly 40% of variance was still unaccounted for. There are numerous other factors that could affect performance on the USMLE exam that were not measured in our study, such as test-taking strategies, fatigue, anxiety, and clerkship length and/or timing in the academic year. However, it is reassuring to know that subject exam performance does seem to be a reasonable predictor of USMLE exam outcomes.
The correlation of subject exam scores to Step exam performance could assist resident selection. Although a number of criteria may be used to select residents, performance on the Step exams, particularly Step 2 CK, is likely important, considering the necessity of passing Step 3 during residency training. However, Step 2 CK scores may not be available to program directors at the time of selection or ranking of student applicants. The fact that subject exams were correlated with Step 2 CK exam performance suggests that subject exam data could be useful to program directors in lieu of Step 2 CK exam scores in their consideration of applicant selection.
The correlation of the subject exam scores to the Step exams could also assist individual students in making educational plans. For those students having difficulty with one or more subject exams, a medical school may decide to offer, or mandate, specific study periods or formal preparation courses prior to taking the Step 2 CK exam. Theoretically, these proactive measures could decrease the potential for failure of the Step 2 CK exam.
There were several limitations to our study. Data were collected from a single institution; thus we cannot necessarily extrapolate our findings to other institutions. Additionally, our study addressed correlation and not causation. Although the variances found do lend themselves to estimates of prediction, only a prospective study would truly provide for analysis of predictive ability. We did not account for the effect of clerkship timing and/or length of clerkships on USMLE or subject exam scores. Published data are mixed as to the effect of these factors.5,6,11–25 Furthermore, it is important to note that the specific variables chosen, and the order in which they were entered into the regression model, likely affected the percentage of variance in USMLE examination scores attributed to each variable. However, in order to specifically address the objective of determining the variance accounted for by a combination of primary care subject exams, we elected to enter variables in a specific order.
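The dependence on order of entry noted above can be made concrete with a small two-predictor example. For standardized variables, the R2 of a two-predictor model follows directly from the three pairwise correlations. The correlation values below are hypothetical and chosen only for illustration; they are not taken from our data.

```python
def two_predictor_r2(r_y1, r_y2, r_12):
    """R^2 for y regressed on two predictors, from pairwise correlations.

    r_y1, r_y2: correlations of each predictor with the outcome.
    r_12: correlation between the two predictors (must not be +/-1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def increments(r_y_first, r_y_second, r_12):
    """Variance credited to each predictor when entered in the given order."""
    first = r_y_first ** 2                      # predictor entered first, alone
    total = two_predictor_r2(r_y_first, r_y_second, r_12)
    return first, total - first                 # second gets only what is left


# Hypothetical correlations (assumptions, not study results).
r_y_exam, r_y_gpa, r_exam_gpa = 0.7, 0.6, 0.5

exam_first = increments(r_y_exam, r_y_gpa, r_exam_gpa)
gpa_first = increments(r_y_gpa, r_y_exam, r_exam_gpa)

print("exam entered first: exam = %.3f, GPA adds = %.3f" % exam_first)
print("GPA entered first:  GPA = %.3f, exam adds = %.3f" % gpa_first)
```

The total explained variance is the same in both orders, but the share credited to each predictor changes, because variance shared by correlated predictors is assigned to whichever is entered first.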
Strengths of this study include the large number of students considered, subject examination scores across multiple clerkships for three years, the correlations of subject exam scores with each other as well as with scores of USMLE exams and the GPA, and the consideration of the contribution of subject exam scores to the variance in Step 1 and 2 CK exams.
In summary, our findings strongly suggest that NBME Subject Exams exhibit moderate-to-large positive correlations with USMLE Step 1 and Step 2 CK exam scores, and that subject exams explain considerable variance in performance on USMLE exams. Considering the importance of USMLE performance in the progression of a physician’s career, it is reassuring to know that subject exams appear to provide a reasonably valid estimate of USMLE performance during undergraduate medical training.
Other disclosures: None.
Ethical approval: This study was approved by the Institutional Review Board of the Uniformed Services University of the Health Sciences.
Disclaimer: The opinions expressed in this article are those of the authors alone and are not to be construed as official or reflecting the view of the Department of Defense or the Uniformed Services University of the Health Sciences.
2. Myles T, Galvez-Myles R. USMLE Step 1 and 2 scores correlate with family medicine clinical and examination scores. Fam Med. 2003;35:510–513
3. Ogunyemi D, Taylor-Harris D. Factors that correlate with the U.S. Medical Licensure Examination Step-2 scores in a diverse medical student population. J Natl Med Assoc. 2005;97:1258–1262
4. Myles TD, Henderson RC. Medical licensure examination scores: relationship to obstetrics and gynecology examination scores. Obstet Gynecol. 2002;100(5 Pt 1):955–958
5. Ripkey DR, Case SM, Swanson DB. Identifying students at risk for poor performance on the USMLE Step 2. Acad Med. 1999;74(10 Suppl):S45–S48
6. Ogunyemi D, De Taylor-Harris S. NBME Obstetrics and Gynecology clerkship final examination scores: predictive value of standardized tests and demographic factors. J Reprod Med. 2004;49:978–982
7. Armstrong A, Dahl C, Haffner W. Predictors of performance on the National Board of Medical Examiners obstetrics and gynecology subject examination. Obstet Gynecol. 1998;91:1021–1022
8. Myles TD. United States Medical Licensure Examination step 1 scores and obstetrics-gynecology clerkship final examination. Obstet Gynecol. 1999;94:1049–1051
9. Durning SJ, Artino AR, Boulet JR, Dorrance K, van der Vleuten C, Schuwirth L. The impact of selected contextual factors on experts’ clinical reasoning performance (does context impact clinical reasoning performance in experts?). Adv Health Sci Educ Theory Pract. 2012;17:65–79
10. Podsakoff PM, MacKenzie SB, Lee JY, Podsakoff NP. Common method biases in behavioral research: a critical review of the literature and recommended remedies. J Appl Psychol. 2003;88:879–903
11. Metheny WP, Holzman GB. Student performance on the NBME Part II subtest and subject examination in obstetrics-gynecology. J Med Educ. 1988;63:456–462
12. Ripkey DR, Case SM, Swanson DB. Predicting performances on the NBME Surgery Subject Test and USMLE Step 2: the effects of surgery clerkship timing and length. Acad Med. 1997;72(10 Suppl 1):S31–S33
13. Clark KH, Jelovsek FR. Effect of clerkship timing on third-year medical students’ grades and NBME scores in an obstetrics-gynecology clerkship. Acad Med. 1992;67:865
14. Baciewicz FA Jr, Fagley J, Weaver M, Yeasting R, Thomford NR. The effect of surgery clerkship timing on fourth-year students’ surgery knowledge. Acad Med. 1990;65:543
15. Veloski JJ, Hojat M. Learning in medical school clerkships: the effects of time on comprehensive examination scores. Proc Annu Conf Res Med Educ. 1983;22:19–24
16. Whalen JP, Moses VK. The effect on grades of the timing and site of third-year internal medicine clerkships. Acad Med. 1990;65:708–709
17. Baciewicz FA Jr, Arent L, Weaver M, Yeasting R, Thomford NR. Influence of clerkship structure and timing on individual student performance. Am J Surg. 1990;159:265–268
18. Gary NE, Rosevear GC. Effect of reduction in length of third-year clerkships on students’ academic performance. J Med Educ. 1988;63:406–407
19. Vosti KL, Bloch DA, Jacobs CD. The relationship of clinical knowledge to months of clinical training among medical students. Acad Med. 1997;72:305–307
20. Edwards RK, Davis JD, Kellner KR. Effect of obstetrics-gynecology clerkship duration on medical student examination performance. Obstet Gynecol. 2000;95:160–162
21. Hampton HL, Collins BJ, Perry KG Jr, Meydrech EF, Wiser WL, Morrison JC. Order of rotation in third-year clerkships. Influence on academic performance. J Reprod Med. 1996;41:337–340
22. Reteguiz JA, Crosson J. Clerkship order and performance on family medicine and internal medicine National Board of Medical Examiners Exams. Fam Med. 2002;34:604–608
23. Whalen JP. Investigating whether timing of students’ third-year internal medicine clerkships affects their performances as seniors on the NBME examination. Acad Med. 1991;66:709
24. Smith ER, Dinh TV, Anderson G. A decrease from 8 to 6 weeks in obstetrics and gynecology clerkship: effect on medical students’ cognitive knowledge. Obstet Gynecol. 1995;86:458–460
25. Case SM, Ripkey DR, Swanson DB. The effects of psychiatry clerkship timing and length on measures of performance. Acad Med. 1997;72(10 Suppl 1):S34–S36
Reference cited only in Table 4
26. Spellacy WN, Dockery JL. A comparison of medical student performance on the obstetrics and gynecology National Board Part II examination and a comparable examination given during the clerkship. J Reprod Med. 1980;24:76–78
© 2012 Association of American Medical Colleges