Young physicians must pass the United States Medical Licensing Examination (USMLE) for state licensure and specialty board certification. During 2000–2001, nearly all U.S. medical schools (115) required that students attempt Step 1 at some point during their MD programs.1 Students at schools with curricula consisting of two years of basic science education followed by two years of clinical rotations are often required to complete Step 1 of this comprehensive examination immediately after the end of the second year of medical school to document their basic science knowledge before proceeding into the clinical sciences.
The consequences of performance on the USMLE extend beyond medical school. Step 1 has become an important factor in screening and selecting residency candidates through the electronic application process.2 This cumulative examination has become even more of a high-stakes test among those vying for the most competitive residency programs. For example, the Web site for the San Francisco Matching Program (〈http://www.sfmatch.org/〉) posts in the dean section the mean Step 1 scores for students applying to graduate medical education programs in otolaryngology, ophthalmology, and neurological surgery. The mean scores for these competitive specialties are significantly higher than the national mean scores.
Before the examination became available on computer in 1999, Step 1 was administered twice annually as a paper-and-pencil test. Most examinees then chose to take the test in May; the computer-based version can now be scheduled throughout the year. Students can schedule the examination at their convenience; however, they are subject to the constraints of their medical schools' curricula and the faculty's policies for promotion and graduation.
Little is known about how this change to flexible test scheduling affects student performance. Such information could shape the guidance that advisors give to medical students. In this study, we analyzed the relationship between students' Step 1 scores and the time that had elapsed since they completed the second-year medical school curriculum.
The total sample consisted of 627 students who matriculated in 1997 through 1999 at a large private medical school and had completed the Step 1 examination before July 2002. Two students were excluded from the study because computer irregularities at a testing center interrupted their sessions and invalidated the scores on their first attempts at Step 1. Test dates for the 601 students who took Step 1 for the first time between June 1 and July 12 in 1999, 2000, or 2001 were collapsed into a series of six one-week periods.
Twenty-six (4%) students who scheduled the comprehensive examination seven to 55 weeks after completing the second-year curriculum were analyzed separately because the small sample sizes within the one-week periods beyond July 12 (week six) were inadequate for statistical analysis. These students were classified into four subgroups based on their reasons for the scheduling delays. The four categories were (1) personal issues such as research projects, family illness, weddings, or vacations that precluded taking the exam on schedule (12 students or 2% of the sample); (2) academic difficulty that required remediation at the end of the second year before scheduling the exam (eight students or 1% of the sample); (3) medical problems such as pregnancy (four students or <1% of the sample); and (4) a competitive one-year research fellowship in pathology (two students).
A linear regression model with total Step 1 score as the dependent variable was developed using data for 215 students who had matriculated in 1996 and taken Step 1 in May 1998 before computer administration. The independent variables included each student's mean for all MCAT science scales, gender, and grades based on objective examinations in pathology, pharmacology, and introduction to clinical medicine. Gender was included because local as well as national studies have shown that women sometimes perform less well in preclinical disciplines, especially on the Board examinations.3,4,5 The model accounted for 79% of the variance in Step 1 scores, with a standard error of estimate of ±9. The equation was used to predict Step 1 scores for each of the 601 students in the entering classes of 1997 through 1999.
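For readers who prefer the model in symbols, the prediction equation described above can be written in the standard multiple-regression form. This is a sketch only: the coefficient symbols $b_0, \dots, b_5$ and the variable abbreviations are notation introduced here for illustration, and the fitted coefficient values are not reported in the text.

```latex
\widehat{\mathrm{Step1}}_i
  = b_0
  + b_1\,\mathrm{MCAT}_i
  + b_2\,\mathrm{Gender}_i
  + b_3\,\mathrm{Path}_i
  + b_4\,\mathrm{Pharm}_i
  + b_5\,\mathrm{ICM}_i ,
\qquad R^2 = 0.79, \quad s_{\mathrm{est}} = 9
```

Here $\mathrm{MCAT}_i$ denotes student $i$'s mean across the MCAT science scales, and $\mathrm{Path}_i$, $\mathrm{Pharm}_i$, and $\mathrm{ICM}_i$ denote the grades in pathology, pharmacology, and introduction to clinical medicine. How gender was coded is not stated, so $\mathrm{Gender}_i$ is assumed here to be a 0/1 indicator.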
Over two thirds of the students scheduled their test dates during the last two weeks of June, as shown in Table 1. A handful (2%) completed the test in the early days of June, while another 15% postponed their test dates until the very end of June or early July. The overall mean of 213, standard deviation of 21, and range of scores (156 to 269) in the sample of 601 students are close to national norms.
The students who scheduled Step 1 in the earlier weeks had slightly stronger records of academic performance, as reflected in their predicted scores. The overall difference in actual scores across time periods was not statistically significant (p = .10). The differences in means across the six time periods were even smaller after actual scores were adjusted, using analysis of covariance, for academic performance as measured by medical school test scores and MCAT science scores. The differences in adjusted scores across the six time periods were likewise not statistically significant (p = .31).
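The adjustment described above can be illustrated with the textbook form of a covariate-adjusted group mean. The exact computation used in the study is not spelled out, so the expression below is a generic sketch; the symbols ($\bar{Y}_j$, $\bar{\mathbf{x}}_j$, $\bar{\mathbf{x}}$, $\hat{\mathbf{b}}$) are notation assumed here.

```latex
\bar{Y}_j^{\mathrm{adj}}
  = \bar{Y}_j - \hat{\mathbf{b}}^{\top}\left(\bar{\mathbf{x}}_j - \bar{\mathbf{x}}\right),
\qquad j = 1, \dots, 6
```

In this notation, $\bar{Y}_j$ is the mean actual Step 1 score in time period $j$, $\bar{\mathbf{x}}_j$ is that period's vector of covariate means (medical school test scores and MCAT science scores), $\bar{\mathbf{x}}$ is the corresponding vector of overall means, and $\hat{\mathbf{b}}$ is the vector of pooled within-group regression slopes. A period whose students had above-average prior performance therefore has its mean adjusted downward, which is why the adjusted means differ less than the raw means.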
The overall passing rate was 94.3%, which ranged from a high of 100% for the 13 students who tested in the first week of June to a low of 90.8% for the 65 students in the first week of July. The overall differences in the frequencies of pass/fail across the six time periods were not statistically significant (chi-square(5) = 2.76, p = .74).
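The reported p value can be checked directly from the chi-square statistic and its degrees of freedom. The sketch below uses only the Python standard library; the function name `chi2_sf` is a helper introduced here for illustration, not something taken from the study.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Uses the recurrence
        Q(x; k + 2) = Q(x; k) + (x/2)**(k/2) * exp(-x/2) / Gamma(k/2 + 1),
    starting from Q(x; 1) = erfc(sqrt(x/2)) for odd df
    or Q(x; 2) = exp(-x/2) for even df.
    """
    if x < 0 or df < 1:
        raise ValueError("x must be non-negative and df must be at least 1")
    half = x / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    # Step up two degrees of freedom at a time until df is reached.
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Statistic and degrees of freedom as reported in the text.
p_value = chi2_sf(2.76, 5)
print(f"p = {p_value:.2f}")  # prints "p = 0.74"
```

With five degrees of freedom (six time periods minus one, times two outcomes minus one), the statistic of 2.76 gives an upper-tail probability of about .74, which agrees with the value reported.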
The results for the 26 students who scheduled Step 1 between late July and the following June (weeks seven to 55) were consistent with the main analysis of 601 students and did not suggest any effect of time on test scores. Four of the 12 students with personal issues scheduled the exam in late July (weeks seven and eight). Their mean predicted and actual scores were 190 and 196, respectively. The remaining eight students with personal issues scheduled the exam in early August (weeks nine and ten). Their mean predicted and actual scores were 199 and 203, respectively. One student in each of these two subgroups failed Step 1. The eight students with academic difficulties also scheduled Step 1 in early August (weeks nine and ten). Four of these students failed the examination, and their mean predicted and actual scores were 179 and 182, respectively. The test dates for the four students with medical problems spanned almost one year, from week seven to week 55. Their mean predicted and actual scores were 218 and 222, respectively, and all four passed the examination. The two students who participated in a one-year research fellowship completed Step 1 during week 17, and both passed.
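The subgroup comparison above can be summarized as actual-minus-predicted differences. The short sketch below uses only the means reported in the text; the variable and function names are illustrative. The standard error of estimate from the regression model applies to individual predictions rather than group means, so it serves here only as a rough yardstick. No means were reported for the two research-fellowship students, so they are omitted.

```python
# Subgroup means reported in the text: (mean predicted, mean actual).
SUBGROUP_MEANS = {
    "personal issues, weeks 7-8": (190, 196),
    "personal issues, weeks 9-10": (199, 203),
    "academic difficulty, weeks 9-10": (179, 182),
    "medical problems, weeks 7-55": (218, 222),
}

# Standard error of estimate of the regression model, as reported.
STANDARD_ERROR_OF_ESTIMATE = 9


def actual_minus_predicted(means):
    """Return the difference (actual - predicted) for each subgroup."""
    return {
        name: actual - predicted
        for name, (predicted, actual) in means.items()
    }


differences = actual_minus_predicted(SUBGROUP_MEANS)
for name, diff in differences.items():
    within = abs(diff) < STANDARD_ERROR_OF_ESTIMATE
    print(f"{name}: {diff:+d} points (within one SEE: {within})")
```

Every subgroup scored between three and six points above its predicted mean, and each difference is smaller than the standard error of estimate. This is in line with the statement that the delayed test dates showed no evident effect on scores.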
Many medical students face a dilemma as they schedule a test date for Step 1. Some students may be inclined to sit for the examination as quickly as possible, so that they can pass the test and move on to the clinical arena. Others may feel a need to review and integrate what they have learned to ensure thorough preparation before attempting the test.6 The latter strategy has become even more critical for students who have followed traditional, department-based curricula over the past three years as new, interdisciplinary content specifications have replaced the discipline-based topics of the earlier NBME Part I.7
The literature provides conflicting evidence that might be used to guide a student's choice of a test date. One study of the timing of Step 1 before computer administration concluded “the later, the better.”8 This study was limited by a small sample, the use of sample questions rather than the entire USMLE examination, and a high failure rate. On the other hand, several studies of the second step of the licensing examination have demonstrated that scores are inversely related to the time interval between the completion of a third-year clinical clerkship and the date of the comprehensive examination. Veloski and Hojat studied NBME Part II, the predecessor of USMLE Step 2, and found that the closer the surgery, obstetrics-gynecology, or psychiatry clerkship was taken to the test, the higher the scores on the corresponding discipline-specific subtest.9 Smith et al.10 and Ripkey et al.11 noted similar findings for Step 2 scores in obstetrics-gynecology and surgery, respectively. These results support the findings of the cognitive sciences that time interval is important in retaining as well as recalling learned material.12
Although the mean Step 1 scores of the students who took the examination in early June were higher, this group was self-selected. After adjustment for medical school performance, as reflected in the scores predicted from the linear regression model, mean performance on the comprehensive examination was similar across all time periods, regardless of when USMLE Step 1 was taken. At our medical school, several factors have been identified that predict test takers' outcomes on Step 1. These internal markers include MCAT science scores and grades in several second-year courses (introduction to clinical medicine, pathology, and pharmacology). As shown in Table 1, the differences in actual performance were attributable to differences in students' academic abilities that could be predicted two weeks or more before they took the test.
This finding has important implications for advising students about scheduling this high-stakes, cumulative examination. Students, on average, neither benefited nor suffered whether they took the test during the first week of June or much later, in the second week of July. Therefore, faculty should not encourage students to delay taking the test under the assumption that this will necessarily enhance their scores. The results indicate that students are not at a disadvantage if, after proper test preparation, they sit for the exam in early June, just two weeks after completing their second-year curricula. Whether an individual student is able to prepare adequately in just two or three weeks must be examined closely. It is noteworthy that, by scheduling toward the end of June, about 70% of the students allowed themselves up to four or five weeks for test preparation.
Future studies of the timing of Step 1 might look more closely at the impact of students' test preparation strategies on Step 1 scores. Although it seems likely that the students who scheduled the test in late June or July devoted more time to test preparation, the results indicate that, on average, this additional time had no impact on their scores. Variables that may influence test preparation include study time, educational materials, coaching courses, study style, and study behavior (e.g., procrastination). Thadani et al. reported that these factors, especially coaching courses, had little effect on USMLE Step 1 scores.6 Further investigation is needed to better counsel students about specific methods of preparing for this comprehensive examination. A similar study of Step 2 is also important because there may be even wider latitude in the selection of test dates and correspondingly more complex questions related to knowledge retention and test preparation.
The results of this study are limited to a two-month window after completion of the second-year course work. The analysis of the subgroup of students who delayed Step 1 for personal or academic reasons yielded findings that were consistent with the main analysis and did not suggest any effect of time on test scores. As expected, the mean predicted and mean actual scores were low among the students with academic difficulty during the second year of medical school. Nevertheless, the timing of Step 1 did not affect this group or the students who had academic difficulty during the first year of medical school and were included in the main analysis. Although we were unable to detect any differences in the pass/fail rates for the USMLE examination across time, this may have been due to the small number of failures in this sample. A larger and more heterogeneous sample is needed to address this important question.
This study used a quasi-experimental design in which analysis of covariance was used to adjust outcomes for prior academic performance. The decision to use analysis of covariance was supported by a linear regression model using a set of five independent variables that explained 79% of the variance in the dependent variable. The predicted scores within each time period implied self-selection: the earliest group had the highest predicted scores, and the latest group had the lowest. Although statistical adjustment without randomization has limitations, it was impractical to assign students' testing schedules at random in order to avoid them.
We conclude that the time interval between completing the second-year medical curriculum and the date of the USMLE Step 1 examination does not alter a student's outcome within the first two months after completing the second year of medical school, even if the test taker is at risk of a poor performance. These results have important implications for counseling students about their scheduling decisions regarding the comprehensive examination and underscore a need for future studies of test preparation and scheduling for high-stakes examinations.
1. Barzansky B, Etzel SI. Educational programs in US medical schools, 2000–2001. JAMA. 2001;286:1049–55.
2. Adams LJ, et al. National survey of internal medicine residency programs of their 1st-year experience with the electronic residency application service and national resident match program changes. Teach Learn Med. 2001;13:221–6.
3. Dawson B, et al. Performance on the National Board of Medical Examiners Part I examination by men and women of different race and ethnicity. JAMA. 1994;272:674–9.
4. Arnold L, Willoughby TL, Calkins V, Jensen T. The achievement of men and women in medical school. J Am Med Wom Assoc. 1981;36:213–21.
5. Herman MW, Veloski JJ. Premedical training, personal characteristics and performance in medical school. Med Educ. 1981;15:363–7.
6. Thadani RA, Swanson DB, Galbraith RM. A preliminary analysis of different approaches to preparing for the USMLE Step 1. Acad Med. 2000;75(10 suppl):S40–S42.
7. Swanson DB, Case SM. Assessment in basic science instruction: directions for practice and research. Adv Health Sci Educ. 1997;2:71–84.
8. Petrusa ER, Reilly CG, Lee LS. Later is better: projected USMLE performance during medical school. Teach Learn Med. 1995;7:163–7.
9. Hojat M, Veloski JJ. Subtest scores of a comprehensive examination of medical knowledge as a function of retention interval. Psychol Rep. 1984;55:579–85.
10. Smith ER, Anderson G. A decrease from 8 to 6 weeks in obstetrics and gynecology clerkships: effect on medical students' knowledge. Obstet Gynecol. 1995;86:458–60.
11. Ripkey D, Case SM, Swanson DB. Predicting performances on the NBME surgery subject test and USMLE Step 2: the effects of surgery clerkship timing and length. Acad Med. 1997;72(10 suppl):S31–S33.
12. Regehr G, Norman GR. Issues in cognitive psychology: implications for professional education. Acad Med. 1996;71:988–1001.