Human beings are fortunate to differ significantly from one another in terms of cognitive abilities. But the seemingly incidental differences in these skills can have a surprising impact on a student’s ability to pass a standardized examination, even when the student has not failed similar examinations in prior years of education. Even after acceptance into medical school, successful completion of basic medical science courses, and ratings in the acceptable range on clinical performance evaluations, a significant portion of students fail to pass the United States Medical Licensing Examinations (USMLE) Step 1 or 2 administered by the National Board of Medical Examiners (NBME). Recent data based on the computerized administration of the Step 2 examination revealed that approximately 30% of students fail on their first attempt and that 10% fail on their second attempt.1 Although medical educators understand that students should be allowed to fail as part of their education, repeated failures may result in the student being asked to leave medical school. Multiple failures can also result in significant psychiatric distress and anxiety, a loss of confidence in abilities, and an increase in the cost of the student’s medical education.2
Some of the students who fail the Step examinations multiple times have no documented history of a learning or attention problem, and therefore they cannot successfully attain accommodations for the Step 1 or 2 examinations. Before the pilot study described in this report, a systematic review of the literature, using PubMed in November 2007 and again in March 2008 and a range of search terms (such as NBME, USMLE, Step examinations, reading, performance, medical school), produced only limited information about the specific personal and cognitive characteristics of the students who experience multiple failures on these examinations. All relevant articles from the review of literature were included in this study.
De Champlain and colleagues1 created a model of the passing rates for the computer-based Step 2 examination, using more than 10,000 examinees who all took the test in a standardized manner. This 2004 study, examining factors that influence pass rate on the Step 2 examination, is the most recently published study of this type, and it has the largest sample size. Although it did not provide specific details about the ethnic background, clinical or academic performance, or specific medical school location of the students who failed the examination, it reported that the odds of passing the examination were 2.7 times greater for graduates of U.S. and Canadian medical schools and 2.1 times greater for examinees reporting that English was their primary language.1 This research suggests that students who initially learned a language other than English have a significantly reduced rate of passing the Step 2 examination. Proficiency in reading English is a plausible reason for better performance, given that the Step 2 examination is a written, multiple-choice examination describing details of a patient’s medical condition and requiring answers concerning diagnosis and treatment.
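Because an odds ratio is easily misread as a ratio of pass rates, a brief worked example may help. The Python sketch below converts a baseline pass probability to odds, applies the reported odds ratio of 2.7, and converts back to a probability. The 50% baseline is a purely hypothetical value chosen for illustration and is not a figure from the study; the function names are likewise illustrative.

```python
def odds_from_probability(p: float) -> float:
    """Convert a probability (0 <= p < 1) into odds, p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Convert odds back into a probability, odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def apply_odds_ratio(baseline_p: float, odds_ratio: float) -> float:
    """Pass probability implied by scaling the baseline odds by odds_ratio."""
    return probability_from_odds(odds_from_probability(baseline_p) * odds_ratio)


# Hypothetical baseline pass rate; NOT a value reported in the study.
HYPOTHETICAL_BASELINE = 0.50
# Odds ratio reported in the text for U.S. and Canadian graduates.
ODDS_RATIO_US_CANADA = 2.7

implied = apply_odds_ratio(HYPOTHETICAL_BASELINE, ODDS_RATIO_US_CANADA)
print(f"Implied pass probability: {implied:.2f}")
```

Under that assumed baseline, odds of 1.0 become odds of 2.7, which corresponds to a pass probability of about 0.73 rather than 2.7 times the baseline rate.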
An older study, published in 1994, reported that a student’s minority status was related to his or her examination performance. In this very large sample of students taking the NBME Part 1 examination for the first time in 1987 or 1988, the pass rate was 88% for whites, 66% for Hispanics, and 49% for African Americans.3 This study documents racial and ethnic differences in performance on the NBME Part 1 examination. The similarities between the current Step 1 examination and the 1980s Part 1 examination suggest that a student’s minority status may continue to influence test performance.
In a smaller study, researchers compared test performance data from 42 underrepresented minority students with data from 368 nonminority medical students.4 In this retrospective study, significant differences existed between the underrepresented minority students’ performances and other students’ performances on the Step 1 and 2 examinations and between the two groups’ respective MCAT scores, but not between the two groups’ clinical rating scores, which the students earned during a family practice clerkship. The authors suggest that the difficulties that the underrepresented minorities were experiencing on the standardized examinations did not reflect a knowledge difference but, rather, a difference in their response to the standardized testing situation.4 Although the authors felt that the difference was related to the students’ abilities to read, process the written material, and respond in a timely manner on the standardized examinations, they did not obtain or measure competency in English.4
In examining the performance of Asian and Pacific Islander medical students, Kasuya and colleagues5 found that a minority student’s MCAT score tended to overpredict his or her performance on the Step 1 and 2 examinations during medical school. This was significantly different from the comparison sample of white students in which the MCAT score underpredicted their later test performance. Therefore, given equal entrance MCAT scores, this minority sample did not generally do as well as expected on the Step 1 and 2 examinations. The authors suggest that researchers should investigate additional factors that influence student success in medical school, such as reading and test-taking skills, in addition to sociocultural influences on learning.5
Xu and colleagues6 found similar results when they compared 140 Asian American graduates with 2,269 white graduates from a medical college between the years 1981 and 1992. The nonminority graduates had significantly higher scores on the MCAT reading subtest and all national board examinations. The authors observed no difference between the two groups’ performance ratings during their first year of residency. For the Asian American students, the MCAT reading score was the major predictor of later performance on the Step 1 and 2 examinations (called NBME Part I and II at that time) but was not a predictor of clinical performance in residency.6
The systematic literature search described above revealed that very limited published information about the psychometric characteristics of the current NBME-administered USMLE Step 1 and 2 examinations exists. A 1996 study examined the relationships among premedical school variables, including the verbal and math Scholastic Aptitude Test (SAT) scores, MCAT scores, undergraduate major, and grade point average in required premedical science courses, and Step 2 performance.7 The authors compared the premedical school variables statistically with Step 2 performance in 323 students and cross-validated the results with 157 students.7 Both regression equations revealed that the best predictors of Step 2 performance were the students’ SAT Verbal score (r = 0.317) and the Reading section of the MCAT (r = 0.331) administered before undergraduate medical education.7 The characteristics of both significantly predictive tests include fluent reading and comprehension of complex linguistic material; neither involves a focus on science or medical concepts. This cross-validated study suggests that general language knowledge, reading abilities, and test-taking skills on a primarily reading measure are more related to performance on the Step 2 examination than measured abilities in science or math.7
Given the importance of adequate reading for success on the Step examinations, Haught and Walls8 asked 730 medical students to take a standardized reading test (Nelson-Denny Reading Test) during orientation to medical school. Stepwise regression analysis of their performance demonstrated that the reading test significantly predicted the subsequent Step 1 examination score (P < .01). In this study, underrepresented minorities comprised only 10% of the sample and were not analyzed separately, so whether ethnic background was a moderating variable in the relationship between measured overall reading skills and Step 1 performance remains unknown. Haught and Walls8 suggested that obtaining a formal measure of reading from students during the orientation process might assist in identifying students who will either be successful or have difficulties in their later years of medical school.
More recently, a popular news magazine has stated that undergraduate humanities majors, such as English majors, perform better on the MCAT than science majors.9 Further, the article comments that medical schools are now admitting an increasing number of nonscience majors to obtain “well rounded” doctors.9 Given the evidence presented above, these medical students are the same students who will likely do better on USMLE Step examinations because of high verbal processing skills related to their humanities backgrounds. The reliance on the Step 1 and 2 examinations has begun to change the balance of cognitive strengths in recently admitted medical students; this news report suggests that students who are strong in English or humanities and proficient in verbal processing are now gaining admittance more often to medical schools and that the analytical, scientific, quantitative thinkers who have majored in science have become less desirable medical student applicants.
Although studies support careful monitoring and counseling of medical students with test failures,8 only one of the studies demonstrated effective treatment of students who had multiple failures on the Step 1 or 2 examinations. Powell2 summarized his experiences from a 16-year period treating medical students diagnosed with debilitating test anxiety (DTA). These students had experienced at least two failures on the Step examinations (N = 72). Seventy-four percent of the students who met his criteria for DTA responded positively to his structured, supportive behavioral treatment and passed their next examination. He states that the addition of psychoeducational techniques such as frequent pre- and posttesting on the practice examinations further enhanced his therapeutic efforts, resulting in a higher pass rate of the students receiving his therapy. This suggests that success on the Step 1 and 2 standardized examinations may relate to both cognitive and emotional factors. Neuropsychological testing research has shown that psychiatric conditions such as anxiety and depression regularly impact speed of processing, memory, perception of details, and attention.10 A decline in these basic cognitive skills due to a psychiatric condition may have an influence on Step 1 and 2 examination pass rate.
The Step 1 and 2 examinations administered by the USMLE require sufficient reading fluency for completion of all items in each section. Both tests, administered on the computer and closely timed, are currently administered in 30- to 60-minute blocks, according to the USMLE Web site (www.usmle.org). Students generally complete each test in four hours. The actual number of words contained in each testing block for the Step 1 or 2 examinations could not be obtained from the testing development personnel at USMLE; consequently, the actual reading fluency rate necessary for completion of all items on the Step 1 and 2 examinations is unknown. The structure of these multiple-choice tests requires reading accuracy, sufficient reading fluency, adequate short-term memory, sustained attention, and reading comprehension to allow for the student to demonstrate his or her knowledge of the medical concepts. Again, other factors such as the presence of anxiety and depression, prior verbal abilities as measured by standardized language tests, English as the primary language, and ethnic background may also influence passing rates on the Step examinations.
This pilot study examines the characteristics and treatment effectiveness of a select group of medical students with multiple failures on the Step examinations. It involves a small number of medical students (six) with a history of a minimum of one failure on the standardized USMLE Step examinations who were referred to a department of neurology and rehabilitation for treatment. These students had no prior history of receiving special educational services for learning or attention problems at any point in their educations. Two of the students attempted to obtain accommodations from the USMLE for the Step examinations but were unsuccessful. As detailed below, this study demonstrates the potential effectiveness of employing a well-established rehabilitation psychology technique, cognitive rehabilitation (CR), also described below, to improve the cognitive aspects of standardized test taking.
This sample derives from a large urban medical school in which 21% of the students are underrepresented minorities. A total of 14 medical students and residents who failed a USMLE Step examination at least one time had received a referral for rehabilitation psychology treatment at some point during the past four years (2003–2007) because of suspected attention or learning difficulties as evidenced by examination failures. The six students selected for this retrospective study met the following criteria:
* At least one failure on a Step examination
* Absence of accommodations for the Step examinations
* Completion of an evaluation of reading, linguistic processing, memory, and mood
* Completion of at least six treatment sessions
* No known neurological history.
Six individuals met the above criteria. I excluded the other eight students because of a limited number of sessions and/or the presence of a neurological condition, and no follow-up data on these eight students are available. Table 1 presents demographic data for the six selected students. All participants completed a one-on-one diagnostic interview with the author and the following tests:
* Reading Fluency and Passage Comprehension from the Woodcock-Johnson–III Achievement Tests (which provide standardized measures of reading skills),11
* the Stroop Color and Word Test (which is a neuropsychological test measuring attention and linguistic processing),12
* the Wechsler Memory Scale–III Logical Memory I and II (which includes a short-term and long-term verbal linguistic memory test),13
* the Rey Complex Figure Test (which measures visual memory for a complex abstract design),14
* the Wechsler Memory Scale–III Family Pictures I and II (which test visual memory for pictures),13 and
* the Beck Depression Inventory15 (which measures symptoms associated with depression).
Either I or a clinical psychology graduate student in training completed the individualized, formal evaluation of each student, which took from two to four hours.
The University of Illinois at Chicago Office for the Protection of Research Subjects granted approval for this study.
Each of the six participants received individualized treatment that was based on the results of his or her evaluation and focused on strengthening his or her demonstrated cognitive weaknesses, according to the results of the neuropsychological and academic tests administered. Treatment sessions were one hour in length, and generally the participants’ medical insurance covered the costs. The six participants attended an average of 11 sessions (range 6–20; Table 2).
The model of treatment is based on known, published principles of CR that have been developed for treatment of individuals with neurological and psychiatric conditions, such as schizophrenia.16 A broad, generally accepted definition of CR is the systematic use of well-defined structured activities (such as the two described below) designed to improve higher cerebral functioning in a person with brain injury, or to help the individual accommodate his or her deficits by teaching methods of compensation.17 The therapy involves systematic application of CR tasks within the individual’s areas of weakness, development of compensation strategies, enhancement of processing skills through repetition of tasks, and increased awareness of cognitive strengths and weaknesses. Because I have worked extensively with tools used to rehabilitate patients who have deficits in reading fluency, visual scanning, and short-term verbal memory due to a history of brain injury, I applied those same tools to high-functioning, motivated, non-brain-injured medical students. Given their overall cognitive status, they adapted to the tools with ease and required significantly fewer sessions than the more typical 20 to 30 sessions I usually provide to individuals with acquired brain injury.
A typical rehabilitation psychology session with a medical student involved both supportive psychotherapy and work with the CR tasks. In the psychotherapy portion of the session, the students and I discussed their progress in their studies, their current study plans, and the results of recent practice tests. We also talked, as needed, about issues within the student’s life (e.g., sleep problems, stress, family responsibilities) that were interfering with studying, and we discussed any necessary modifications and adjustments.
Four of the participants in the sample were not taking classes or attending clinics at the time because of their need to pass the examination before advancing in their medical school education.
The CR tasks these participants used were all computer based and, therefore, provided an accurate way of monitoring speed and accuracy across treatment sessions. Given that students now complete the Step examinations via the computer, I felt that providing reading and memory treatment through this means was especially relevant. I addressed reading fluency in all six participants. The normative data obtained from the Woodcock-Johnson Reading Fluency test indicated low average or impaired reading speed in all six participants (Table 2). A majority of the students indicated that on their prior failed Step 1 or 2 examinations, they did not complete all questions and needed to bubble in answers at the ends of some sections. Therefore, improving reading fluency was a central goal of therapy. I used two tasks to address reading fluency: a horizontal word search task and a tachistoscopic reading task (Fast Read).18 Both tasks require sustained attention and rapid processing of linguistic stimuli. The tasks promote accurate “whole word reading.”
Whole word reading is essential for college-level word-per-minute reading speed.19,20 In children who primarily speak English, a shift in the reading process, from processing individual words letter by letter (sounding out) to rapid, accurate processing of the word as a whole and of groups of words,19 generally occurs between fifth and seventh grades. This prerequisite skill allows for efficient construction of meaning from text while reading. This skill is dependent on exposure to printed words and is often found to be deficient in individuals with reading disorders or for whom English is a second language.2 A lack of exposure to reading in English due to later introduction of English can result in difficulties with whole word fluent reading.21
In their rehabilitation sessions, the medical students were exposed to multiple repetitions of two computerized reading tasks, generally beginning with a horizontal word search task.22 This task involved horizontal scanning for a target word on a full-page array of closely spaced random letters. If scanning is too rapid to perceive the target word or if attention is inadequate, then the student’s time to find the target is slow because of a need to search the array multiple times. The goal of this task is to enhance accurate and rapid visual scanning of linguistic material across a sustained period of time (10–15 minutes). After the participant improved on this task, he or she began the Fast Read task.18 This computerized task entails rapid and accurate processing of three common one- or two-syllable English words. The task is configured to present 25 trials at an initial presentation rate within the student’s capability. The task is designed to present the next three randomly selected words slightly faster if the student accurately reads the words and slightly slower if the student cannot read the written words within the time limit. One-syllable words were presented initially, and, as the medical student reached criteria, two-syllable words were presented. The task was often administered two to three times during a therapy session. In addition to addressing reading speed and accuracy, both of these tasks provided practice in sustained attention (both tasks require more than 10 minutes to complete). In highly anxious students, the computerized tasks provided an opportunity for the student to practice relaxation techniques (such as deep, slow breathing) while performing challenging reading tasks under time pressure.
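The adaptive rule described for the Fast Read task (faster after an accurate reading, slower after a miss, over 25 trials) resembles a simple staircase procedure. The Python sketch below is only an illustration of that rule and is not the Fast Read software itself: the starting speed, step size, floor, and the simulated reader with a fixed threshold are all hypothetical assumptions, as are the names `run_staircase` and `StaircaseResult`. Only the trial count of 25 comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaircaseResult:
    best_speed: Optional[float]  # shortest presentation time read correctly
    accuracy_pct: float          # percentage of trials read correctly


def run_staircase(
    reader_threshold: float,
    start_speed: float = 150.0,
    step: float = 5.0,
    trials: int = 25,
    min_speed: float = 5.0,
) -> StaircaseResult:
    """Simulate an adaptive presentation-rate reading task.

    All numeric defaults except `trials` are hypothetical. "Speed" is the
    time the words stay on screen, so a lower value is a faster presentation.
    The simulated reader succeeds whenever the words stay up at least as
    long as `reader_threshold` (a deliberately simple assumption).
    """
    speed = start_speed
    best_speed: Optional[float] = None
    correct = 0
    for _ in range(trials):
        read_correctly = speed >= reader_threshold
        if read_correctly:
            correct += 1
            if best_speed is None or speed < best_speed:
                best_speed = speed
            # Accurate reading: present the next words slightly faster.
            speed = max(min_speed, speed - step)
        else:
            # Missed words: present the next words slightly slower.
            speed += step
    return StaircaseResult(best_speed, 100.0 * correct / trials)


result = run_staircase(reader_threshold=100.0)
print(result)
```

In this simplified model, the presentation time falls until it reaches the simulated reader's limit and then alternates around it, so some misses occur once the task has adapted to the reader.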
I also addressed memory in four of the participants (Table 2). Adequate performance on the Step examinations requires both verbal and visual memory abilities. Adequate short-term verbal memory is especially essential in individuals who read slowly because the laborious letter-by-letter reading extends the necessary retention time, and short-term memory is both time and capacity limited, which reduces recall of what is read.19 As the student’s reading fluency improves, so should his or her retention of the written text. Efficient short-term memory of salient details across the lengthy text within the Step examination questions is necessary to reduce the need to return to the text to check for specific details when answering the multiple-choice questions. Rehabilitation tasks involving rapid presentation of lists of unrelated words, therefore, enhanced immediate memory for linguistic material.18 The students learned and practiced advanced memory strategies such as visualization of words, clustering of similar words, and elaboration of words into sentences or stories to enhance recall of the words. Additionally, a portion of the questions on the Step examinations involves illustrations, figures, or graphs, and adequate short-term visual memory is essential. To enhance visual memory, students completed tasks that involved short-term recall of complex shapes and visual patterns.18,22 Again, they learned compensation strategies such as verbal elaboration of the figures to enhance visual memory. Finally, one student (participant 5) had specific difficulties with rapid processing of numbers and became anxious when exposed to numbers, so I also included exposure to rapidly presented simple math tasks in this student’s therapy.
I devoted the final five minutes of each therapy session to reviewing the compensation strategies discussed and discussing how to apply these strategies to ongoing study efforts. For example, if the participant had difficulties with recall of essential verbal material after reading practice questions, he or she applied the memory strategy (such as elaboration or visualization) found to be effective on the verbal memory tasks during rehabilitation to his or her studies. I strongly recommended that all students complete practice tests and that, at least one time per week, they closely time themselves using standard time limits so that they would be aware of how quickly they needed to read and process each question to fully complete the section. I also urged all participants to read and reread full texts of medical material, rather than focusing on condensed summaries of medical subject areas, because research shows that reading fluency improves from repetitive exposure to text.19
All students in this sample are from minorities underrepresented in U.S. medical schools, and all are at least bilingual (one participant is fluent in three languages). Table 1 shows the number of years in which the participants lived in the United States; 50% of the participants received all of their formal education in the United States, and all students except participant 1 were then obtaining their medical education in a U.S. program. That student obtained his/her undergraduate medical education in England and was a resident in a U.S. medical school when difficulties passing the Step examinations first occurred. The mean age of students in the medical school from which the students were referred is 22.4 years old. The ages of the referred students were all greater than the mean, even after taking into consideration their ages at the time of entrance into medical school.
The formal testing revealed that all participants’ intellectual levels were within the average to superior range. Their reading fluency, as measured by the Woodcock-Johnson Achievement Test–Reading Fluency, was significantly discrepant from their intellectual level (Table 2). Three participants tested as having relative weaknesses in verbal and visual memory on the Wechsler Memory Scale–III or the Rey Complex Figure Memory Test; therefore, I addressed memory in CR. Half of the participants also completed a full neuropsychological learning disability evaluation to determine whether they had a learning, attention, or memory disability that may have been affecting their test performance before treatment, and, in these cases (participants 1, 5, and 6), I could address a number of other cognitive deficits (Table 2).
The adult performance criteria on the Fast Read task, determined through testing of nonimpaired adults using the standard configuration described above, are a best speed of <85 and accuracy >75%. Because of the task’s design, 100% accuracy is impossible in the 25 trials. Speed indicates the amount of time that the stimuli are presented (lower best speed is a faster presentation rate). Even with relatively limited exposure to Fast Read, this task resulted in noticeable improvements in both reading accuracy and reading speed. Four out of six participants obtained final scores within established adult criteria for reading speed. The two participants who did not meet the established criteria, but experienced some improvements, had the lowest standard scores on the reading fluency test administered before CR (Table 2). Given the limited number of therapy sessions, I could not readminister the Reading Fluency test from the Woodcock-Johnson Achievement Test–III. Anecdotally, all participants felt that their reading speed did improve on their timed practice examinations because they could complete more practice questions within the standard time limit. This served to increase their confidence in their ability to read rapidly and efficiently on the computer. Additionally, participants experienced improvements on the verbal and visual memory computerized tasks administered during therapy.
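The two-part adult criterion stated above can be expressed as a simple check. The Python sketch below is illustrative only: the function and constant names are hypothetical, the thresholds are those given in the text (best speed below 85 and accuracy above 75%), and the example scores are invented rather than participant data.

```python
# Thresholds taken from the text; the units of "speed" are not stated there.
ADULT_MAX_BEST_SPEED = 85.0    # best speed must be strictly below this value
ADULT_MIN_ACCURACY_PCT = 75.0  # accuracy must be strictly above this value


def meets_adult_criteria(best_speed: float, accuracy_pct: float) -> bool:
    """Return True only if both the speed and accuracy criteria are met.

    A lower best speed means the words were shown for less time, that is,
    a faster presentation rate.
    """
    return (
        best_speed < ADULT_MAX_BEST_SPEED
        and accuracy_pct > ADULT_MIN_ACCURACY_PCT
    )


# Invented example scores for illustration; NOT participant data.
print(meets_adult_criteria(best_speed=80.0, accuracy_pct=80.0))
print(meets_adult_criteria(best_speed=95.0, accuracy_pct=80.0))
```

Both conditions must hold at once, so a fast best speed achieved with low accuracy would not satisfy the criterion.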
From the students’ perspective, the most important measure of therapy success was passing their next Step examination. Five of the six students passed the examination immediately after CR. Participant 6 was unable to pass the Step 1 examination even after 14 therapy sessions and three attempts on Step 1. Matching in specific residency or postresidency training programs is also an important part of medical school success. Only two students have progressed sufficiently to determine their posttraining programs: participant 3 matched in a competitive surgery residency, and participant 1 obtained employment in a postresidency specialty program.
Discussion and Conclusions
The number of available participants limits this study; only 6 of the possible 14 participants met all historical and treatment criteria established to obtain group cohesiveness. Additionally, randomization of treatment did not occur because the students were referred for specific treatment, and, given the very stressful and expensive nature of their situation, placing participants in a nontreatment group would have been unfair for a number of reasons. Because this study did not involve a nontreatment group, determining whether a portion of the students would have passed their next examination without treatment is not possible. Finally, given time constraints experienced by the participants, retesting after CR did not occur. Still, this study offers pilot data suggesting that specific CR techniques may be beneficially applied to medical students who have failed the USMLE Step examinations but who have no documented history of learning or attention problems. The success of such a short treatment program is most likely attributable to a combination of factors:
1. enhancement of cognition via the rehabilitation therapy,
2. a change in the students’ study techniques, and
3. a general therapeutic effect of the supportive treatment.
A rehabilitation treatment addressing specific cognitive and academic weaknesses identified through standardized testing should be considered as a potential intervention for medical students with multiple test failures on USMLE examinations.
In addition, the academic medicine community should closely examine the heavy reliance on the USMLE Step examinations to determine progress in medical school and residency. Even 10 years ago, Williams criticized the sole use of the USMLE Step 1 and Step 2 scores to determine program effectiveness and student competencies, and he encouraged the development of objective, standardized measures of clinical competencies.23 The use of a language-based reading test that requires rapid reading and comprehension skills places specific types of students at a disadvantage when they attempt to pass the Step examinations for the first time. The limited data from this study, along with those from the literature, suggest that students who generally have more difficulties passing the Step examinations are from underrepresented minorities and are those individuals for whom English is not the primary language. Research has found that the development of fluent reading skills, such as whole word reading, is hampered when students do not have sufficient early exposure to English text.21
In the United States, the most recent 2000 census data indicate that the portion of Americans who speak English poorly or not at all has grown by nearly 60% since 1990. Currently, nearly seven million Americans speak little or no English.24 Therefore, physicians who are bilingual are extremely valuable, given their ability to communicate in a variety of languages needed by Americans to obtain optimal medical care. Unfortunately, some evidence shows that these gifted medical students who speak more than one language have greater difficulty passing the USMLE Step examinations and, at times, are asked to leave medical school because of repeated failures on the Step examinations. The academic medicine community must not only decrease its reliance on the USMLE Step examinations to determine which medical students will progress to graduation but also provide CR or like opportunities for those talented, hard-working students who do not initially pass their examinations but who have the potential to become caring, effective physicians.