In a paradigm-shifting decision, the National Board of Medical Examiners (NBME) and Federation of State Medical Boards (FSMB) announced on February 12, 2020, that the United States Medical Licensing Examination (USMLE) Step 1 will transition to pass/fail score reporting no earlier than January 2022.1 This change is meant to “reduce some of the current overemphasis on USMLE performance” and represents “a positive step toward system-wide change, while limiting large-scale disruption to the overall educational and licensing environment.”1 Despite this stated goal, there are broad implications of a pass/fail USMLE Step 1.
Medical students, and the entire medical education community, are anxious and also cautiously optimistic about the next steps of this opportunity to fundamentally reshape the entire transition to residency. The decision to make USMLE Step 1 pass/fail comes after years of advocacy to change the exam format. These efforts, including discussions held at the Invitational Conference on USMLE Scoring (InCUS) in March 2019, highlighted the negative impacts of using Step 1 scores as a stratifying metric in residency selection. The InCUS meeting attendees included students and residents, among them the authors of this commentary. As the transition to Step 1 pass/fail scoring begins, we highlight opportunities resulting from this announcement and potential negative consequences for students, and we discuss important elements of the path ahead, most notably the inclusion of learners in all aspects of the process.
Eliminating Step 1 Numeric Scores Removes Problematic Incentives for Medical Students
The use of Step 1 numeric scores as a stratifying metric in residency selection has profound effects on medical students and on medical education. The system incentivizes students to maximize their Step 1 score at the expense of all else, sometimes including their personal well-being. This focus on Step 1 scores persists despite the absence of compelling evidence correlating Step 1 performance with clinical performance in residency. Within undergraduate medical education (UME), the importance of Step 1 has led to increased dedicated Step 1 study periods, a decreased emphasis in the curriculum on important preclinical content that is not tested on the exam, and support for an exploding market of third-party test preparation materials. The pressure to obtain a high Step 1 score can also have a perverse effect on medical student specialty choice. Many students perceive they cannot be viable candidates for residency positions in competitive specialties unless they obtain a certain Step 1 score. In one cohort of students, over one-fourth reported that their Step 1 score influenced a change in specialty choice.2 The exam has also been shown to impact professional identity formation, and low scores can have dire effects on students’ well-being.
Shifting Step 1 to pass/fail scoring removes the pressure to achieve a high numeric score. Thus, preclinical medical students will no longer feel compelled to focus their studies solely on Step 1 content. This can allow faculty to emphasize innovative curricula that expand beyond Step 1 content to teach the subject areas and skills important to train an effective physician workforce: social determinants of health and health disparities, effective communication strategies, cultural competency, leadership training, health care delivery models, humanistic medicine, and more. As important as these topics are, they are not typically tested on Step 1 so are often sacrificed in the interest of Step 1 prep in the current approach.
The use of Step 1 scores in resident selection also contributes to perpetuating disparities in medicine. Rubright et al3 found that racial and gender differences exist in Step 1 performance even after using Medical College Admission Test performance and undergraduate grade point average as covariates. Importantly, racial and ethnic disparities were also present for USMLE Step 2 Clinical Knowledge (CK) and Step 3 performance. One of the recommendations from InCUS was to “minimize racial demographic differences that exist in USMLE performance.”4 Given the racial and gender differences in USMLE performance, the continued use of Step 1 scores as a screening metric directly impedes efforts to diversify the physician workforce and works against holistic review in medical school admissions. Making Step 1 scoring pass/fail does not remove the imperative to minimize racial differences in USMLE performance, especially if the emphasis is transferred to Step 2 CK scores in graduate medical education (GME) admissions when numeric scores for Step 1 are no longer available.5 Pass/fail scoring for Step 1 does, however, force residency programs to develop new methods for assessing applicants. It is important that any such efforts deliberately seize this opportunity to incorporate holistic review concepts into the UME–GME transition.
During the public comment period for the preliminary recommendations released following InCUS,4 the authors of this Invited Commentary and other trainee attendees of InCUS submitted a joint letter urging the USMLE Governance and the FSMB and NBME Boards to seize this opportunity and act from a position of bold leadership to redefine the conversation around the UME–GME transition. Step 1 numeric scores are an imperfect tool used in an imperfect system, but their existence has reduced the pressure to create meaningful change to other components of the UME–GME transition. The decision to shift to pass/fail scoring for Step 1 irreversibly disrupts the transition to residency and forces all stakeholders to aggressively address deficiencies in the process. The courageous choice to enact this change in a 2-year window creates an ambitious but reasonable time frame for residency programs, medical schools, and national organizations to rapidly develop and implement innovations for the UME–GME transition.
Coping With the Loss of Step 1 Numeric Scores as a Screening Tool
While many medical students celebrated the decision to transition Step 1 to pass/fail scoring, the change also elicited understandable concern from others. So long as the ratio of applicants to residency program positions continues to increase, residency programs will have to use strategies to narrow applicant pools to a reasonable number for more detailed review. Many student populations, such as international medical graduates, DO students, underrepresented minority students, and students at “less prestigious” medical schools, have meaningful concerns that their applications are “screened out” based on their educational background. Despite the many problems created by using Step 1 scores outlined above, some students view obtaining a high Step 1 score as the only comparable metric that ensures one’s residency application is reviewed. In the absence of a Step 1 numeric score, attention turns to what other metrics may be used in the decision to “screen out” applicants. It is critical that any such metrics not disadvantage students, even if indirectly, based on race, ethnicity, sex and gender identity, socioeconomic status, personal or familial hardships, or immigration status.
After the elimination of one tool for “screening out” applicants, residency programs must avoid relying on another single metric—specifically the Step 2 CK score—as the primary stratifying metric for resident selection. Residency programs are increasingly requesting or requiring Step 2 CK scores before interviews with or ranking of applicants. It can be argued that, compared with Step 1, Step 2 CK measures knowledge more directly applicable to residency practice. Nevertheless, shifting our educational focus from prioritizing memorization of detailed basic science knowledge for Step 1 to prioritizing memorization of detailed clinical knowledge for Step 2 CK fails to address the inherent drawbacks of using a single standardized knowledge exam as a filtering metric, especially in light of known racial differences in Step 2 CK performance.3 If implementation of pass/fail scoring for Step 1 simply creates “Step 2 CK mania,” we will have failed all stakeholders in the UME–GME transition.
Residency programs could also begin to rely more on the Medical Student Performance Evaluation (MSPE) and medical school grades in resident selection. Ideally, objective descriptions across multiple domains of medical student competency and professional trajectory would provide robust data for residency programs. Ongoing efforts to improve the MSPE have led to increased satisfaction by MSPE authors and program directors, but there are continuing concerns with variability across schools.6 Clinical clerkship grading is not free of problems either. Variability in grading systems among schools (e.g., normative systems versus criterion- or competency-based systems) makes grade comparison impossible. Furthermore, clerkship grading is often subjective and susceptible to implicit and explicit biases. One institution’s analysis of MSPE summary words showed students who identified as white or female, were younger in age, or had higher Step 1 scores consistently received higher final clerkship grades.7 Additionally, inconsistent reporting in MSPEs among schools introduces significant undue bias in residency selection.
Finally, there is concern that the absence of Step 1 numeric scores will require applicants to use networking to gain an advantage in the residency selection process. For example, relying on away rotations at the residency program’s institution as the only means to screen in applicants would be distinctly problematic. Away rotations place substantial financial burden on often debt-laden medical students, and specifically create a disadvantage for students from lower socioeconomic backgrounds. There is also variability in medical school flexibility for students to complete away rotations, and prioritizing time for away rotations takes away from other vital parts of the UME curriculum. Another concern is that only students with well-connected mentors and advocates will be able to receive residency interviews, disadvantaging students who lack this network.
As replacement evaluation tools are considered and developed, it is essential that they do not disadvantage “at-risk” students. The worst-case scenario following the Step 1 scoring change would be that an increased number of students are filtered out based on inferior, or even unethical, metrics.
Charting a Path Forward
In an ideal scenario, residency programs would have the time, resources, and staffing to review each student’s application holistically to optimize the “applicant-residency fit.” This fit would not only align values and academic/professional interests but also pair students with programs that highlight their individual strengths and provide meaningful support to improve their weaknesses. Critically, residency programs would have to provide clear information about the program’s core values and academic opportunities. Residencies would also need to disclose characteristics not only of applicants who succeed but also those who struggle in their program.
Medical schools would have to accurately present their students’ strengths and weaknesses in a manner that recognizes every trainee has room to grow and does not penalize students who have had to overcome adversity to reach their current position. Moreover, without the relentless need to optimize curricula to align with Step 1 content, schools could leverage unique aspects of their institutions within their educational mission to train students with distinct experiences. Schools would also need to create mechanisms for thoughtful, specialty-specific advising tailored for each of their students.
Medical students would have to do their part in combating application inflation by reducing the total number of applications submitted in exchange for knowing their application would receive fair evaluation at each program. Mechanisms to minimize applications would be necessary, whether through an early match process or through graduated application caps during the existing application process (e.g., limiting the initial number of applications but allowing further applications after 1 month).
This idealized scenario would also include a fair and ethical interview process, eliminating such practices as programs offering more interviews than slots and students hoarding interviews. Programs and students would also need to meaningfully commit to the National Resident Matching Program (NRMP) postinterview communication guidelines. Finally, this system would need to have expanded support for unmatched students and actively destigmatize the Supplemental Offer and Acceptance Program.
Achieving any of these outcomes requires substantial resources, collaborative effort, and openness to change from all stakeholders in the UME–GME transition, including the Association of American Medical Colleges and NRMP. A broader system-wide review of the UME–GME transition is being led by the Coalition for Physician Accountability and includes learner representatives.1 We are also excited by existing efforts, including the Association of Professors of Gynecology and Obstetrics’ projects to create standardized interview offer practices and develop a limited early match program.8 We anticipate a need for many parallel initiatives during this critical period of transition to pass/fail scoring for Step 1. We strongly recommend including learner representatives as initiatives are conceived, reviewed, and continued or stopped. By ensuring the involvement of all stakeholders, including trainees, we can build trust among students, medical schools, and residency programs that will serve as the foundation of a vastly improved UME–GME transition.
The existing landscape of the UME–GME transition has made the inappropriate use of Step 1 scores as a screening metric nearly inevitable. The ambitious announcement by the NBME and FSMB to shift Step 1 to pass/fail score reporting beginning in 2022 amplifies the broader challenges in the transition to residency. Navigating this critical period will require implementation of complex changes and initiatives by all stakeholders—medical students, medical schools, residency programs, and national organizations—and is extremely unlikely to occur without missteps along the way. Support for students who are negatively affected is essential. Decisions will need to be made with common definitions and goals. Including medical students in these decisions will be critical for optimizing successful initiatives. All parties will need to operate with the grace that assumes the best intentions of the other stakeholders. Together, we can reinvent the UME–GME transition, enabling holistic review in residency selection and protecting students from negative ramifications of screening metrics. Over the next 2 years, we can lay a foundation of open communication and trust for this vital transition in the training of the next generation of physicians.
The authors thank Janice L. Farlow, MD, PhD, and Neil Gesundheit, MD, MPH, for their helpful revisions and comments on this commentary.
1. United States Medical Licensing Examination. Invitational Conference on USMLE Scoring. Change to pass/fail score reporting for Step 1. https://www.usmle.org/incus/#decision. Accessed April 29, 2020.
2. Khan M, Gil N, Lin W, et al. The impact of Step 1 scores on medical students’ residency specialty choice. Med Sci Educ. 2018;28:699–705.
3. Rubright JD, Jodoin M, Barone MA. Examining demographics, prior academic performance, and United States Medical Licensing Examination Scores. Acad Med. 2019;94:364–370.
4. United States Medical Licensing Examination. Summary report and preliminary recommendations from the Invitational Conference on USMLE Scoring (InCUS), March 11-12, 2019. https://www.usmle.org/pdfs/incus/incus_summary_report.pdf. Accessed April 29, 2020.
5. Chaudhry HJ, Katsufrakis PJ, Tallia AF. The USMLE Step 1 decision: An opportunity for medical education and training [published online ahead of print March 6, 2020]. JAMA. doi:10.1001/jama.2020.3198.
6. Hauer KE, Giang D, Kapp ME, Sterling R. Standardization in the MSPE: Key tensions for learners, schools, and residency programs [published online ahead of print March 10, 2020]. Acad Med. doi:10.1097/ACM.0000000000003290.
7. Low D, Pollack SW, Liao ZC, et al. Racial/ethnic disparities in clinical grading in medical school. Teach Learn Med. 2019;31:487–496.
8. Hammoud MM, Andrews J, Skochelak SE. Improving the residency application and selection process: An optional early result acceptance program. JAMA. 2020;323:503–504.