On Step 1 Mania, USMLE Score Reporting, and Financial Conflict of Interest at the National Board of Medical Examiners

Carmody, J. Bryan MD, MPH; Rajasekaran, Senthil K. MD

doi: 10.1097/ACM.0000000000003126


Recently, in the pages of this journal, several authors debated the score-reporting policy for the United States Medical Licensing Examination (USMLE).1–3 Although the test was originally intended to assist state medical boards in making a binary determination regarding licensure, USMLE Step 1 scores are frequently used to screen applicants for postgraduate medical training programs, despite the lack of good evidence demonstrating that these scores predict success in residency. Some have suggested that reporting scores as pass/fail would honor the test’s purpose without encouraging inappropriate use.1

This issue is not new: for decades, there has been debate regarding whether the USMLE (and its predecessor exams) should report numeric scores or a pass/fail designation.4,5 While those engaged in these debates1–5 have frequently considered the consequences for stakeholders such as medical students and residency program directors (PDs), they have not explicitly considered the financial consequences such a change might have on the test’s sponsoring organization.

A financial conflict of interest (COI) has been defined as “a set of conditions in which professional judgment concerning a primary interest … tends to be unduly influenced by a secondary interest (such as financial gain).”6 Here, we explore the possibility that financial incentives could be influencing the National Board of Medical Examiners’ (NBME’s) position on USMLE score reporting.

In this Perspective, we examine the history and mission of the NBME and its licensing exams, trace the rise of “Step 1 mania,” review the financial incentives that provide a competing secondary interest in the NBME’s organizational prioritization and decision making, evaluate whether these might constitute a problematic COI, and consider possible remedies.

The Original Mission of the NBME

The NBME was founded in 1915 to address the lack of a nationwide, standardized examination for physicians. At that time (and still today), physicians practicing in the United States had to be licensed at the state level, so a physician who was qualified in one state may still have been required to take a separate exam in another state. Dr. William Rodman, then president of the American Medical Association, stated the NBME’s reason for being in a 1916 address:

Here was a man who had graduated at Harvard Medical College, passed the Massachusetts State Board of Health, had achieved an enviable position in Boston, was called to New York to take a more important position in [an] institution there, passed the State Board of New York and then on account of his commanding position in thoracic surgery, was called to preside over this department at the Mayo Clinic, and he had to take the Minnesota State Board examination before he could practice medicine. Is that right? Is there any man who will say that it is right?7

By creating an examination whose results could be accepted by all state medical boards, the NBME aimed to ease interstate reciprocity for licensure and allow a qualified physician to seek licensure in any state without the need for further testing.

History of the NBME’s Exams

The first NBME examination was administered in 1916. Originally, participation in the NBME exam was completely voluntary. Only 16 states recognized the NBME’s certificate, and the scope and content of the NBME’s exam made it more difficult to pass than most individual state licensing exams. Until 1922, NBME examinations were “weeklong extravaganzas, incorporating essay, laboratory, oral, practical, and bedside components.”8 Examinees achieving a passing score were rewarded with a gold key, which some chose to wear on their person as a badge of honor.9

By 1954, the NBME had abandoned the bedside, oral, laboratory, and essay components of its earlier exams due to concerns over reliable and reproducible examination scoring. In their place, the NBME unveiled new, norm-referenced tests based entirely on multiple-choice questions. The new NBME Part I and II examinations were widely accepted by state medical boards. By the 1960s, state-sponsored licensing examinations had disappeared and 2 parallel pathways for initial licensure had emerged: the NBME exam and the Federation Licensing Examination (FLEX).

Fractures became apparent in this 2-test system by the late 1980s. Approximately 75% of U.S. and Canadian medical graduates chose to take the NBME exam.10 However, certain state boards—such as those in Texas and Louisiana—did not honor it. Additionally, because international medical graduates were not permitted to sit for the NBME exams, they were required to take the FLEX. U.S.-citizen medical graduates of international medical schools felt that the latter examination represented a higher bar, and they alleged that state boards made these requirements more stringent in order to limit the entry of non-U.S.-trained physicians into the United States. Several state legislatures—most notably those in California and New York—were sympathetic to these claims and considered implementing legislation that would require a single examination for all licensure applicants.11

In response, the NBME and the Federation of State Medical Boards (FSMB)—with support from other stakeholder organizations—moved to create a jointly sponsored, single examination pathway for physician licensure: the USMLE. Steps 1 and 2 of the USMLE would be based on, respectively, the NBME Part I and II exams, and they would remain the primary responsibility of the NBME. Step 3 of the USMLE would be based upon elements of the FLEX’s Component 1 and Component 2 and remain under the administration of the FSMB.10

The creation of the USMLE in 1992 represented a fulfillment of the NBME’s original mission—and provided the NBME with a market monopoly on licensure testing for graduates of MD-granting medical schools. Since then, the increasing importance of the test has led to multiple problems in medical education, including the rise of Step 1 mania.

The Rise of Step 1 Mania

Originally, the NBME chose to report examination results as numeric scores rather than a pass/fail designation due to the belief that this feedback benefitted students. As noted in an NBME publication in 1937:

The use of grading in examinations deserves special consideration. A mere distinction between passing and failing may satisfy the poor student, but even the mediocre student is entitled to know how far he escaped the noose.12

Similarly, when the NBME considered the issue of pass/fail score reporting in 1990, it chose to continue reporting numeric scores due to “an obligation to provide examinees with knowledge of how their performances compare with passing scores,” even while noting concerns regarding the use of scores for other purposes.4

As NBME exams gained broader market share and state licensing exams disappeared, residency training programs received more and more applications from candidates who had all taken the same test. This development made it possible to use licensing exam scores as a common measure to compare applicants from different educational backgrounds. Nonetheless, most PDs valued other factors well above licensure scores. For instance, a 1979 survey of PDs found that NBME Part I exam scores ranked 23rd out of 31 factors in ranking candidates.13

Having a broadly applicable screening tool became more useful to PDs as the number of applicants began to outpace the number of available positions. In 1980, for example, the National Resident Matching Program (NRMP) offered 18,055 positions, but just 15,129 applicants sought them (1.19 positions per applicant). By 1985, the number of applicants had risen to 22,386 for only 18,535 positions (0.83 positions per applicant).14
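The ratios quoted above follow directly from the NRMP counts. As a minimal sketch (the function name is ours, chosen only for illustration), the calculation can be reproduced as follows:

```python
def positions_per_applicant(positions: int, applicants: int) -> float:
    """Return the number of offered positions per applicant."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return positions / applicants


# NRMP counts quoted in the text (reference 14).
ratio_1980 = positions_per_applicant(18_055, 15_129)
ratio_1985 = positions_per_applicant(18_535, 22_386)

print(f"1980: {ratio_1980:.2f} positions per applicant")
print(f"1985: {ratio_1985:.2f} positions per applicant")
```

A ratio above 1 means that positions outnumbered applicants; a ratio below 1 means the reverse.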

Initially, the NBME warned against the use of its exams for residency selection (emphasis in original):

The National Board Part I and Part II examinations provide measurements of the basic medical science and clinical science knowledge of individual students. It is important to understand, however, that the examinations have not been developed for the purpose of assessing preparation for postgraduate education. Appropriate use of these test scores, for whatever purpose, also requires recognition of certain limitations of evaluation instruments of this type.15

The late 1980s and early 1990s saw a temporary increase in the number of residency positions: from 1989 to 1992, there were again more matched positions available than applicants. However, by the mid-to-late 1990s, applications again exceeded available positions, a condition that has persisted ever since, with around 0.8 positions per applicant from 2009 until 2019.14 For PDs evaluating a large number of applicants, the objectivity of licensing examination scores held obvious appeal, and, following the establishment of the USMLE as the single licensure pathway in 1992, these scores could be used as a universal metric for screening applicants.

In 2008, following a comprehensive review of the USMLE program, the NBME’s official position on secondary uses of USMLE scores changed. Instead of cautioning against their use for nonintended purposes, the Committee to Evaluate the USMLE Program officially sanctioned such secondary uses, “provided that they do not compromise the primary purpose” of the test in making decisions regarding initial medical licensure.16 Since then, the use and importance of Step 1 scores in residency selection has only increased.

In 2010, 73% of PDs across all specialties reported using applicants’ USMLE Step 1 score when determining whom to interview.17 By 2018, this figure had risen to 94%.18 PDs cite no other factor more frequently as being relevant to their decision to offer an interview, and they cite few as more impactful.

We believe that the increasing student focus on Step 1 is evidenced by rising examination scores. In 1993, the mean score on the USMLE Step 1 was 200, with a standard deviation (SD) of 20. In 2018, the mean score was 230 (SD, 19) for test takers from U.S. and Canadian medical schools; a score of 200 would have fallen at the 8th percentile.19 As student scores on Step 1 have risen, the minimum passing score for Step 1 has increased 5 times since 199720–24 (Figure 1)—despite the absence of meaningful data demonstrating that each prior standard was failing to screen out incompetent practitioners. Indeed, despite increases to the minimum passing score, the initial pass rate for U.S. medical students on Step 1 is at an all-time high (94%–95% for U.S. test takers over the past 5 years).25

Figure 1: United States Medical Licensing Examination Step 1 national mean score and minimum passing score, 1993–2018.20–25

Harms of Step 1 Mania

Having fulfilled its original purpose, in 2001, the NBME underwent a corporate restructuring and redefined its mission as being “to protect the health of the public through state of the art assessments of health professionals.”26 While physician licensure is an essential part of protecting the public, we are skeptical that Step 1 mania has benefitted society. We are unaware of any data demonstrating improvement in any public health metric due to rising Step 1 scores; rather, evidence is mounting that Step 1 mania may be undermining the NBME’s core mission by disadvantaging some test takers, unduly influencing graduates’ career options, restraining curricular innovation, and causing students to prioritize memorization over wellness and long-term learning.

Though there has been a national call to increase the diversity of the physician workforce, the current use of USMLE Step 1 scores in residency selection may disadvantage women and minorities underrepresented in medicine. According to an NBME study of first-time test takers from 2010 to 2015, female test takers scored 5.9 points lower on Step 1 when compared with a white male reference group, and test takers self-reporting as black and Hispanic scored, respectively, 16.5 and 12.1 points lower.27 While the authors noted that the magnitude of these differences decreased after accounting for undergraduate grade point average and Medical College Admission Test (MCAT) scores, we believe that such adjustments would not occur in practice. Sixty-four percent of PDs across all disciplines use a specific target score for determining candidates to interview,18 and PDs who use Step 1 scores for screening seem unlikely to apply a complex statistical adjustment to the raw score.

Studies suggest that medical students select or alter their career choice in response to Step 1 scores.28,29 In some cases, students are effectively eliminated from consideration in certain residencies based on Step 1 scores alone.30 Because the test was neither intended nor validated as a vocational aptitude test, these USMLE score–influenced decisions seem unlikely to benefit the public. Similarly, the financial cost of test preparation (conservatively estimated to exceed $7.5 million [M] annually31) may increase student loan debt. Notably, the individual cost of the most commonly used resources may exceed $1000,1,32 and more heavily indebted students are less likely to choose less lucrative careers in fields such as primary care33 where physician shortages loom largest—a possible outcome that, again, does not improve public health.

Preparation for USMLE Step 1 has become the de facto preclinical educational curriculum.1 Regrettably, students’ intense focus on Step 1 preparation limits the willingness of medical schools to teach content that is not directly assessed on the test. This restrains innovation and meaningful experimentation in preclinical medical education and raises the question of whether medical educators are best preparing students to serve the public in a rapidly changing field.

The adverse effects of high-stakes test preparation on student well-being and mental health are well described.1,3,31 At most medical schools, students today spend more time in dedicated preparation for Step 1 than they do in any other single course, rotation, or clerkship.32 Much of this time is spent memorizing facts that are poorly retained, with significant decreases in recall after only a few years.34 It seems curious to us that, in an era when medical information is more readily accessible than ever before, we prioritize a test of memorized knowledge more than ever.

We wonder, then, why the NBME would tolerate the overextension and inappropriate use of its tests, especially when doing so may be compromising its mission.

NBME Finances

The NBME is classified as a nonprofit corporation under Section 501(c)(3) of the U.S. Internal Revenue Code. Though such organizations are tax exempt, they are required annually to report certain financial data on Internal Revenue Service (IRS) Form 990. This tax form is subject to public disclosure requirements, and documents for many nonprofit organizations are archived online. We obtained the financial data presented here from the NBME’s IRS Form 990 returns from 2001 to 2017 using ProPublica’s Nonprofit Explorer (

Unlike many nonprofit organizations, the NBME does not solicit or accept charitable donations. While the NBME does derive some income from investments, the vast majority of its revenue is generated by the programs it administers.

In 2017, the NBME reported a program service revenue of $153.9M. This amount reflects substantial growth from 2001, when the NBME’s program service revenue was $47.5M. Such a large increase is only partially explained by inflation. Using the Bureau of Labor Statistics consumer price index,35 $47.5M in 2001 would be equivalent to $66.9M in 2017. In fact, the increase in revenue achieved by the NBME outpaced growth in many for-profit corporations over the same period. For instance, from January 2001 to December 2017, the Dow Jones Industrial Average rose from 10,646 to 24,719,36 a factor of 2.3, while the NBME’s program service revenue increased by a factor of 3.2 (Figure 2).
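The growth factors in this comparison can be checked from the figures quoted above. The sketch below is illustrative only: the variable names are ours, and the implied inflation factor is simply the ratio of the two 2001 revenue figures given in the text, not an independent consumer price index lookup.

```python
# Figures quoted in the text (revenue in millions of U.S. dollars).
revenue_2001 = 47.5
revenue_2017 = 153.9
revenue_2001_in_2017_dollars = 66.9  # inflation-adjusted value reported in the text

# Dow Jones Industrial Average values quoted in the text.
dow_jan_2001 = 10_646
dow_dec_2017 = 24_719

nominal_growth = revenue_2017 / revenue_2001
dow_growth = dow_dec_2017 / dow_jan_2001
# Assumption: inflation factor implied by the text's own two figures.
implied_inflation = revenue_2001_in_2017_dollars / revenue_2001

print(f"NBME program service revenue grew by a factor of {nominal_growth:.1f}")
print(f"Dow Jones Industrial Average grew by a factor of {dow_growth:.1f}")
print(f"Implied inflation factor, 2001 to 2017: {implied_inflation:.2f}")
```

Because the implied inflation factor is well below the revenue growth factor, inflation alone accounts for only part of the increase.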

Figure 2: Program service revenue for the National Board of Medical Examiners, 2001–2017. The authors procured the financial data presented here from the NBME’s IRS Form 990 returns from 2001 to 2017 using ProPublica’s Nonprofit Explorer (online at

We researched why revenues at the NBME have risen so significantly. The primary programs administered by the NBME are the USMLE Step 1 and Step 2 exams. However, the pipeline of students taking these exams has been essentially unchanged since 2006. Registration fees have risen (from $435 in 2004 to $610 in 2018), but this increase alone is insufficient to explain the rise in revenue. While the NBME did introduce an additional mandatory examination (Step 2 Clinical Skills in 2004), much of the revenue growth in recent years has come from the direct sale of nonmandatory products and services to students or schools under the NBME’s Medical School and Student Services programs.

For instance, since 2003, the NBME has offered the Comprehensive Basic Science Self-Assessment (CBSSA), a practice test designed to prepare students for the Step 1 exam. NBME authors noted a strong correlation between scores on the CBSSA and USMLE Step 1,37 and as the importance of Step 1 scores has grown, students have purchased CBSSA exams with increasing frequency. According to the NBME’s annual reports,38,39 from 2005 to 2016, sales of CBSSA exams increased 10-fold, from 12,817 to 128,934. At a price of $60 per test, CBSSA sales accounted for $7.7M in revenue for the NBME in 2016.
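The CBSSA estimate is a simple product of unit sales and price. A minimal sketch, assuming (as the estimate above does) that every 2016 test was sold at the $60 price; the variable names are ours:

```python
# Figures quoted in the text (references 38 and 39).
CBSSA_PRICE_USD = 60
sales_2005 = 12_817
sales_2016 = 128_934

fold_increase = sales_2016 / sales_2005
# Assumption: every test sold in 2016 was priced at $60.
revenue_2016_usd = sales_2016 * CBSSA_PRICE_USD

print(f"Increase in CBSSA sales, 2005 to 2016: {fold_increase:.1f}-fold")
print(f"Estimated 2016 CBSSA revenue: ${revenue_2016_usd / 1e6:.1f}M")
```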

Similarly, since 2007, the NBME has provided Customized Assessment Services (CAS). This program allows schools to create examinations using USMLE-style test items. By 2017, 83 U.S. and 16 international medical schools subscribed to CAS, and examinations were provided to over 102,000 examinees, more than double the number of examinees just 4 years before.40 (In response to student feedback requesting more USMLE-style exam content, our own institution is one such school.) Institutions pay $1,500 annually for a CAS subscription, plus a per-test fee based on the number of items on the test. Assuming that each of the 102,135 tests administered in 2017 was in the middle price range (likely a conservative estimate), then CAS generated $3.1M in revenue in 2017.

The increasing demand for these ancillary products and services, the administrative costs of which are lower than those associated with the USMLE, has significantly changed the NBME’s finances. In 2001, almost all of the NBME’s program service revenue came from examination fees associated with the USMLE Step 1 and Step 2 exams; however, by 2017, just 63% of the organization’s program service revenue came from the USMLE program. In fact, by 2017, the NBME derived more profit from its Medical School and Student Services programs ($15.4M) than from the USMLE program itself ($10.9M).

Nonprofit corporations cannot legally distribute to shareholders revenue received in excess of program costs. The NBME, therefore, reinvests its profits in new programs, organizational growth, and increased salaries for executives and other employees. Since the year 2001, the proportion of revenue directed toward employee compensation has been stable at around 30% of organizational expenses. However, because revenue has since increased substantially, the number of individuals receiving over $100,000 annually more than doubled over 10 years (from 93 in 2008 to 207 in 2017). Similarly, executive compensation accounts for approximately 4% of expenses, a figure that has been relatively stable over time. Therefore, in parallel with the tripling in the NBME’s revenue, the total compensation received by the organization’s chief executive officer increased from approximately $400,000 in 2001 to $1.2M in 2016.

It is unknown—and unknowable—whether the NBME could have achieved such financial growth over the past 2 decades without the creation of the profitable side businesses included in its Medical School and Student Services programs. Yet it seems clear to us that Step 1 mania drives the market for such products and services and that a change to a pass/fail test would have significant financial implications for the organization.


Given that the NBME has a financial COI in determining USMLE score-reporting policy, we have considered means by which to ameliorate the current situation. Several changes could minimize the influence of secondary interests on USMLE policy: disclosure, recusal, divestiture, and restructuring.

Disclosure has emerged as the primary means of addressing COI in academic medicine. In many cases, disclosure is adequate to resolve a conflict. For instance, a physician who believes that the conclusions of a pharmaceutical-company-sponsored clinical trial are biased can simply choose not to prescribe the studied medication. In the market space where the NBME operates, however, disclosure alone is inadequate. For graduates from MD-granting medical schools in the United States, the NBME has a monopoly on testing for licensure. There is no alternative.

Another option is recusal. Just as a judge might recuse herself from deliberations on a case in which she stands to gain financially from the outcome, the NBME could recuse itself from the deliberations on its score-reporting policy. However, during the recent review of score-reporting policy at the March 2019 Invitational Conference on USMLE Scoring, executives from the USMLE’s sponsor organizations led the deliberations, selected the meeting’s deliberants, and provided reimbursement to attendees for hotel and travel-related expenses.41 To ensure adequate insulation from the personal and professional influence of those with strong secondary interests, a truly independent group of public stakeholders might conduct such reviews in the future.

A third option is divestiture: the NBME could divest itself of some of its secondary interests. For instance, its Medical School and Student Services programs could be spun off into a separate corporate entity. It may be preferable for such an entity to be a for-profit corporation. Although the USMLE itself is a licensing exam that protects public health, it is less clear that sales of these ancillary services support a tax-exempt mission. Moreover, configuration as a for-profit corporation would allow these products and services to compete freely with UWorld, Kaplan, and other companies in the test-preparation space. Divesting these other non-USMLE services would resolve many of the NBME’s conflicts, but would not remove financial incentives associated with the USMLE program itself.

A final option would be for the NBME to restructure its finances. Since the organization takes in revenue in excess of its program expenses, it could return that money through its grant programs, rather than reinvesting profits in new revenue-generating programs and increased salaries. Such a policy could be administratively cumbersome and may have unintended consequences but would at least remove the secondary interest that arises from the organization’s current configuration.

In Sum

In tracing the history of the NBME and the rise of Step 1 mania, we have shown that the current structure of the NBME creates a COI: the organization’s primary interest, protecting the health of the public, may be at odds with secondary financial incentives. In noting this, we wish to emphasize that COI is a condition, not a behavior.42 Even if the NBME’s policies have not been influenced in the least by financial motives, the mere appearance that they may have been undermines stakeholders’ confidence in the organization. The NBME serves a valuable purpose and has worked diligently over decades to earn the public trust. Failing to mitigate or resolve the COI places that trusted position at risk.

COI can be addressed only after acknowledging that it exists. We call upon the NBME and other stakeholders to work together to acknowledge these conflicts and ensure that secondary interests do not compromise the organization’s critical primary mission.


Copyright © 2019 by the Association of American Medical Colleges