
Perspective: Anticipating the Challenges of Reforming the United States Medical Licensing Examination

McMahon, Graham T. MD, MMSc; Tallia, Alfred F. MD, MPH

doi: 10.1097/ACM.0b013e3181ccbea8
High-Stakes Examinations

The practice of medicine is a shared social contract between the medical profession and the public. Assessments for licensure should reflect competencies that patients expect of their physicians and should be patient-centered and mirror the progressive nature of medical education. The National Board of Medical Examiners recently accepted the recommendations of the Committee to Review the United States Medical Licensing Examination Program to align the examination sequence with two patient-centered decision points: when a student enters into supervised graduate training, and when a physician receives initial licensure for unsupervised practice. The revised examination program would aim to evaluate for the presence of at least minimum proficiency in all competencies that are measurable in a valid, reliable manner at each decision point, including the scientific foundation of medical practice, the application of medical knowledge to patient care, and the clinical skills relevant to practice level, whether measured by standardized patient-based assessments or other formats. Students, educators, educational leaders, and program directors have raised legitimate concerns about the anticipated changes. The anticipated costs, the changes' effect on basic science education, their impact on dual-degree candidates and international medical graduates, and the utility of score reporting are each of concern. Anticipated benefits include a closer alignment of assessments with the expectations of patients and licensing authorities, closer integration of the sciences fundamental to medical practice throughout the examination sequence, and an increased breadth of competency assessment. The authors believe that the benefits to patients and the profession will outweigh the acknowledged challenges the changes will pose to medical education.

Dr. Tallia is professor and chair, Department of Family Medicine and Community Health, University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey.

Dr. McMahon is assistant professor of medicine, Harvard Medical School, Division of Endocrinology, Diabetes and Hypertension, Brigham and Women's Hospital, Boston, Massachusetts.

Correspondence should be addressed to Dr. Tallia, Department of Family Medicine and Community Health, University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, 1 RWJ Place MEB 288, New Brunswick, NJ 08903; telephone: (732) 235-6029; fax: (732) 246-8084; e-mail:

The practice of medicine represents a shared social contract between the profession and the public. If the public's trust in our profession's ability to self-regulate is to be maintained, society's priorities should be reflected in the assessments used to determine readiness to practice, and it is the profession's responsibility to ensure that these assessments are appropriate and fair. Ultimately, the medical profession and licensing authorities must be able to provide assurances to the public that physicians are competent to deliver the services the profession purports to provide.



Since 1992, the United States Medical Licensing Examination (USMLE) Steps have served as a common pathway for licensure for medical schools accredited by the Liaison Committee on Medical Education and for international medical graduates entering medical training in the United States. Computer-based test administration was initiated in 1999, and the Clinical Skills component of the Step 2 examination (Step 2 CS) was introduced in 2004. The overall three-step assessment framework of basic science, clinical knowledge, and clinical practice has, however, remained unchanged.

In contrast, relentless advances in knowledge keep clinical science and practice in a constant state of change. Clinicians are expected to remain current and to draw on these new fundamental insights in their approach to patients.1,2 As scientific controversies become increasingly public, graduating medical students will need greater skill in information retrieval and interpretation than ever before.

The public's expectation of physicians is also changing, and competence has come to mean a great deal more than knowledge alone. When serious problems emerge in clinical practice, difficulties involving physicians' communication skills and professionalism are often either causative or contributive.3,4 Patients recognize the importance of these skills and assume that licensure implies their physicians have had these skills rigorously assessed. Many educational leaders and organizations acknowledge the importance of these and other competencies to the profession but recognize that such competencies are more difficult to assess using standard tools.5 The USMLE program has lagged behind medical education and clinical practice in assessment of these broader competencies.

The current examination sequence has been criticized for constraining curricular innovation. Many residency programs have come to use Step 1 scores as a normative national measurement for candidate selection—a mismatch between use and intent.6 Consequently, many curricula segregate basic science knowledge in the first part of medical school, with minimal integration of this material into later years. Retention of basic science knowledge suffers, and this structural obstacle makes curricular experimentation difficult.7 Additionally, many students resent the apparent redundancy among the various school-based examinations and those required for licensure. Examinees are often the last to be consulted, yet they feel the greatest impact of changes in assessment strategies, costs, and timing.8 Broadening the scope of the examination sequence to be more patient-centered would be more straightforward if examinees weren't increasingly burdened by the high costs of education or if the costs could be shifted away from examinees.


Recommendations for Change

The Committee to Evaluate the USMLE Program (CEUP) was constituted by the USMLE Composite Committee in 2006 and comprised students, residents, clinicians, and members of the licensing, graduate, and undergraduate education communities.9 The goal of the committee was to review the results of an information-gathering process, determine whether the mission and purpose of USMLE were effectively and efficiently supported by the current design, structure, and format, and develop a series of recommendations for change.10 That committee recently issued its report, which was received and endorsed by the USMLE Composite Committee, the National Board of Medical Examiners (NBME), and the Federation of State Medical Boards (FSMB).

The report's recommendations, if implemented, would herald a substantial change in the organization of the assessment pathway for licensure. The committee's report calls for two patient-centered decision points, or gateways, with the first gateway sequence of assessments occurring as students prepare to enter supervised practice (i.e., residency), and the second as a physician prepares for full licensure and unsupervised independent practice. The report calls for an expansion in the breadth of the examination to include assessment of a broader array of physician competencies (acknowledging that valid and reliable new tools will need to be developed), reinforces that depth of scientific insight is necessary for the competent practice of medicine, and seeks to recognize the importance of skills in information retrieval and interpretation. The report stresses the importance of clinical skills and calls for an expansion of their assessment. CEUP recognized the considerable logistic and financial burdens associated with additional standardized patient examinations; more liberal use of audiovisual and other technologies is envisaged. The report does not specify the number or timing of the individual assessment components that will comprise each gateway. This will be the work of subsequent staff and volunteer design task forces.


Responses to the Proposed Changes

There has been considerable divergence in the responses to the proposed changes.11,12 Some contend that there is little evidence that change is needed, whereas others maintain that change is long overdue.11,12 Still others have suggested that skills such as professionalism and communication should remain within the sole purview of medical schools and residency programs.13,14

The incorporation of the fundamentals of science into examination sequences leading to the two decision points, the first toward the end of medical school, is a concern to both basic science and clinical educators. A common misconception is that the first decision point will be addressed by a single examination combining material from Step 1 and Step 2 Clinical Knowledge (CK). Rather, a new examination sequence is envisioned that will not only test basic and clinical science knowledge directly but also assess the ability to integrate and apply these knowledge sets.

The current Step 1 examination primarily tests the acquisition and application of basic science knowledge, and this knowledge declines rapidly over time.7 The use of Step 1 scores for residency selection motivates students to engage with the material.6 Some basic science educators worry that the absence of a high-stakes examination that focuses on their content area could reduce the engagement of their students or, worse, result in their departments or divisions being devalued and undermined, with resulting implications for staffing and revenue.11 These concerns are mitigated by recognizing the opportunities that the evolution of the USMLE will generate for basic science educators. The proposed changes emphasize incorporating content from the sciences fundamental to the practice of medicine in both gateway sequences. Because retention is enhanced by repetition,15 the demand for basic science education throughout the curriculum may actually increase. If needed for student or program evaluation, the “comprehensive basic science examination,” already offered by the NBME, can provide a valid and reliable assessment tool to flexibly assess the acquisition of basic science knowledge at any point in the curriculum.16

Candidates pursuing dual or advanced degrees could be disadvantaged by the gateway decision points if their curricula continue to separate basic and more clinical education in the traditional manner. However, a relative advantage could accrue if their curricula reinforce and develop students' critical thinking and integration skills.

Because the second gateway sequence will likely contain elements that assess fundamental scientific concepts and knowledge, residency program directors are concerned about how these materials can be introduced into a curriculum that is already packed with required clinical content. Though there has been extensive and encouraging experience with early integration of clinical activities into the medical school curriculum, much is yet to be understood about how to engineer reintegration of basic sciences into graduate education. To overcome this obstacle, programs that have more limited affiliations with a major university may need to capitalize on available staff within their teaching hospitals in departments and divisions such as pharmacy and psychology with whom they may have had limited curricular engagement in the past. University-affiliated programs will need to reach out to their respective medical schools to reestablish a continuum of medical education across the undergraduate-to-graduate divide.

A more complete range of competencies, such as those promulgated by the Accreditation Council for Graduate Medical Education,17 will expand the assessment of candidates beyond knowledge alone to include the skills, attitudes, and judgments essential to effective patient care. Competencies such as communication skills and professionalism require some knowledge but are largely unsuitable for testing in a multiple-choice examination. The USMLE Step 2 CS already assesses elements of communication and interpersonal behavior. The committee anticipates that candidates will be challenged by more complex and provocative encounters in the gateway examinations, so that communication skills can be assessed more carefully using enhancements to the current format. In contrast, the USMLE is not well placed to reliably assess professionalism or its elements such as reliability, honesty, integrity, maturity, respect for others, altruism, and absence of impairment.14 In particular, challenges to professionalism such as conflict of interest, abuse of power, lack of conscientiousness, and destructive arrogance are unsuitable for a periodic high-stakes examination and are more appropriately examined continuously or at the time of a critical incident. To date, attempts to assess ethical decision making in simulated clinical encounters have been limited by low reliability and by case specificity.18,19 Consequently, the reliable assessment of professionalism will likely necessitate the creation of a partnership between the USMLE and medical schools, hospitals, and state medical boards. The USMLE could develop and provide standard reporting and evaluation schemas; the completed evaluations could then be collected from the school or board and reflected in the USMLE examination report.


Secondary Functions of the Examination

Although the USMLE is first and foremost a licensure examination, CEUP recognized that the USMLE has important and valid secondary functions, including the use of its scores for applicant ranking. Students whose life circumstances interfere with their ability to perform at their best could be particularly disadvantaged if a single-scored gateway examination toward the end of medical school replaced the current Step 1, Step 2 Clinical Knowledge, and Step 2 CS examinations. However, the concept of a single gateway, as outlined by CEUP, does not imply a single testing occasion; students who perform suboptimally should be able to recover between tests.

Medical students in the United States generally express a preference for pass–fail score reporting, and avoidance of comparative grading has been associated with reduced stress and burnout in medical school.20 However, a valid and reliable score on a high-stakes summative assessment has important utility for many students and program directors.6 A score allows students to demonstrate their proficiency (or lack of proficiency) in the tested domain. This is a particularly important attribute if the student is foreign-trained or an osteopathic graduate, where a good score may allow the student to be more competitive than the reputation of his or her school might otherwise allow.21 The CEUP report makes no recommendation on score reporting for the gateway examination sequences, acknowledging the significant diversity of opinion on this issue and recognizing that further study is needed. Other committees at the NBME and the FSMB will explore appropriate scoring paradigms over the course of the next several years.

Though the Step scores currently provide a national comparative performance metric for medical students, excessive dependence on a basic science test score for selecting residency applicants seems inappropriate. The current Step 1 score can provide a reliable estimate of the applicant's basic science knowledge acquisition; however, the score has little predictive power for discipline-specific knowledge as measured by residency-based in-training examinations, and it provides no metric of the other competencies.22,23 If multiple competency-specific score metrics were provided by the new gateway assessments, different disciplines could weight them differently for a more nuanced approach to resident selection. Further development of the medical student performance evaluation and standardization of deans' letters may provide a more appropriate comparative tool for selecting the most competitive applicants.24


The Need for Ongoing Discussion

All parties, including the parent organizations of the USMLE, recognize the financial burdens already facing medical graduates. CEUP acknowledged that any new or additional assessment tools implied by the recommendations must be rigorous and should respect the balance between cost and value to the examinee and licensing authorities. In particular, the Step sequences are especially costly for international medical graduates, who must also travel for the clinical skills exam; these applicants, who make up a rising proportion of U.S. physicians, may be inordinately burdened by additional costs.25 Yet audio and video clips are accessible technologies that may be built into examination sequences to assess certain competencies, such as cultural competence and communication skills, without the need for large increases in examination costs.

As real and important as many of the expressed concerns about the proposed USMLE changes may be, we believe the benefits to patients and the profession derived from the recommendations far outweigh the challenges that the proposed changes will bring to medical education and test development. We believe that a redesigned examination series will allow the profession to realign its social contract with patients by explicitly promoting excellence in a broader array of competencies. Because assessment often drives learning and curricular innovation, the new examination may increase the attention given to professionalism, communication, self-audit, evidence-based practice, and resource utilization. We hope that our students will move beyond rote memorization to a learning strategy that facilitates and rewards reflection, critical thinking, and continuous self-improvement. Increased familiarity and flexibility with the fundamental sciences should enhance the ability of our future graduates to be competent problem-solvers for the next generation, preserving and strengthening lifelong learning in those sciences that are fundamental to the practice of medicine. Further integration of undergraduate and graduate education could reestablish a continuum of medical education, nurture collaborative efforts, and enhance retention through repetition. Medical school curricula would have greater flexibility because a single, high-stakes, midpoint assessment would have been removed. Full implementation of the complete set of recommendations is likely to be years away, and the medical education community should be a partner in the creation of new assessments to measure the full range of competencies essential to the practice of medicine.

Few worthwhile initiatives are easy, but the community of clinicians, academics, and their representative organizations will need to be convinced that change will be worthwhile for the initiative to be successful. Change may be liberating to some and constraining to others; only through active ongoing discussion and engagement will consensus emerge. The social contract between the public and profession deserves no less.


Other disclosures:

Drs. McMahon and Tallia served, respectively, as member and chair of CEUP.


Ethical approval:

Not applicable.



The opinions expressed are the authors' alone and do not necessarily represent those of the United States Medical Licensing Examination, National Board of Medical Examiners, Federation of State Medical Boards, or Education Commission for Foreign Medical Graduates.



1. Glasziou P, Haynes B. The paths from research to improved health outcomes. ACP J Club. March–April 2005;142:A8–A10.
2. Hunt RE, Newman RG. Medical knowledge overload: A disturbing trend for physicians. Health Care Manage Rev. Winter 1997;22:70–75.
3. Papadakis MA, Teherani A, Banach MA, et al. Disciplinary action by medical boards and prior behavior in medical school. N Engl J Med. 2005;353:2673–2682.
4. Papadakis MA, Arnold GK, Blank LL, Holmboe ES, Lipner RS. Performance during internal medicine residency training and subsequent disciplinary action by state licensing boards. Ann Intern Med. 2008;148:869–876.
5. Leach DC. The future of accreditation of medical education. Am J Med. 2004;116:859–861.
6. Green M, Jones P, Thomas JX Jr. Selection criteria for residency: Results of a national program directors survey. Acad Med. 2009;84:362–367.
7. Ling Y, Swanson DB, Holtzman K, Bucak SD. Retention of basic science information by senior medical students. Acad Med. 2008;83(10 suppl):S82–S85.
8. Papadakis MA. The Step 2 clinical-skills examination. N Engl J Med. 2004;350:1703–1705.
9. Scoles PV. Comprehensive review of the USMLE. Adv Physiol Educ. 2008;32:109–110.
10. Committee to Evaluate the USMLE Program. Comprehensive review of USMLE. Available at: Accessed November 20, 2009.
11. Baptista N. Alliance groups respond to proposed changes to USMLE. Acad Intern Med Insight. 2008;6(2):10.
12. Battinelli DL, Smith L. Correspondence: “Alliance groups respond to proposed changes to USMLE.” Acad Intern Med Insight. 2008;6(4):5.
13. Hawkins RE, Katsufrakis PJ, Holtman MC, Clauser BE. Assessment of medical professionalism: Who, what, when, where, how, and … why? Med Teach. 2009;31:385–398.
14. Arnold L. Assessing professional behavior: Yesterday, today, and tomorrow. Acad Med. 2002;77:502–515.
15. Linton M. Real world memory after six years: An in vivo study of very long term memory. In: Gruneberg M, Morris P, Sykes R, eds. Practical Aspects of Memory. London, UK: Academic Press; 1978:69–76.
16. National Board of Medical Examiners. Comprehensive Basic Science Examination. Available at: Accessed November 20, 2009.
17. Accreditation Council for Graduate Medical Education. Common program requirements: General competencies. Available at: Accessed November 20, 2009.
18. Prislin MD, Lie D, Shapiro J, Boker J, Radecki S. Using standardized patients to assess medical students' professionalism. Acad Med. 2001;76(10 suppl):S90–S92.
19. Singer PA, Robb A, Cohen R, Norman G, Turnbull J. Performance-based assessment of clinical ethics using an objective structured clinical examination. Acad Med. 1996;71:495–498.
20. Rohe DE, Barrier PA, Clark MM, Cook DA, Vickers KS, Decker PA. The benefits of pass–fail grading on stress, mood, and group cohesion in medical students. Mayo Clin Proc. 2006;81:1443–1448.
21. Edmond MB, Deschenes JL, Eckler M, Wenzel RP. Racial bias in using USMLE Step 1 scores to grant internal medicine residency interviews. Acad Med. 2001;76:1253–1256.
22. Perez JA Jr, Greer S. Correlation of United States Medical Licensing Examination and Internal Medicine In-Training Examination performance. Adv Health Sci Educ Theory Pract. 2009;14:753–758.
23. Thundiyil JG, Modica RF, Silvestri S, Papa L. Do United States Medical Licensing Examination (USMLE) scores predict in-training test performance for emergency medicine residents? J Emerg Med. October 22, 2008. [Epub ahead of print.]
24. Shea JA, O'Grady E, Morrison G, Wagner BR, Morris JB. Medical Student Performance Evaluations in 2005: An improvement over the former dean's letter? Acad Med. 2008;83:284–291.
25. Brotherton SE, Etzel SI. Graduate medical education, 2007–2008. JAMA. 2008;300:1228–1243.
© 2010 Association of American Medical Colleges