Letters to the Editor

Toward Thoughtful Use of Shelf Exam Scores in Clerkship Assessment Systems

Mattson, Christopher MD; Park, Yoon Soo PhD

doi: 10.1097/ACM.0000000000003603

To the Editor:

We agree with Dr. Schilling’s suggestion¹ to proceed with caution when considering increasing the weight of the National Board of Medical Examiners subject (shelf) exam to circumvent the conjunctive use of an honors-eligibility cut score. Applying excessive weight to the shelf exam threatens the validity of the assessment system due to construct underrepresentation.

That is, we could view the construct of interest in undergraduate clerkship education to be the characteristics of an entrustable professional. If we view clerkship grades as a way of tracking learner progress toward the 13 core entrustable professional activities (EPAs) for entering residency, we would contend that the shelf exam assesses at most 2 EPAs: “Prioritize a differential diagnosis following a clinical encounter” and “Recommend and interpret common diagnostic and screening tests.”² Other forms of assessment targeting the remaining 11 EPAs are needed to ensure complete construct sampling. This is part of why performance-based assessments, such as objective structured clinical examinations, and workplace-based assessments, including clinical performance evaluations, have value in this context.

Further, Dr. Schilling states that a combined conjunctive and compensatory scoring system leads to “unreasonable outcomes”¹ that make the assessment system unreliable. Medical education studies have borne this out, showing that conjunctive scoring leads to lower reliability of assessment.³ This is consistent with findings from the psychometrics and measurement literature.⁴ In an assessment system charged with making high-stakes decisions, reliability is of paramount importance. Educators must continue to circle back to this foundational idea throughout the development, implementation, and use of assessment systems.
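
To make the distinction between the two scoring rules concrete, the following is a minimal illustrative sketch, not a description of any actual clerkship's grading scheme. The weights, cut scores, student scores, and function names are all hypothetical. It shows how adding a conjunctive shelf cut score on top of a compensatory composite can deny honors to a student whose composite is higher than that of a student who receives honors.

```python
from dataclasses import dataclass

# All weights, cut scores, and student scores are hypothetical,
# chosen only to illustrate the two decision rules.
SHELF_WEIGHT = 0.30           # assumed weight of the shelf exam
CLINICAL_WEIGHT = 0.70        # assumed weight of clinical evaluations
HONORS_COMPOSITE_CUT = 90.0   # assumed composite score needed for honors
HONORS_SHELF_CUT = 80.0       # assumed shelf cut score for honors eligibility


@dataclass
class Student:
    name: str
    shelf: float      # shelf exam score, 0-100
    clinical: float   # clinical performance score, 0-100


def composite(s: Student) -> float:
    """Compensatory composite: a strong component can offset a weak one."""
    return SHELF_WEIGHT * s.shelf + CLINICAL_WEIGHT * s.clinical


def honors_compensatory(s: Student) -> bool:
    """Honors decided by the weighted composite alone."""
    return composite(s) >= HONORS_COMPOSITE_CUT


def honors_combined(s: Student) -> bool:
    """Honors requires the composite cut AND a separate shelf cut (conjunctive)."""
    return s.shelf >= HONORS_SHELF_CUT and composite(s) >= HONORS_COMPOSITE_CUT


students = [
    Student("A", shelf=79, clinical=98),
    Student("B", shelf=81, clinical=94),
]

for s in students:
    print(
        f"Student {s.name}: composite={composite(s):.1f}, "
        f"compensatory honors={honors_compensatory(s)}, "
        f"combined honors={honors_combined(s)}"
    )
```

Under these assumed numbers, Student A has the higher composite but misses honors under the combined rule because of a one-point shortfall on the shelf exam, while Student B receives honors under both rules. The final decision for Student A rests on a single component score, which is one way to see why a conjunctive rule can lower reliability.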

Christopher Mattson, MD
Fourth-year resident, Department of Pediatrics—Medical Education Scholarship Track, University of Chicago Medical Center, Chicago, Illinois
Yoon Soo Park, PhD
Director of health professions education research, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts


1. Schilling DC. Using the clerkship shelf exam score as a qualification for an overall clerkship grade of honors: A valid practice or unfair to students? Acad Med. 2019;94:328–332.
2. Englander R, Flynn T, Call S, et al. Toward defining the foundation of the MD degree: Core entrustable professional activities for entering residency. Acad Med. 2016;91:1352–1358.
3. Onishi H, Park YS, Takayanagi R, Fujinuma Y. Combining scores based on compensatory and noncompensatory scoring rules to assess resident readiness for unsupervised practice: Implications from a national primary care certification examination in Japan. Acad Med. 2018;93(11 suppl):S45–S51.
4. Hambleton RK, Slater SC. Reliability of credentialing examinations and the impact of scoring models and standard-setting policies. Appl Meas Educ. 1997;10:19–38.
Copyright © 2020 by the Association of American Medical Colleges