Enhancement of the Assessment of Physician–Patient Communication Skills in the United States Medical Licensing Examination

Hoppe, Ruth B. MD; King, Ann M.; Mazor, Kathleen M. EdD; Furman, Gail E. PhD; Wick-Garcia, Penelope; Corcoran–Ponisciak, Heather PhD; Katsufrakis, Peter J. MD, MBA

doi: 10.1097/ACM.0b013e3182a7f75a


As part of a strategic planning initiative, the National Board of Medical Examiners (NBME) recently reviewed all components of the United States Medical Licensing Examination (USMLE). The USMLE, sponsored by the Federation of State Medical Boards and the NBME, is a three-step examination for medical licensure of U.S. physicians trained at MD-granting institutions here or abroad. One recommendation generated from this comprehensive review was to enhance the communication skills component of the Step 2 Clinical Skills (Step 2 CS) Examination.1 The Step 2 CS exam assesses communication skills, clinical problem-solving skills, and spoken English proficiency. It uses simulated medical encounters and standardized patients (SPs), who both portray the case and record the examinee’s performance. This Step 2 CS component, added to the USMLE sequence in 2004, was a manifestation of a long-standing commitment to assess performance of basic clinical skills.

To address the request for enhancements of the Step 2 CS exam, the NBME created a multidisciplinary team that comprised experts in communication content, communication measurement, and implementation of SP-based examinations. Four of our team members (A.K., G.F., P.W., and H.C.) are NBME employees and are routinely involved in processes involving the Step 2 CS exam (test development, SP training, measurement, and quality assurance activities). Therefore, we bring considerable working knowledge and experience to the tasks of review. Two external consultants (R.H. and K.M.) with backgrounds in physician–patient communication and measurement completed our team.

In 2007, we began by reviewing literature published since 2001 in physician–patient communication and patient-centered communication. We followed this with focus group discussions at several national meetings of communication skills teachers and researchers. We also held meetings with 14 teachers and scholars in the field of physician–patient communication to explore and update the concepts that had emerged in the literature review.

We also used parts of a survey of first-year postgraduate trainees.2 This paper-and-pencil survey, developed by the NBME as part of a review of the USMLE to identify new residents’ clinical responsibilities, was distributed to 8,793 first-year residents in September 2009, in the early months of their training. This yielded 2,523 surveys for analysis. The survey asked the residents about the frequency of and level of supervision for various clinical activities, including patient care, system-based practice, professionalism, and—of interest to us—communication skills.

Concurrently, we examined performance characteristics of the Step 2 CS exam, observed case development and quality assurance processes, interviewed SPs and their trainers, and reviewed over 150 video recordings, randomly selected, of examinee–SP interactions administered from 2006 through 2010. This review was exempt from institutional review board inspection because the examinees, when registering for the Step 2 CS exam, gave written permission for their recordings to be used, and the videos contained no individual examinee identification. We did not review videos of examinees who did not give permission (less than 1% of all examinees).

Following this multifaceted review, we explored options for enhancing each of the key elements of the assessment of communication skills within the Step 2 CS exam: the construct, the cases, the instrument, and the SPs. In this article, we share the perspectives we gained and outline the resulting enhancements, some of which were rolled out in June 2012 and others of which will be rolled out in the near future.

Our review produced observations in four categories: (1) defining what is to be assessed (construct), (2) producing the array of cases and examinee tasks (stimulus), (3) capturing the examinee’s performance (instrument), and (4) managing the SPs, who both portray the cases and record the examinee’s performance (SP management). These observations, in turn, stimulated recommendations for enhancements. We discuss the observations and recommendations for enhancements within each of these categories below.

Construct: What Is Being Assessed

Evolution of our understanding

Research strongly supports the importance of communication skills within medical encounters; a growing body of evidence links good communication to desirable clinical outcomes, including adherence, recall, some biologic markers (such as blood pressure), and increased illness-coping skills.3–7 As the importance of physician–patient communication has grown clearer, so too have calls for more patient-centered communication in medical encounters. In 2001 the Institute of Medicine recommended that care become responsive to patients’ needs and perspectives and that patients’ values guide decision making.8 Although definitions of patient-centered communication vary, there are core components: eliciting and understanding the patient’s perspective (concerns, ideas, expectations, needs, feelings, and functioning), understanding the patient within his or her unique psychosocial context, and reaching a shared understanding of the problem and its treatment that is concordant with the patient’s values.9

These components have stimulated the development of several conceptual communication frameworks that significantly expand the communication tasks to be accomplished within medical encounters.10–14 Whereas physicians previously focused primarily on gathering information, they must now also build relationships, provide information, manage emotions, share the making of decisions, and encourage patients to change behaviors.

NBME findings

Data collected in the 2009 survey of first-year postgraduate trainees suggest that this expanded list of essential communication tasks applies to residents soon after they begin their training.2 In general, the survey findings indicated that, unlike the supervised communication they had practiced in medical school, new first-year postgraduate trainees frequently conducted a wide range of communication tasks without direct supervision, some of which they found particularly stressful and challenging, such as delivering bad news and managing hostile patients.

Our review of video recordings from the Step 2 CS exam revealed that examinees tended to focus on gathering information. With some clear exceptions, examinees used highly doctor-centered techniques, extracting medical data from patients with multiple, closed-ended questions, apparently neglecting relationship building and other more empathic and patient-centered behaviors. They rarely used patient-centered techniques to elicit the patient’s experience of illness, understand the patient as a whole person, or establish common ground.

One explanation for the highly interrogatory behavior may be that examinees focus on the other assessment construct embedded in the Step 2 CS exam, that of testing their diagnostic and clinical reasoning skills. Examinees may not adequately perceive or respond to the fact that these simulated encounters have two crucial, simultaneous, and mutually reinforcing purposes: to assess the ability both to solve clinical problems and to demonstrate understanding of and support for the patient.

Implications for assessment

In response to the literature review, the survey, and the video observations, we adopted a six-function model as the foundation for an enhanced construct (see Table 1). This model represents a synthesis and modification of the core elements of current dominant models.10,12 For each function, we identified behaviors that are patient-centered and supported by evidence of effectiveness. By focusing on behaviors that seemed most relevant for first-year postgraduate trainees working under supervision, we divided the model into basic communication behaviors, included in the June 2012 rollout, and more advanced behaviors, which will emerge in later USMLE enhancements. Information about the basic behaviors is available in Table 1 and on the USMLE Web site.15

Stimulus: Case Development and Other Components

Not all cases are alike

The assessment stimulus includes the case underlying the clinical encounter, the SP’s portrayal of that case, and the instructions provided to the examinee regarding his or her task or role. In our review of how cases are developed, we noted that biomedical detail dominated both the process and the resulting case. The implicit assumption seemed to be that cases that work well as clinical problem-solving stimuli also successfully stimulate communication tasks. Yet, although cases developed this way do contain some psychosocial and emotional details, they often lack enough detail to elicit the desired set of communication skills. For example, a case might describe a patient with seven children ranging in age from 6 months to 10 years. However, the effect of the child care burden on the patient’s presenting problem would typically not be specified in the case materials.

A related observation was that some cases lend themselves to the evaluation of numerous communication functions, whereas other cases do not. For example, a clinical scenario with many diagnostic possibilities and uncertain resolution (e.g., fatigue) may stimulate information gathering, but leave little time to provide information. From these observations, we concluded that the cases needed to systematically sample not only the medical content but also the communication content.

Balancing standardization and authenticity

Once the case is developed, SPs are trained to portray it during the examination. For high-stakes assessments, the portrayals must be sufficiently standardized so that each examinee faces similar challenges. But the video recordings we reviewed of Step 2 CS encounters suggested that too much focus on standardization can result in stilted SP behaviors and reduced realism. This tension is unavoidable, but striking the right balance between standardization and flexibility improves the case stimulus while ensuring a fair, consistent assessment.

To bring the sampling of communication tasks more in line with existing “blueprinting” principles for USMLE development, we recommended a more systematic approach to sampling a wider range of communication behaviors embedded within the construct. In this way, each case now subtly emphasizes different communication tasks. This does not mean one communication function per case, but it does imply that the test as a whole must generate stimuli for all key functions. This “blueprinting” conforms to principles of good test construction16 and should provide a richer stimulus for each communication function.
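As a rough illustration of the blueprinting idea described above, the following Python sketch checks whether a set of cases, taken together, provides a stimulus for every communication function. The function labels paraphrase the communication tasks listed earlier in this article rather than the wording of Table 1, and the case names and the functions assigned to them are hypothetical. None of this reflects actual Step 2 CS case content or NBME tooling.

```python
# Hypothetical labels for the six communication functions
# (paraphrased for illustration; not the official Table 1 wording).
FUNCTIONS = {
    "gather_information",
    "build_relationship",
    "provide_information",
    "manage_emotions",
    "share_decisions",
    "support_behavior_change",
}


def uncovered_functions(blueprint):
    """Return the set of functions that no case in the blueprint emphasizes.

    `blueprint` maps a case name to the set of functions that case emphasizes.
    A case may emphasize several functions; coverage is judged across the
    whole test form, not per case.
    """
    covered = set()
    for emphasized in blueprint.values():
        unknown = set(emphasized) - FUNCTIONS
        if unknown:
            raise ValueError(f"unknown function(s): {sorted(unknown)}")
        covered |= set(emphasized)
    return FUNCTIONS - covered


# Hypothetical test form with three invented cases.
example_blueprint = {
    "fatigue": {"gather_information", "build_relationship"},
    "new_diagnosis": {"provide_information", "manage_emotions"},
    "smoking_cessation": {"support_behavior_change", "share_decisions"},
}

missing = uncovered_functions(example_blueprint)  # empty set: all functions sampled
```

If a case were dropped from this hypothetical form, the returned set would name the functions left without a stimulus, which is the gap a blueprint is meant to expose.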

We also recommended enriching the psychosocial detail within the cases to vary the contexts for each case. Weiner and colleagues17 have noted that contextual clues to issues in the patient’s life narrative can influence the clinical presentation or management. Such clues require skilled and empathic pursuit by the clinician and, in a testing venue, may help distinguish between highly and less skilled examinees. Thus, SPs are now trained to vary their portrayals in the new Step 2 CS cases, but in a standardized manner, based on the examinee’s use or neglect of desired communication skills. SPs simultaneously maintain standardization with regard to the biomedical detail. And finally, to strengthen the link between the assessment of communication skills and the tasks actually required of first-year postgraduate trainees, case developers are encouraged to focus on tasks likely to be required in graduate medical education settings.

Clarifying expectations for examinees

For the assessment stimulus to succeed, examinees need to understand what role they are to assume within the simulated encounter. We observed that many examinees take limited responsibility for the patient, truncate the expository phase of their encounters by deferring to the attending physician, and end the encounter prematurely. This makes it difficult to assess certain communication skills, such as providing information and making shared decisions.

Although deferring to more advanced physicians might be appropriate behavior for medical students, residents report having to communicate often with patients before consulting with senior faculty.2 The exam instructions must therefore help examinees understand exactly what they are expected to do in the simulation; otherwise, they may not demonstrate skills that they actually possess. Toward this end, we generated more specific detail in various orientation materials given to examinees and also available on the USMLE Web site.15 For example, examinees are told to be “responsive to the patient’s needs” and to “not defer decision making to others.”

Instrument: Capturing the Examinee’s Performance


The results of an SP encounter are recorded by the SP on what is called the instrument. The instrument contains descriptions of specific behaviors that are intended to be measured in the encounter. The SP completes the instrument immediately following each encounter.

After reviewing the communication skills assessment instrument that SPs used to record examinees’ communication behaviors prior to June 2012, we determined that it, too, required modifications if it was to assess the communication skills defined in the enhanced model. Our goal was to ensure that each item on the assessment instrument could be directly linked to the six functions via the specific elements listed under each function (see Table 1). Another equally important goal was to ensure that all examinees’ performances could be captured consistently, accurately, and fairly, without the incursion of irrelevant factors such as gender, race, location of the examination, or an SP’s idiosyncrasies. This meant developing clear, objective criteria for determining, in a highly reproducible manner, whether examinees displayed each communication skill during their encounters. We defined communication skills dichotomously so that SPs, rather than having to make subjective judgments about the quality of an examinee’s communication skills, could simply document their occurrence or nonoccurrence. This is similar to how history and physical examination checklists are used in SP examinations. SPs are not expected to judge which clinical data should be sought; rather, the checklist is specific enough to observe which clinical data are actually sought (for example, “The examinee asked me if I was short of breath”).
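To make the dichotomous approach concrete, the following Python sketch shows one way occurrence or nonoccurrence could be recorded for items that are each linked to a communication function. The item wording, the function labels, and the per-function tally are all hypothetical and chosen only for illustration; they are not items from the Step 2 CS instrument, and the tally is not a scoring rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    function: str  # communication function the item is linked to (hypothetical label)
    behavior: str  # observable behavior, worded so it either occurred or did not


def tally_by_function(items, observed):
    """Return {function: (number of items observed, number of items)}.

    `observed` maps each behavior to True (occurred) or False (did not occur).
    A missing entry raises KeyError, so every item must be recorded explicitly.
    """
    tally = {}
    for item in items:
        seen, total = tally.get(item.function, (0, 0))
        tally[item.function] = (seen + int(observed[item.behavior]), total + 1)
    return tally


# Hypothetical items; not taken from the actual instrument.
items = [
    ChecklistItem("build_relationship", "introduced self by name"),
    ChecklistItem("build_relationship", "asked how the patient prefers to be addressed"),
    ChecklistItem("provide_information", "explained the likely next steps"),
]

# What a hypothetical SP recorded immediately after one encounter.
record = {
    "introduced self by name": True,
    "asked how the patient prefers to be addressed": False,
    "explained the likely next steps": True,
}

summary = tally_by_function(items, record)
```

The point of the design is that the SP supplies only true or false for each behavior; any judgment about quality is left out of the record entirely.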

Following this approach, we developed a new assessment instrument strongly linked to the six modified communication functions. We developed operational definitions of behaviors related to each function, with specific criteria, rules, and examples that would minimize the need for SPs to make subjective judgments. Defining recognizable signs of a complex behavior is challenging and involves a tension between the subjective and the specific. For example, “demonstrates respect for the patient” defines an important behavior, but would leave the SP to judge on his or her own which behaviors are respectful. On the other hand, “drapes the patient” operationalizes one way in which the examinee may demonstrate respect, and is easy for the SPs to record, but it runs the risk of rewarding a rote behavior rather than assessing whether the examinee is actually attending to the patient’s comfort and privacy.

We have put the new instrument through multiple cycles of development, testing, and revision. Video recordings of actual examinee behavior have allowed us to test whether the coding criteria are appropriate and sufficient. We have found that not all behaviors called for by experts can be operationalized sufficiently to make them observable; where SPs cannot accurately and consistently record a behavior, we have not included it in the assessment. SP-based examinations are limited in what they can measure; some communication concepts require the evaluative expertise of trained faculty.

SP Management

SP management begins with the selection of SPs, continues with their training, and includes the ongoing monitoring of their performance.


SPs have many responsibilities. First, they must portray key aspects of the case with exactitude, integrating verbal and nonverbal behaviors consistently over several encounters. Second, they must recall and record a variety of examinee behaviors at the end of each encounter. Those individuals responsible for recruiting and selecting SPs must ensure that candidates have sufficient portrayal, observational, and recall skills to perform these tasks. Where possible, memory aids such as instant video review need to be provided.


We knew that, before instituting a new communication assessment, all SPs would need to be retrained, a step just as important as developing the case or instrument. We therefore decided, after thoroughly reviewing the training and quality assurance processes associated with the Step 2 CS exam, to expand the SPs’ training to include self-directed learning that focused on the instrument’s new items. The SPs found video examples of positive and negative communication behaviors particularly helpful. Throughout, we emphasized that they should observe examinees’ behaviors rather than judge their communication skills. At the end of the training, each SP was required to accurately recall at least a specified number of instrument items. SPs who were unable to achieve this target level were not permitted to participate in the exam.


Our team includes individuals who train and manage SPs for Step 2 CS exams across the five national sites. In their experience, despite substantial time devoted to training and monitoring SPs, some unwanted variability remained in SPs’ portrayals and ratings. To reduce this all-too-human problem of drift in SP ratings, we incorporated even more frequent reinforcements, linked to the ongoing, routine assessment of each SP’s ability to use the scale accurately. We will need to monitor the result of these reinforcements following the rollout of the enhancements, but, just as with other examinations where reliability is critical, ongoing quality assurance steps will likely help.

Implications and Conclusion

We have documented the evolution of the USMLE’s assessment of communication skills (see Table 2). As of June 2012, we have enhanced the assessment of basic communication skills and have begun to enhance the assessment of more advanced communication skills (see Table 1). We are mindful that, as core competencies in the field of physician–patient communication evolve, further refinement of these assessments will be required. We believe that any system of high-stakes assessment should be periodically reviewed. Further, many of the enhancements noted above apply to lower-stakes examinations and, as such, could serve as a guide for intramural assessments of communication skills.

Our review, beyond leading to enhancements to the USMLE Step 2 CS exam, also has implications for medical education. We have speculated that trainees overly focus on data gathering and problem solving. Why is this? Perhaps their documented drop-off in empathy and communication skills is due to role modeling; they may be copying what they see during their clerkships.18,19 If this is true, a mismatch exists between the currently recommended assessment construct for communication skills and the construct inferred by examinees from their clinical experiences.

This article has not addressed the all-important area of standard setting and scoring. Our goal for standard setting and scoring, as always, will be for the test to lead to the “right” decision: Examinees who are at least minimally competent pass, whereas those who are not do not. But considering this issue in the context of the described advances in assessing communication skills raises a series of important questions. What do we mean by “minimally competent”? Where should the bar be set? Are the physician–patient communication skills called for by communication experts and patient groups different from the communication-related performances deemed minimally acceptable by medical educators? How should such differences influence the standards for the communication component of the Step 2 CS exam? What impact might a performance score rather than a pass/fail determination have? And finally, how can the NBME best provide feedback to educators and examinees to enhance the teaching of communication skills? These and many related questions will need to be addressed.

Many experts have noted an evolution over the past decade or two in the nature of the physician–patient relationship toward a patient-centered model of more mutual information exchange and shared decision making. Patients have expressed the need for more information and greater participation in the decision-making process.20,21 It could be that our training of communication skills has not sufficiently incorporated or reinforced a truly patient-centered approach to communication.

In 2011, Levinson and Pizzo22 stated that “medical education at the student and residency levels requires major efforts to increase the teaching of communication skills.” We agree. We believe that more attention to teaching communication skills is needed throughout medical education, including graduate training. And, given the relationship between instruction and assessment, we also need intra- and extramural systems that can provide high-quality feedback and generate rigorous, up-to-date, high-stakes assessment at key milestones. This effort will certainly require better curricula and better assessments. It will also require that researchers, educators, and assessment experts together recognize and respond to the fact that communicating with patients is much more complex, difficult, and important than initially thought.23 Collectively, we should produce newly licensed physicians who are solidly competent to communicate in ways that patients desire, expect, and deserve.


© 2013 by the Association of American Medical Colleges