A physician’s communication and interpersonal skills, including the ability to gather information effectively, counsel appropriately, and establish caring relationships with patients, are an integral part of overall clinical competence. Patients are more likely to share pertinent information and adhere to treatment when they feel that their doctors engage them in a dialog regarding their health care options, explicitly address their expectations and concerns, and treat them with courtesy and respect.1–3 Conversely, patients of physicians who fail to communicate effectively are more likely to lodge formal complaints, initiate malpractice claims, and be less satisfied with their health care experiences.4,5
Given the importance of communication and interpersonal skills for effective patient–physician relationships and the variability of medical training worldwide, obtaining a defensible measure of an international physician’s clinical skills is considered an important part of ensuring competency to practice medicine in an American medical setting. The Educational Commission for Foreign Medical Graduates is responsible for certifying graduates of international medical schools (IMGs) who wish to pursue graduate medical education in the United States. One requirement for certification is a passing score on the United States Medical Licensing Examination (USMLE) Step 2 Clinical Skills (CS) exam. The CS is a performance-based assessment where standardized patients (SPs) provide independent evaluations of a physician examinee’s communication and interpersonal skills (CIS). Obtaining a meaningful measure of communication skills is especially important because IMGs come from a wide variety of cultural backgrounds and have variable amounts of experience with patient care. Also, because of the lack of uniformity in global medical teaching and clinical care, IMGs, at least until they are fully acculturated into the American medical system, may interact with patients in ways that are not consistent with the norm. Whereas past research has examined IMGs’ communication and interpersonal skills as part of the former Clinical Skills Assessment,6–8 and overall performance in the currently required CS exam,9,10 this study focused on investigating IMGs’ performance on subcomponents of this measure. In addition, evidence to support the validity of the ratings was gathered by exploring differences in skills by examinee demographic variables.
The purpose of this investigation is to provide evidence to support the use of CIS ratings provided by SPs as part of the USMLE Step 2 CS assessment.
USMLE Step 2 CS
As part of the CS, physicians interact with 12 SPs as they would with actual patients, gathering and sharing relevant information, performing a focused physical examination, and writing up their findings in clinical notes. Examinees have 15 minutes to assess each of the 12 SPs, and they have 10 minutes after each encounter to write up their findings. The exam is divided into three conjunctive scoring components: the integrated clinical encounter (ICE), CIS, and spoken English proficiency (SEP). The ICE portion comprises data-gathering skills (history taking, physical examination), scored by SPs who complete case-specific checklists, and the written summarization of patient findings, scored globally by trained physician raters. The CIS and SEP components are scored globally by the SPs. To pass the CS, examinees are required to achieve passing scores on the ICE, CIS, and SEP portions of the exam in the same administration.11 Relationships between these various test components have been described elsewhere.9,10
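The conjunctive pass rule described above can be sketched in a few lines. This is a minimal illustration only: the class and function names are hypothetical, and the component pass/fail outcomes are taken as given because the cut scores are not reported here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentResults:
    """Pass/fail outcome on each CS component within ONE administration.

    Field names are illustrative assumptions, not official USMLE terms.
    """
    ice_passed: bool  # integrated clinical encounter
    cis_passed: bool  # communication and interpersonal skills
    sep_passed: bool  # spoken English proficiency


def passes_cs(results: ComponentResults) -> bool:
    """Conjunctive rule: all three components must be passed together."""
    return results.ice_passed and results.cis_passed and results.sep_passed


# A strong ICE and SEP performance does not compensate for a failed CIS.
print(passes_cs(ComponentResults(True, True, True)))
print(passes_cs(ComponentResults(True, False, True)))
```

Under a conjunctive rule, strength in one component cannot offset a failure in another, which is why each component needs its own defensible score.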
As part of the CS, examinees are instructed to establish rapport with the patients, elicit pertinent historical information from them, perform focused physical examinations, answer questions, and provide counseling when appropriate. The SPs provide ratings of examinees’ communication and interpersonal skills along three dimensions: questioning skills, information-sharing skills, and professional manner and rapport. These ratings are provided on a scale from 1 (needs significant improvement) to 9 (very good).
The first CIS dimension, questioning skills, includes assessment of various abilities, such as an examinee’s use of open-ended questions, transitional statements, facilitating remarks, and summarization, and avoidance of leading or multiple questions, medical terms/jargon unless immediately defined, and interruptions when the patient is talking.
For the second dimension, examples of information-sharing skills include an evaluation of examinees’ ability to acknowledge patient issues and concerns and clearly respond with information. Examinees are also evaluated on their ability to provide counseling when appropriate and to close the encounter with statements about what happens next.
The final subcomponent, professional manner and rapport, includes criteria such as asking about the patient’s expectations, feelings, concerns, support systems, and impact of illness, with attempts to explore these areas. Other examples of rated behavior include the examinee’s ability to provide opportunity for the patient to express feelings and concerns, encourage additional questions or discussion, make empathetic remarks, and show consideration for patient comfort during the physical examination.11
None of the three CIS dimension ratings, taken separately, are considered to be sufficient for high-stakes decisions about individual examinees or about the quality of educational programs; the overall CIS score, which is used for pass/fail decisions, is based on a combination of these ratings. However, the individual dimension ratings do allow for a very broad examination of general trends and relationships within the subject examinee population.
In 2006, 12,863 IMGs took the CS for the first time. English as a native language was reported by 25.7% of these examinees, and 41.6% were female. Approximately 20% were U.S. citizens at entry to medical school. The mean age of first-time test takers was 30.8 (SD = 6.4) years. More than 400 individual SPs provided CIS ratings during the study period, of whom 55% were female and 45% were male. In total, the scores from 154,266 simulated clinical encounters were analyzed.
Descriptive statistics were used to summarize performance on the three CIS dimensions, and overall. To contrast the performance of select examinee cohorts, mean CIS scores, by examinee characteristics (gender and native language), were calculated. On the basis of past research,12 we hypothesized that female physicians and native English speakers would demonstrate greater CIS abilities. An analysis of covariance (ANCOVA) was used to detect a potential interaction between the gender of the SP and the gender of the examinee. To control for potential differences in examinee ability, SEP and data-gathering scores were used as covariates.
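One conventional way to write a two-factor ANCOVA of this kind is shown below. The model equation is not reported in the text, so the form and all symbols are assumptions offered only to make the structure of the analysis concrete.

```latex
\[
Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}
        + \gamma_1\,\mathrm{SEP}_{ijk} + \gamma_2\,\mathrm{DG}_{ijk}
        + \varepsilon_{ijk}
\]
```

Here $Y_{ijk}$ is the CIS rating for encounter $k$ involving an examinee of gender $i$ and an SP of gender $j$; $\alpha_i$ and $\beta_j$ are the main effects of examinee and SP gender; $(\alpha\beta)_{ij}$ is their interaction; and $\gamma_1$ and $\gamma_2$ are the slopes for the SEP and data-gathering (DG) covariates. The interaction test corresponds to the null hypothesis that every $(\alpha\beta)_{ij}$ equals zero.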
Correlation coefficients were calculated to determine the strength of the linear relationship between the separate CIS dimensions and two other CS exam components: SEP ratings and data-gathering checklist scores. Finally, examinee age was correlated with the three CIS dimensions, and overall.
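The linear relationships described above are conventionally quantified with the Pearson product-moment coefficient. The text does not name the coefficient used, so the sketch below assumes Pearson's r; the scores in the usage example are made-up illustrative values, not study data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-examinee scores (NOT data from the study).
cis_scores = [6.1, 6.4, 5.9, 6.8, 6.3]
sep_scores = [7.0, 7.4, 6.8, 8.1, 7.2]
print(f"r = {pearson_r(cis_scores, sep_scores):.2f}")
```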
The mean CIS ratings by dimension, gender, and native language are shown in Table 1. International medical graduate examinees received an average CIS score, across dimensions, of 6.37. They had the most difficulty with the information-sharing skills item, receiving an average rating of 6.26 (SD = 0.59). The ANCOVA did not reveal a significant interaction between examinee and SP gender (F(3, 154256) = 1.68; P = .17). The lack of a significant interaction indicates that CIS ratings do not vary as a function of the combination of SP and examinee gender. Female examinees (mean total CIS = 6.46; SD = 0.43), on average, outperformed males (mean total CIS = 6.30; SD = 0.48). This difference was statistically significant (F(1, 154256) = 5.28; P = .005). The effect size for this difference was 0.35, representing just more than one third of a standard deviation. There was no significant difference in mean CIS ratings provided by male versus female SPs (F(1, 154256) = 1.46; P = .23). Male SPs provided average CIS ratings of 6.59 (SD = 0.91), and female SPs provided average ratings of 6.56 (SD = 0.94). Examinees who reported a native language of English (mean = 6.55; SD = 0.44) received slightly higher overall CIS scores than did examinees with other language backgrounds (mean = 6.30; SD = 0.45) (F(1, 12854) = 732.59; P < .001).
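The reported gender effect size of 0.35 is consistent with a standardized mean difference (Cohen's d) computed with a pooled standard deviation. The text does not name the formula, so this is an assumption; the group sizes are likewise approximated from the reported 41.6% female share of 12,863 examinees.

```python
import math


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Group sizes are ASSUMED from the reported percentages, not stated directly.
N_TOTAL = 12_863
n_female = round(0.416 * N_TOTAL)
n_male = N_TOTAL - n_female

# Reported means and SDs: female 6.46 (0.43), male 6.30 (0.48).
d = cohens_d(6.46, 0.43, n_female, 6.30, 0.48, n_male)
print(f"d = {d:.2f}")  # prints "d = 0.35"
```

With these inputs the pooled SD is about 0.46, so the 0.16-point difference in means corresponds to roughly one third of a standard deviation.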
There were moderate correlations among the three CIS items, with items 1 (questioning skills) and 3 (professional manner and rapport) being most highly related (r = 0.71), indicating approximately 50% shared variance between these two constructs. Item 1 and item 2 (information-sharing skills) were somewhat less related, although the correlation was still moderate (r = 0.61). Items 2 and 3 were correlated at r = 0.70.
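Shared variance is the square of the correlation coefficient. Applying this to the three reported inter-item correlations gives the following worked values (subscripts denote item numbers):

```latex
\[
r_{13}^{2} = 0.71^{2} \approx 0.50, \qquad
r_{12}^{2} = 0.61^{2} \approx 0.37, \qquad
r_{23}^{2} = 0.70^{2} = 0.49
\]
```

Even for the most strongly related pair, about half of the variance in one rating is not accounted for by the other.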
The correlations of the CIS dimension scores with other measures are shown in Table 2. Spoken English proficiency correlated modestly with the overall CIS scores (r = 0.48). Data gathering was moderately correlated with the CIS total score (r = 0.45). The CIS questioning skills dimension had the highest correlation with data gathering (r = 0.51). According to the correlations in Table 2, older examinees tend to receive lower CIS ratings from the SPs on all three dimensions.
A physician’s communication and interpersonal skills can have a notable impact on the overall effectiveness of a medical encounter.1–3 Assessing IMGs’ ability to demonstrate appropriate skills in this domain is essential for ensuring appropriate patient care. The results of this investigation indicate that well-trained and monitored SPs, as part of a standardized examination, can provide meaningful evaluations of IMGs’ communication and interpersonal skills.
Overall, IMGs taking the CS for the first time performed reasonably well on the CIS measure. The mean score for each of the three dimensions was on the high end of the scale (greater than six on a nine-point scale), suggesting that, on average, IMGs are able to interact with the SPs appropriately. As expected, female examinees outperformed males. This finding is corroborated by studies of actual doctor–patient relations, which found that female physicians engage in more patient-centered communication than do males.13 The nonsignificant interaction between SP and examinee gender provides some evidence that SPs are providing unbiased ratings. On the basis of average CIS scores, male and female SPs provided comparable ratings. Because CS sessions may not have exactly equivalent ratios of male and female SPs, it is important that ratings do not vary appreciably as a function of SP characteristics.
Examinees whose native language was English received CIS ratings that were, on average, somewhat higher than those whose native language was not English. This finding was expected, given the relationship between language proficiency and communication and interpersonal abilities, at least for some of the interdependent and overlapping criteria used when measuring verbal behaviors. For example, it could be challenging for a nonproficient examinee to gather or share sensitive or personal information with a patient in a tactful and reassuring manner.
The correlations among the three CIS dimensions suggest that the interpersonal behaviors of physicians are related. A physician with excellent questioning skills (item 1) would also be expected to exhibit appropriate professional manner and rapport (item 3), which includes asking about the patient’s expectations, feelings, concerns, support systems, and impact of illness. However, each SP provided all three ratings for each examinee within an encounter, which increases the potential for halo effects that could inflate these correlations.
Data-gathering scores were most highly correlated with the first CIS dimension, questioning skills. Logically, a physician who is able to effectively gather information from a patient would achieve a higher case checklist score. For example, a physician who summarizes the information gathered from a patient (a skill assessed in item 1) would be more likely to realize pertinent information that he or she had missed in the initial interview, ask the questions after the summary, and receive appropriate credit for the items on the checklist. It is also reasonable to hypothesize that a physician with poor communication abilities would have difficulty collecting information from a patient. We also found that older physicians tended to receive lower CIS ratings. These individuals may have graduated from medical school less recently, had fewer recent clinical experiences, or practiced for many years in a country with a very different style of patient–physician communication. Further research is necessary to better understand findings related to the communication skills of older physicians.
In 2006, more than 12,000 IMGs wishing to pursue graduate medical education in the United States took the CS. Results of this investigation provide initial evidence that the communication and interpersonal skills of this cohort of examinees were validly measured as part of the standardized examination.14,15 However, this study had certain limitations that may affect the interpretation of the findings. First, because of the simulated nature of the exam, these results may not generalize directly to the real-world health care environment. Second, the analyses were conducted at an aggregate level, which could obscure potential differences in individual SP rating behaviors and variations in examinee performance. Finally, because the analyses were based on a heterogeneous cohort of IMGs, including graduates from more than 900 medical schools, the validity generalizations are limited. Further studies focused on more detailed psychometric analysis of the scale, additional examinee pools, and other criterion measures, including performance with real patients and other health care practitioners, are still needed.
1. Flocke SA, Miller WL, Crabtree BF. Relationships between physician practice style, patient satisfaction, and attributes of primary care. J Fam Pract. 2002;51:835–840.
2. Ong LM, de Haes JC, Hoos AM, Lammes FB. Doctor–patient communication: a review of the literature. Soc Sci Med. 1995;40:903–918.
3. Stewart M, Brown JB, Donner A, et al. The impact of patient-centered care on outcomes. J Fam Pract. 2000;49:796–804.
4. Levinson W, Roter DL, Mullooly JP, Dull VT, Frankel RM. Physician–patient communication. The relationship with malpractice claims among primary care physicians and surgeons. JAMA. 1997;277:553–559.
5. Wofford MM, Wofford JL, Bothra J, Kendrick SB, Smith A, Lichstein PR. Patient complaints about physician behaviors: a qualitative study. Acad Med. 2004;79:134–138.
6. Boulet JR, Ben David MF, Ziv A, et al. Using standardized patients to assess the interpersonal skills of physicians. Acad Med. 1998;73(10 suppl):S94–S96.
7. Ziv A, Boulet J, Friedman Ben-David M, et al. A holistic and behaviorally anchored measure of physician communication skills: issues of rater consistency. In: Proceedings of the Eighth Ottawa Conference on Medical Education and Assessment; Philadelphia, Penn; 2000:247–253.
8. Chambers KA, Boulet JR, Furman GE. Are interpersonal skills ratings influenced by gender in a clinical skills assessment using standardized patients? Adv Health Sci Educ Theory Pract. 2001;6:231–241.
9. Harik P, Clauser BE, Grabovsky I, Margolis MJ, Dillon GF, Boulet JR. Relationships among subcomponents of the USMLE Step 2 Clinical Skills examination, the Step 1, and the Step 2 Clinical Knowledge examinations. Acad Med. 2006;81(10 suppl):S21–S24.
10. Clauser BE, Harik P, Margolis MJ. A multivariate generalizability analysis of data from a performance assessment of physicians’ clinical skills. J Educ Meas. 2006;43:173–191.
11. Federation of State Medical Boards, Inc. 2006 Bulletin of Information. Philadelphia, Penn: National Board of Medical Examiners; 2005:1–42.
12. van Zanten M, Boulet JR, McKinley DW. Correlates of performance of the ECFMG Clinical Skills Assessment: influences of candidate characteristics on performance. Acad Med. 2003;78(10 suppl):S72–S74.
13. Roter DL, Hall JA, Aoki Y. Physician gender effects in medical communication: a meta-analytic review. JAMA. 2002;288:756–764.
14. American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association; 1999.
15. Messick S. Standards of validity and the validity of standards in performance assessment. Educ Meas Issues Pract. Winter 1995:5–8.