Assessing Medical Students' Skills in Working With Interpreters During Patient Encounters: A Validation Study of the Interpreter Scale

Lie, Désirée MD, MSEd; Bereknyei, Sylvia MS; Braddock, Clarence H. III MD, MPH; Encinas, Jennifer; Ahearn, Susan; Boker, John R. PhD

doi: 10.1097/ACM.0b013e31819faec8
Physician–Patient Relationship

Purpose Interpreted patient encounters require distinct communication skills. The absence of available reliable, valid, and practical measures hinders the assessment of these skills; therefore, the authors aimed to construct and validate the Interpreter Scale (IS).

Method The authors constructed the IS based on expert consensus and prior studies. They administered the IS to two classes (n = 182) in an interpreted standardized patient (SP) case setting. Standardized interpreters in the examination room used the IS to assess students' communication skills. Concurrently, SPs also assessed students' skills using the validated Patient-Physician Interaction scale (PPI) and the Interpreter Impact Rating Scale (IIRS). Trained observers watched DVDs and used the Faculty Observer Rating Scale (FORS) to assess student performance. A prior study documented the qualities of the IIRS and FORS. The authors determined the internal consistency reliability and examined construct validity of IS scores through factor analysis and concordance with other measures' scores.

Results IS reliability analysis yielded Cronbach α = 0.77. Factor analysis demonstrated two IS dimensions. Nine items, “managing the encounter,” and four items, “setting the stage,” explained 76% and 15% of score variance, respectively. IS and FORS scores significantly correlated (r = 0.385; P < .0001). IS factor 1 scores significantly correlated (all P < .0001) with FORS (r = 0.402), IIRS (r = 0.277), and PPI (r = 0.332) scores.

Conclusions The IS has reasonable internal consistency reliability and construct validity to warrant use for formatively measuring student communication skills in interpreted SP encounters, and it needs testing in actual patient encounters.

Dr. Lie is director, Research/Faculty Development, Department of Family Medicine, University of California, Irvine, School of Medicine, Irvine, California.

Ms. Bereknyei is research assistant, Stanford University School of Medicine and program manager, National Institutes of Health/National Heart, Lung, and Blood Institute-supported National Consortium for Multicultural Education for Health Professionals, Stanford, California.

Dr. Braddock is associate dean, Medical Education, Stanford University School of Medicine, Stanford, California.

Ms. Encinas is research coordinator, Department of Family Medicine, University of California, Irvine, School of Medicine, Irvine, California.

Ms. Ahearn is director, Clinical Skills Training Center, University of California, Irvine, School of Medicine, Irvine, California.

Dr. Boker is vice president, Academic Affairs, Geisinger Health System, Danville, Pennsylvania.

Correspondence should be addressed to Dr. Lie, Department of Family Medicine, 101 The City Drive South, Bldg 200, Rm 512, Orange, CA 92868; telephone: (714) 456-5171; fax: (714) 456-7984.

The United States is increasingly linguistically and culturally diverse.1–3 Training future physicians to work with interpreters according to accepted standards4–6 is essential for both the development of good clinical practice and the reduction of health disparities.7,8 An increasing number of policy actions in the United States mandate minimum standards for language access for patients9 and the allocation of resources directed at meeting these requirements.10 Cultural competence curricula in medical schools11–13 highlight the need for medical students and residents to be trained in working effectively with interpreters. The Association of American Medical Colleges' Tool for Assessing Cultural Competency Training describes a skill set that comprises the knowledge and application of working effectively with interpreters as a distinct learning entity (Domain V).13,14

A recent review of language interpreter use in the emergency department setting15 found that patient satisfaction was lower among patients with limited English proficiency (LEP) compared with patients who spoke English well.16 In two studies, patients with LEP experienced lower satisfaction with the visit (less courtesy, decreased respect, and lower quality of discharge instructions) and were less likely to return for care compared with patients who spoke English proficiently.17,18 In addition, researchers have found that health outcomes and health care use are poorer for patients with LEP; doctors prescribe fewer medications, start fewer or inappropriate intravenous treatments, and order fewer or inappropriate tests and procedures, leading to subsequent increased expenses and longer stays for adult19–21 and pediatric22,23 patients. For preventive care, one study24 found that Latino patients proficient in English were twice as likely to receive a recommendation for a Pap smear from physicians compared with Latino patients with LEP.

Studies with patient satisfaction as an outcome have indicated that some patients with LEP prefer language-concordant physicians, whereas others are equally satisfied with a language-concordant provider as with a professional interpreter.25,26 For providers to be proficient in all their patients' primary languages is unrealistic, so when communicating with patients with LEP, working effectively with an interpreter is important to optimize patient satisfaction and health care outcomes.

Widely accepted models to teach patient-physician communication skills, such as the Four Habits Model27 and the Kalamazoo Consensus Conference guidelines,28 involve only the patient and physician. Such models rely primarily on aspects of behavior and engagement as viewed from the patient's28,29 or physician's perspective30,31 or both.32–34 A paucity of studies examine these skills in interpreted encounters with patients with LEP.25 Moreover, a recent study35 suggested that the positive patient subjective experience of partnership, a potential outcome of provider communication style, does not fully reflect communication skills as measured by the Patient-Physician Interaction scale (PPI), suggesting that eliciting all the complex and myriad behavioral factors involved in communication skills training requires more than simply one perspective or one measure of provider communication skills.

When an interpreter participates in a LEP encounter, considerable challenges to trainee or provider assessment arise. Patient-perceived provider rapport with the interpreter, professional behavior of the interpreter in relation to both patient and trainee or provider, primary childhood language of the provider,36 patient-provider ethnicity, and patient-provider gender concordance37,38 all are among a multitude of factors that may impact patients' assessments of communication skills in interpreted encounters. Standards are available in the United States to guide the teaching of interpreter use skills.4–6 However, no validated measures of skill in appropriate and effective work with an interpreter exist either for the training setting with standardized patients (SPs) or with real patients in an actual clinical setting, where communication skills potentially translate into patient satisfaction and care outcomes. In addition, it is unclear which participant in the encounter (i.e., the patient, interpreter, or an outside observer) is best situated to assess, in a reliable and valid way, the trainee's or provider's performance in such settings and to provide effective feedback for improving future skills.

We previously conducted a pilot study39 of 23 monolingual Spanish-speaking SP encounters with third-year medical students that included an interpreter. We examined two new measures: the Interpreter Impact Rating Scale (IIRS) and the Faculty Observer Rating Scale (FORS). We found high internal consistency reliability of both the IIRS and FORS (Cronbach α = 0.90 and 0.88, respectively).39 Four faculty observers who independently rated students using the FORS demonstrated reasonable interrater reliability (intraclass correlation coefficient = 0.61).39 These findings suggest that each of these measures uniquely contributes to skills assessment in this area. Currently, however, we do not know whether these two provisional measures have practical value in actual patient encounters.

Potentially, in a trainee's actual clinical encounter, the patient, the interpreter, or a faculty observer may assess his or her skills. However, training patients with LEP to rate provider skills in effective use of interpreters is unrealistic because this training requires time away from the clinical encounter for both the patient and the interpreter. Also, patients' ratings of their providers may be unreliable because patients are prominent stakeholders in the encounter who often do not feel empowered to rate their providers.35 Faculty also require training, detracting from clinical care time, and concerns often arise about the impact of their presence as a confounder in the encounter; during these observed encounters, faculty may elicit trainees' “best testing behavior” rather than witness typical clinical actions.

A feasible alternative is to construct a skill measure that the interpreter can use. We assert that training skilled interpreters (who are already in the room) to assess learners' communication skills in both training and real clinical settings is possible. The goal in this study was to construct and validate a measure of communication skills as observed by trained interpreters, the Interpreter Scale (IS). We administered the measure in a training setting to examine both its internal consistency reliability and its construct validity. In doing so, we expected to gain insight about whether it ultimately could be applied to learner assessment and facilitate real-time feedback in the actual practice setting.

Study sample

Participants included 84 first-year medical students (MS1) at school 1 and 98 second-year students (MS2) at school 2. School 1 students received instruction focused on working with an interpreter. During a two-hour session, they observed and participated in small-group interactions that included previously trained patients and interpreters and were tested midway through the MS1 year. School 2 students practiced interpreter interaction skills with trained patients and interpreters in two separate small-group sessions, totaling about 120 minutes. They received peer and faculty feedback and were tested in the first half of the MS2 year. All students also completed an additional one-hour, online, case-based module on working with an interpreter; the module was designed to increase knowledge and improve attitudes about effective conduct of the interpreted medical interview.40 Table 1 summarizes characteristics of students, the interpreter interaction curricula, and the test cases. Participation in the standardized interpreter case was a required activity for all students, but their actual scores were not used for grading. The institutional review boards of both the University of California, Irvine and Stanford University approved the research.

Table 1

Study measures

For this study, we developed the IS using an iterative process to gain consensus expert opinion and incorporate current practice guidelines. The multidisciplinary experts consisted of course directors, a residency program director, two clinical skills trainers, a director of medical education research, and bilingual cultural competence educators. Two principles guided IS construction: (1) focusing on objective verbal and observable behaviors and (2) minimizing interpreters' emotional responses to interactions. The resulting experimental IS consisted of 13 items. Two items required binary scoring (“yes”/“no”), and the remaining 11 Likert-type rating items asked interpreters to choose from five behaviorally anchored response options, where 1 equals marginal/low and 5 equals outstanding. The authors designed all measures for use in any encounter with the presence of an interpreter and for application in multiple disciplines including medicine, nursing, and pharmacy. (See Table 2 for the item sets comprising the IS, IIRS, and FORS.) Lastly, the PPI28 is a validated, seven-item rating scale of communication and professional skills completed by patients. It has been widely applied as a generic measure in clinical skills assessment.

Table 2

The interpreter test cases used four languages (Spanish, Russian, Mandarin Chinese, and Vietnamese) at school 1 and two (Spanish and Vietnamese) at school 2. Local community demographics and patient-care needs determined language selection. Each interpreted encounter lasted 15 minutes (Table 1). At school 1, we embedded the interpreter case within an objective structured clinical examination (OSCE) that included other encounters and interstation exercises. At school 2, the interpreter case was a stand-alone station administered outside of an OSCE and without other encounters.

The clinical scenarios for the interpreter cases differed by school. At school 1, the chief patient presentation was the symptom of cough. At school 2, the chief presentation was seeking advice to stop smoking (Table 1). We developed the cases to be consistent with the students' skill level and pilot tested them at each school with separate MS1 and MS2 to ascertain that the task demands could be completed within a 15-minute time frame. We assigned students to case encounters in a language in which they self-reported no proficiency based on a survey conducted before the interpreter station administration.

Experienced trainers at each school applied conventional procedures to specifically train SP and interpreter pairs for the stations; the pairs consistently worked together. Similar training materials included a case script and a description of specific roles and responses in the encounter. Pairs jointly viewed student performances that we had captured on DVD for our earlier pilot study, and together they rehearsed their roles and received feedback on interpreting and using the various rating scale items pertinent to each role. We preselected training DVDs representing the full range of student performance, as measured by patient and faculty ratings. The current study involved patients and interpreters who had previously participated in and assessed students in at least one examination in the prior year. School 1 trained five pairs, each comprising SPs and professional interpreters. School 2 trained six pairs; the interpreters included three professionals and three fluent bilingual actors (Table 1).

During training, trainers directly observed and critiqued the performance of the SPs and interpreters both as individuals and as a pair in their dual-purpose roles as case actors and student raters. We required all raters to consistently score recorded training encounters with an error rate of no more than one error per encounter. Patients completed both the IIRS and PPI, as described in our earlier study.39 Interpreters completed only the experimental IS. We observed no scoring differences during training (or in pilot encounters at school 2) between the performances of the professional and bilingual actor interpreters. At school 1 only, to corroborate prior findings, monolingual English-speaking patients and SP trainers substituted for faculty observers and completed the FORS. These observers simultaneously trained with the patient-interpreter pairs. The lack of availability of trained observers at school 2 to rate all encounters precluded administering the FORS there.

Data analysis

We separately assessed the internal consistency reliability of IS, IIRS, and FORS through conventional item analysis and by computing the Cronbach coefficient alpha (α), which estimates the extent to which individual scale items measure the same construct. Additionally, we performed subgroup analysis of Cronbach α to determine the effects of language, school location, and student gender on internal consistency. To examine construct validity, we first performed unrotated principal components analysis of the IS, IIRS, and FORS scores to determine the underlying structure of each measure. Furthermore, we computed correlation coefficients among item and scale scores on the IS, IIRS, FORS, and PPI to identify overlapping items on the four measures. For all statistical analyses, we set the nominal significance level at a two-tailed α < .05. We performed all statistical analyses using Intercooled Stata 9.2 for Windows (StataCorp 2006 Statistical Software: Release 9.2, Stata Corporation, College Station, Texas).
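
The coefficient described above can be sketched in a few lines. The following is a minimal illustration of the conventional Cronbach α formula, α = (k / (k − 1)) × (1 − Σ item variances / variance of total scores), where k is the number of items. It is not the Stata procedure used in the study; the function name `cronbach_alpha` and the example ratings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Return Cronbach's alpha for a list of per-student item ratings.

    ratings: list of rows, one row per student, one numeric rating per item.
    """
    n_items = len(ratings[0])
    if n_items < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two students")
    # Sample variance of each item across students.
    item_variances = [variance(column) for column in zip(*ratings)]
    # Sample variance of the summed (total) scale score.
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings for five students on four 5-point items (not study data).
example = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(example), 2))
```

Running the same function on a subset of rows (for example, one school or one language) would give a subgroup estimate of the kind reported in the study.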

Sample data collection

Students' mean age at school 1 was 24 years; 39 students (46%) were male. At school 2, the mean age was 23 years, and 50 students (51%) were male. All students completed the test interpreter case. At school 1, complete IS, IIRS, and FORS data were available for 82, 84, and 84 students, respectively. At school 2, 96 students provided complete data for the IS and IIRS. PPI data were available for similar numbers (84 and 98) of students at school 1 and school 2, respectively.

Item analysis and internal consistency reliability

Item analysis of the three measures produced the results shown in Table 2. Students introduced themselves to interpreters in 72% of encounters. However, students introduced interpreters to the patients only 17% of the time. Reliability analysis of the 11 Likert-type IS items yielded mean ratings ranging from a low of 2.38 ± 1.45 (IS4: “trainee explained my role to the patient”) to a high of 4.26 ± 0.82 (IS7: “trainee listened to me as I interpreted”). Analysis of IS total scores, summed across these 11 items, yielded a Cronbach α = 0.77, indicating that each item contributed to the measurement of the same construct. Examination of IS item-test correlation coefficients (the correlation between a single item score and the total scale score) indicated that the individual item contributions were statistically meaningful (range 0.53–0.68). Additional results reported in Table 2 for IIRS and FORS measures corroborate those from the authors' prior pilot study.
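
As an illustration of the item-test correlation defined above (the correlation between a single item score and the total scale score), the following sketch computes one Pearson coefficient per item. It assumes the total includes the item itself, as the definition in the text implies; the function names and the example data are hypothetical and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_test_correlations(ratings):
    """Correlate each item column with the summed total score.

    ratings: list of rows, one row per student, one numeric rating per item.
    """
    totals = [sum(row) for row in ratings]
    return [pearson_r(list(column), totals) for column in zip(*ratings)]


# Hypothetical ratings for five students on three items (not study data).
example = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 2],
]
print([round(r, 2) for r in item_test_correlations(example)])
```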

Subgroup analysis of internal consistency reliability

We calculated Cronbach α to estimate the impact of school, language, and student gender on the internal consistency of IS scores (Table 3). First, we computed coefficient alphas for IS data by school to determine whether increasing the power of other planned subgroup analyses warranted data pooling. This first analysis produced similar results (α = 0.87 and 0.67) and confirmed the appropriateness of combining the separate school data.

Table 3

For subgroup analysis by language of the encounter, Cronbach α for IS scores remained approximately at α = 0.70 or higher for three languages: Mandarin Chinese (α = 0.74), Russian (α = 0.66), and Spanish (α = 0.85). Vietnamese-language encounters produced a lower value (α = 0.52). We noted little difference by student gender in Cronbach α for IS scores. We followed the identical analytic process for IIRS and FORS scores (see Table 3 for detailed results).

Scale factor analysis

Unrotated principal components analysis indicated that the 13 IS items loaded onto two factors that explained 91% of the scale score variability (Table 4). That is, the underlying conceptual framework of the IS consisted of two distinguishable and independent dimensions rather than being a unitary measure of the construct of interest. Content analysis of item clusters shows that the principal factor, “managing the encounter,” incorporates nine items (nos. 5–13) and accounts for 76% of the IS score variability. Likewise, the second factor, “setting the stage,” comprises four items (nos. 1–4) and accounts for 15% of IS score variance. The latter factor comprises the behaviors, “introduced him/herself [provider] to me [the interpreter],” “introduced me [the interpreter] to the patient,” “explained the purpose of the interview,” and “explained my [the interpreter's] role to the patient.”
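
For readers less familiar with how such percentages arise, the relation can be written as follows. This assumes the conventional eigenvalue-based definition, in which the proportion of variance attributed to factor k is its eigenvalue λ_k divided by the sum of all retained eigenvalues; the article does not state the exact computation used in Stata, so the symbols here are illustrative.

```latex
\[
\text{proportion}_k = \frac{\lambda_k}{\sum_{j} \lambda_j},
\qquad
\text{proportion}_1 + \text{proportion}_2 \approx 0.76 + 0.15 = 0.91 .
\]
```

Under this reading, the 91% figure is simply the sum of the two reported factor contributions.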

Table 4

Identical analysis of IIRS data, in comparison, found that a single unitary factor accounted for 100% of scale score variance (factor loadings ranged from 0.49 to 0.77).39 Analysis of FORS data yielded a two-dimensional structure with the principal underlying factor explaining 82% of score variance (factor loadings ranged from 0.23 to 0.80).39

Correlations among scales and subsets of items

We examined bivariate associations between IS total and factor scores and other measures by computing Pearson product-moment correlation coefficients (Table 5). IS total scores significantly correlated with those from IIRS (r = 0.216; P = .004), FORS (r = 0.385; P < .0001), and PPI (r = 0.257; P = .001). IS factor 1 (managing the encounter) scores significantly correlated (all P < .0001) with IIRS (r = 0.277), FORS (r = 0.402), and PPI (r = 0.332) total scores. Similarly, scores from the same factor also significantly correlated with single global rating items included as the last item on the IS, IIRS, and FORS (r = 0.61, 0.38, and 0.34, respectively; all P < .0005; data not shown). IIRS and PPI total scores highly overlapped (r = 0.90; P < .0005; data not shown). No other correlation coefficients (whether by item scores, factor scores, or scale scores) yielded values greater than 0.20 (data not shown).
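
The significance of a Pearson coefficient depends on the number of paired observations as well as on r. As a rough sketch, and assuming the conventional t test for a correlation coefficient (the article does not state which test was applied), the test statistic is t = r × √((n − 2) / (1 − r²)) with n − 2 degrees of freedom. The function name and the sample size in the example are hypothetical.

```python
from math import sqrt


def t_statistic_for_r(r, n):
    """Return (t, degrees of freedom) for testing a Pearson r against zero.

    r: observed correlation, strictly between -1 and 1.
    n: number of paired observations.
    """
    if n < 3:
        raise ValueError("need at least three paired observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    return r * sqrt(df / (1 - r * r)), df


# r is taken from the text (IS total vs. FORS); n = 80 is a hypothetical
# number of paired ratings, not a figure reported in the article.
t_value, df = t_statistic_for_r(0.385, 80)
print(round(t_value, 2), df)
```

The resulting t value would then be compared against a t distribution with the stated degrees of freedom to obtain a two-tailed P value.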

Table 5

Discussion and Conclusions

In this study, we examined the performance of the new 13-item IS, a rating scale completed by the interpreter after a clinical encounter. We accomplished this by examining both its inherent structure and its associations with the previously studied IIRS and FORS39 and the validated PPI.28 We determined the internal consistency reliability and provisionally established a degree of construct validity of the IS for evaluating student performance in standardized clinical encounters involving interpreters. We recommend using either the IS dimension score for managing the encounter or the IS global item score from trained interpreters to assess trainees and provide formative feedback. Our findings also corroborate and extend our prior work39 in documenting the reliability and construct validity of the IIRS and FORS with different samples, as we again found high internal consistency and interscale correlations and examined the internal structure of the two measures.

Factor analysis of the new IS measure showed that skills for managing the encounter (IS factor 1) are distinct from those for setting the stage (IS factor 2). The former dimension accounts for the majority of IS score variability and correlates positively with the total scale scores and global item scores included in two measures completed by patients (IIRS and PPI) in real time and a third completed post hoc by observers (FORS). This pattern of significant positive correlations provides some evidence of convergent validity in the context of observed consistency of skill assessments in interpreted encounters as viewed by different types of raters. Indeed, behavior-specific items for IS factor 1 are similar to those in both IIRS and FORS (e.g., eye contact, use of the first person when asking questions of the patient, and closing the encounter by asking for questions from the patient).

Students may neglect to set the stage because of a prior established relationship with the interpreter and, yet, still do well in managing the encounter. Managing the encounter is a skill set more directly relevant to clinical practice and patient outcomes. Setting the stage may become more important for encounters in which interpreters and providers lack familiarity with one another.

In this study, the interpreter, present in the room, had access to the patient's subtle nuances of emotional reactions and could sense the emotional temperature. However, the trained observers at school 1 viewed a recording of the encounter after the fact. The latter experience may have produced less accurate assessments of behaviors—including eye contact and seating arrangement—because of the angle of the camera and a tendency to make ratings weighted more heavily on the content of verbal communication emanating from the interpreter.

The interpreters were bilingual and, naturally, had more access to the communication content exchanged between students and patients. The post hoc observers at school 1 were not bilingual. Thus, in a real (versus standardized) clinical encounter, interpreters themselves could be more appropriate raters of providers' skills in working with an interpreter during interpreted patient encounters. When provided with a realistic and practical degree of training (as provided in this study), we believe that professional interpreters could make reliable and valid assessments of trainee skills in effectively working with interpreters. We have planned a future study employing trained professional and ad hoc interpreters to rate learners in clinics compared with untrained professional and ad hoc interpreters.

Our study has several positive features. The sample included students at similar stages of training from two schools. They received similar exposure—measured in both content overlap and hours of instruction—to curriculum focused on working with interpreters. In choosing preclinical students, we expected that they would be more open to communication skills training. That is, we expected better results in isolating distinct skills for interacting with interpreters during concurrent early clinical skills training, such as history taking and counseling. In contrast, we viewed this process as more problematic when it occurred with more clinically experienced students who had already received training in more complex counseling, clinical reasoning, and physical examination skills. Using higher-level communication skills might have confounded our assessment of interpreter interaction skills. We compensated for students' lesser clinical experience by using relatively simple clinical scenarios.

We feel that we had a robust, yet practical, study design and obtained complete data on the standardized encounters. Patient-interpreter pairs received rigorous, similar training at the two schools. Raters used identical measures and demonstrated, a priori, their accuracy and consistency during training. The study used four languages, and meaningful sample sizes facilitated subgroup analyses.

Our findings should be interpreted within the limitations of the study design. First, we collected FORS data at just one school. Also, SP trainers and SPs, not faculty who teach how to work with an interpreter, provided FORS data. This departs from the authors' prior study39 that used clinical faculty to provide provisional validation data. Second, one half of interpreters at one school were bilingual actors and not professional interpreters. Although a potential confounder, our training quality control did not find performance differences between different interpreter types. Last, we studied the new IS measure in a setting of in-person interpretation. Results may not apply to telephone- or video-based interpretation41 in which the interpreter is not in the room with the patient and provider.

This study has several implications. The typical relationship between interpreters and providers, with minimal engagement before and after the clinical encounter, currently limits interpreters to a service role. Assessment of health professionals and trainees by interpreters may contribute to achieving team-based care42,43 while concomitantly improving recognition of the unique and important role of interpreters in practice. The ability of interpreters to assess trainees in actual clinical practice also may serve to decrease the faculty burden for assessment. Increased involvement of interpreters in patient care has potential as a process measure of patient-centered care.

Consistent with a team-based care model, we suggest that the collaborative relationship between trainee or provider and interpreter be optimally described as “work effectively with the interpreter” instead of “using the interpreter.” Ample literature exists citing an adverse impact of poor provider communication on patient satisfaction and health outcomes in LEP encounters. We posit that the academic medicine community needs more studies to directly link interventions aimed at provider communication skills with improved patient satisfaction and care outcomes in special populations such as patients with LEP.44 Our study makes one small contribution toward meeting that challenge.

This project was supported in part by a grant from the National Institutes of Health (NIH), National Heart, Lung, and Blood Institute, award #K07 HL079256-01, entitled “An Integrative, Evidence-Based Model of Cultural Competency Training in Latino Health Across the Continuum of Medical Education” (2004–2009), and award #K07 HL079330-04, entitled “Integrated Immersive Approaches to Cultural Competence.” The contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH.

1 US Census Bureau S1602. Linguistic Isolation Data Set: 2005–2007 American Community Survey 3-Year Estimates Survey: American Community Survey. Available at: ( Accessed March 12, 2009.
2 U.S. Census Bureau. Language Use and English-Speaking Ability: 2000. Census 2000 Brief. Available at: ( Accessed January 7, 2009.
3 ProEnglish. Why official English? Available at: ( Accessed January 7, 2009.
4 U.S. Department of Health and Human Services, Office of Minority Health. National Standards for Culturally and Linguistically Appropriate Services in Health Care: Final Report. Available at: ( Accessed January 7, 2009.
5 Civil Rights Act of 1964 (Title VII). PL 88-352, 88th Congress, H. R. 7152. July 2, 1964.
6 Grubbs V, Chen AH, Bindman AB, Vittinghoff E, Fernandez A. Effect of awareness of language law on language access in the health care setting. J Gen Intern Med. 2006;21:683–688.
7 Karliner LS, Pérez-Stable EJ, Gildengorin G. The language divide. The importance of training in the use of interpreters for outpatient practice. J Gen Intern Med. 2004;19:175–183.
8 Flores G. The impact of medical interpreter services on the quality of health care: A systematic review. Med Care Res Rev. 2005;62:255–299.
9 Youdelman MK. The medical tongue: U.S. laws and policies on language access. Health Aff (Millwood). 2008;27:424–433.
10 The George Washington University, School of Public Health and Health Services; Robert Wood Johnson Foundation. Speaking Together: National Language Services Network Web site. Available at: ( Accessed January 7, 2009.
11 Betancourt JR. Cross-cultural medical education: Conceptual approaches and frameworks for evaluation. Acad Med. 2003;78:560–569.
12 Flores G, Gee D, Kastner B. The teaching of cultural issues in U.S. and Canadian medical schools. Acad Med. 2000;75:451–455.
13 Association of American Medical Colleges. Tool for Assessing Cultural Competence Training (TACCT). Available at: ( Accessed January 7, 2009.
14 Lie D, Boker J, Cleveland E. Using the tool for assessing cultural competence training (TACCT) to measure faculty and medical student perceptions of cultural competence instruction in the first three years of the curriculum. Acad Med. 2006;81:557–564.
15 Ramirez D, Engel KG, Tang TS. Language interpreter utilization in the emergency department setting: A clinical review. J Health Care Poor Underserved. 2008;19:352–362.
16 Carrasquillo O, Orav EJ, Brennan TA, Burstin HR. Impact of language barriers on patient satisfaction in an emergency department. J Gen Intern Med. 1999;14:82–87.
17 Jacobs EA, Sadowski LS, Rathouz PJ. The impact of an enhanced interpreter service intervention on hospital costs and patient satisfaction. J Gen Intern Med. 2007;22(suppl 2):306–311.
18 Sarver J, Baker DW. Effect of language barriers on follow-up appointments after an emergency department visit. J Gen Intern Med. 2000;15:256–264.
19 Waxman MA, Levitt MA. Are diagnostic testing and admission rates higher in non-English-speaking versus English-speaking patients in the emergency department? Ann Emerg Med. 2000;36:456–461.
20 Bard MR, Goettler CE, Schenarts PJ, et al. Language barrier leads to the unnecessary intubation of trauma patients. Am Surg. 2004;70:783–786.
21 Bernstein J, Bernstein E, Dave A, et al. Trained medical interpreters in the emergency department: Effects on services, subsequent charges, and follow-up. J Immigr Health. 2002;4:171–176.
22 Hampers LC, McNulty JE. Professional interpreters and bilingual physicians in a pediatric emergency department: Effect on resource utilization. Arch Pediatr Adolesc Med. 2002;156:1108–1113.
23 Lassetter JH, Baldwin JH. Health care barriers for Latino children and provision of culturally competent care. J Pediatr Nurs. 2004;19:184–192.
24 De Alba I, Sweningson JM. English proficiency and physicians' recommendation of Pap smears among Hispanics. Cancer Detect Prev. 2006;30:292–296.
25 Ngo-Metzger Q, Sorkin DH, Phillips RS, et al. Providing high-quality care for limited English proficient patients: The importance of language concordance and interpreter use. J Gen Intern Med. 2007;22(suppl 2):324–330.
26 Green AR, Ngo-Metzger Q, Legedza AT, Massagli MP, Phillips RS, Iezzoni LI. Interpreter services, language concordance, and health care quality. Experiences of Asian Americans with limited English proficiency. J Gen Intern Med. 2005;20:1050–1056.
27 Stein T, Frankel RM, Krupat E. Enhancing clinical communication skills in a large healthcare organization: A longitudinal case study. Patient Educ Couns. 2005;58:4–12.
28 Makoul G. Essential elements of communication in medical encounters: The Kalamazoo consensus statement. Acad Med. 2001;76:390–393.
29 Norcini JJ, Blank LL, Duffy FD, Fortna GS. The mini-CEX: A method for assessing clinical skills. Ann Intern Med. 2003;138:476–481.
30 Brown JB, Boles M, Mullooly JP, Levinson W. Effect of clinician communication skills training on patient satisfaction. A randomized, controlled trial. Ann Intern Med. 1999;131:822–829.
31 Holmboe ES, Huot S, Chung J, Norcini J, Hawkins RE. Construct validity of the miniclinical evaluation exercise (mini-CEX). Acad Med. 2003;78:826–830.
32 Levinson W, Roter D. Physicians' psychosocial beliefs correlate with their patient communication skills. J Gen Intern Med. 1995;10:375–379.
33 Braddock CH 3rd, Fihn SD, Levinson W, Jonsen AR, Pearlman RA. How doctors and patients discuss routine clinical decisions. Informed decision making in the outpatient setting. J Gen Intern Med. 1997;12:339–345.
34 Braddock CH 3rd, Edwards KA, Hasenberg NM, Laidley TL, Levinson W. Informed decision making in outpatient practice: Time to get back to basics. JAMA. 1999;282:2313–2320.
35 Saba GW, Wong ST, Schillinger D, et al. Shared decision making and the experience of partnership in primary care. Ann Fam Med. 2006;4:54–62.
36 Fernandez A, Wang F, Braveman M, Finkas LK, Hauer KE. Impact of student ethnicity and primary childhood language on communication skill assessment in a clinical performance examination. J Gen Intern Med. 2007;22:1155–1160.
37 Saha S, Komaromy M, Koepsell TD, Bindman AB. Patient-physician racial concordance and the perceived quality and use of health care. Arch Intern Med. 1999;159:997–1004.
38 Cooper-Patrick L, Gallo JJ, Gonzales JJ, et al. Race, gender, and partnership in the patient-physician relationship. JAMA. 1999;282:583–589.
39 Lie D, Boker J, Bereknyei S, Ahearn S, Fesko C, Lenahan P. Validating measures of third year medical students' use of interpreters by standardized patients and faculty observers. J Gen Intern Med. 2007;22(suppl 2):336–340.
40 Kalet AL, Mukherjee D, Felix K, et al. Can a web-based curriculum improve students' knowledge of, and attitudes about, the interpreted medical interview? J Gen Intern Med. 2005;20:929–934.
41 Jones D, Gill P, Harrison R, Meakin R, Wallace P. An exploratory study of language interpretation services provided by videoconferencing. J Telemed Telecare. 2003;9:51–56.
42 Lavizzo-Mourey R. Improving quality of US health care hinges on improving language services. J Gen Intern Med. 2007;22(suppl 2):279–280.
43 Carmona RH. Improving language access: A personal and national agenda. J Gen Intern Med. 2007;22(suppl 2):277–278.
44 Wynia M, Matiasek J. Promising Practices for Patient-Centered Communication With Vulnerable Populations: Examples From Eight Hospitals. Accessed January 7, 2009.
© 2009 Association of American Medical Colleges