Previous research has demonstrated that physician empathy is correlated with professionalism and favorable patient outcomes1,2; thus, empathy training is a vital area of research and intervention. Empathy is a complex phenomenon, conceptualized as having an affective component (the ability to share emotional experiences), a cognitive component (understanding the emotions of another person), and a behavioral component (the clinician’s verbal and nonverbal expression of empathy toward the patient).3,4
The current wave of modern technology used in education and medical practice requires physicians and trainees to be technologically savvy but does little to promote empathic communication.5–7 Medical students self-report a decline in empathy, especially in the third year of medical school, when clinical rotations begin.8 Women and physicians who practice in primary care specialties show higher empathy in patient encounters, as measured with self-reported and expert-rated instruments.9,10 Current medical school curricula address the affective, cognitive, and behavioral aspects of empathy using experiential learning such as patient shadowing, communication skills training, or wellness programs.3,4 Patient shadowing has been defined as “having a committed and empathic observer follow a patient and family through their care experience”11,12 and has been integrated into medical school curricula and patient care. Patient shadowing interventions have involved students acting as “patient navigators” for patients during clinic visits13 or student volunteers portraying physical symptoms and being cared for by residents who were unaware of the experimental nature of the hospitalization.14 Following each type of intervention, students reported becoming keenly aware of the importance of empathy in caring for patients.13,14 Patient shadowing has also been successfully integrated into the patient- and family-centered care model,11,12 leading to broad system improvements in the delivery of care. Communication skills workshops involving lecture, role-play, and patient interviews with feedback on communication skills, including empathy, have consistently proven effective in teaching empathic communication.3,4 Individual feedback from faculty is a central component of these workshops; it can be given live, immediately after the patient encounter, or it can be based on recorded patient interactions.
These teaching methods are threatened by the limited availability and high cost of standardized patients (SPs) and limited access to faculty.15 In response to these limitations, we created and used virtual patients (VPs) to allow students to practice history taking and empathic communication in medical conditions such as major depression, bipolar disorder, abdominal pain, and cranial nerve palsy.16–19 Virtual patients offer advantages over live patient interactions: (1) they can simulate clinical scenarios that are critical but not frequently encountered, (2) they can allow the trainee time to think about their response to questions, (3) they can offer standardized content, (4) they can provide immediate feedback after the interaction, and (5) they can allow safe and repetitive practice.16–18 Virtual patients have limitations as well: they are less realistic than live patients, at times have difficulty recognizing questions, and cannot detect the user’s nonverbal cues. However, question recognition improves with repetitive VP use.17,18 In our previous work, we showed that interacting with a VP portraying a critical scenario (in this case, a suicide attempt) can enhance medical students’ suicide risk assessment proficiency in a live interview with an SP. We also previously demonstrated that trainees show verbal empathy toward VPs with cranial nerve injury20 and abdominal pain21 as well as VPs with depression and bipolar disorder.22 Using the Empathic Communication Coding System (ECCS),9 we coded medical students’ empathy in interactions with VPs with bipolar disorder and major depression and found that students responded with increased empathy as they progressed in training.22 For the study described here, we enhanced a VP scenario to simulate interventions traditionally used to teach empathy.
In our study, 3 separate groups of medical students interacted with (1) a control VP portraying depression, (2) a VP with a backstory intended to simulate patient shadowing, or (3) an empathy-feedback VP, capable of giving students immediate feedback about their empathic communication, following the VP interaction. We hypothesized that, compared with the control VP interaction, one or both of our novel VP interventions would enhance verbal empathy in students’ subsequent encounters with real humans (SPs). Our primary outcome was the students’ verbal response to the opportunities to show empathy presented to them during the SP encounters.
Institutional Review Board Approval, Consent Procedures, and Study Population
First-year students of the Medical College of Georgia were recruited under an institutional review board–approved research protocol. The students were contacted through mass e-mails, flyers, or private word of mouth to ensure anonymity. Informed consent was obtained through an institutional review board–approved consent form. The students were compensated with $20 for the time and travel resources spent to participate in the research. The study results were deidentified and remained confidential. All study activities were conducted in a secure area with access limited to research staff.
The Cynthia Young VP scenario was created and validated by our team.17 It has been used in various research and educational applications.22,23
In the study described herein, students interacted with Cynthia using an online text-based interface. They conducted interviews as they would with live patients, but they typed what they wanted to say rather than speaking. Cynthia responded via written text and was represented as a static image. The version of the Cynthia Young VP used in this study had approximately 500 unique character speeches (statements made by the character) and recognized approximately 4500 speech triggers, which the speech-matching algorithm used to process user input. We used a natural language processing algorithm to allow our VP to respond to questions. Our natural language processing algorithm24 uses 10 different measurements of sentence similarity to determine whether what was said is a paraphrase of any of the questions that the VP contains in its structured set of questions and answers. All similarity measures are provided to a machine learning model that classifies the likelihood that an input question is a paraphrase. This classification and analysis occur after simple processes such as spellchecking are performed. The Cynthia Young VP scenario portrays a 21-year-old college student, referred by her campus counselor, who presents to the doctor with symptoms of a major depressive episode after losing her beloved cousin in an accident 8 months before the interview.17 The scenario allows the user to collect information about the chief complaint, the history of present illness including symptoms of major depression, social history, stressors, and medical and family history. The VP scenario content was identical for all study groups and included predetermined empathic opportunities inserted in the scenario by the editors. The scenario was technically enhanced with empathy feedback or a backstory for the respective intervention groups.
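The paraphrase-matching step can be illustrated with a simplified sketch. The three measures and the averaging threshold below are hypothetical stand-ins for the 10 similarity measures and the trained machine learning model described above; a real system would learn the combination weights from labeled paraphrase pairs.

```python
import difflib

def similarity_features(query: str, stored: str) -> list:
    """Compute a few simple sentence-similarity measures (illustrative only)."""
    q_tokens, s_tokens = set(query.lower().split()), set(stored.lower().split())
    jaccard = len(q_tokens & s_tokens) / len(q_tokens | s_tokens)  # token overlap
    char_ratio = difflib.SequenceMatcher(None, query.lower(), stored.lower()).ratio()
    len_ratio = min(len(query), len(stored)) / max(len(query), len(stored))
    return [jaccard, char_ratio, len_ratio]

def best_paraphrase(query: str, stored_questions: list, threshold: float = 0.6):
    """Return the best-matching stored question, or None when no candidate
    clears the (hypothetical) classifier threshold."""
    best, best_score = None, 0.0
    for stored in stored_questions:
        features = similarity_features(query, stored)
        score = sum(features) / len(features)  # stand-in for the ML classifier
        if score > best_score:
            best, best_score = stored, score
    return best if best_score >= threshold else None

questions = ["How long have you been feeling sad?", "Do you have trouble sleeping?"]
match = best_paraphrase("How long have you felt sad?", questions)
```

In a production system, the averaged score would be replaced by a classifier trained to output the probability that the user's input is a paraphrase of a stored question.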
The Empathy-Feedback VP
Human-assisted empathy feedback is a new crowdsourcing technique in which human “assessors” anonymously follow the trainee’s interaction with the VP online in real time. The assessors were not aware of the students’ identity or study group assignment and could not see the students, and the students were not aware of the assessors’ presence. When the student encountered opportunities to express empathy to the VP, the assessors observed the typed interaction on their computer screen and rated the student’s empathic responses (Fig. 1). This feature was added to the Cynthia Young VP scenario described earlier.
The empathy-feedback page, available to the student for review at the end of the VP interaction, contained the student’s coded empathic responses and offered potential response alternatives.22,23 For the purpose of this study, the assessors were PhD student volunteers and study coinvestigators who provided hidden assistance without compensation. They were trained to use the relevant empathy scale (ECCS) beforehand and achieved high interrater reliability (for training details, see the Study Outcomes section). However, such assessors could be recruited in real time from crowdsourcing platforms such as Mechanical Turk,25 paid as little as $0.05 per question assisted, provided the same training as the volunteers in this study, and screened on the basis of minimum qualifications such as the quality and amount of previous work. The VP is governed by a question-answering algorithm that can function independently of the presence of hidden assessors. Further technologic enhancements can be created such that the empathy feedback is provided directly through the VP interaction. Hidden assessors were used in our study to explore the feasibility of the empathy enhancements.
The Backstory VP
The creation of a backstory is a new approach to VPs, which combines embodied conversational agents26,27 and narrative video vignettes. When specific questions are asked of the VP, noninteractive video vignettes are presented that show scenes of the VP illustrating her condition.27 For example, when Cynthia Young is asked, “Tell me about your diet,” the text “I eat a lot. All I do is eat and sleep” is displayed as a response, and then a 14-second video shows Cynthia eating ice cream and then taking a nap. We created the backstory VP using the Sims 3 video game, a platform that is widely used to create videos of virtual humans. Each video aims to show medical students how patients’ home lives can be greatly affected by their illness and thus simulates patient shadowing.11,12 The cost of creating the backstory consisted of purchasing the Sims 3 software from Electronic Arts and the time to create the video backstories, roughly 1 hour per cut-scene, with 4 cut-scenes inserted into the VP interaction.
The Control VP
The control VP provided only a typed interaction with Cynthia Young, without empathy feedback or patient backstory.
Study Flow and Evaluation of Study Outcomes
First-year medical students were randomly assigned to one of the following distinct groups:
- A control group who interacted with the control VP.
- A backstory group who interacted with the backstory VP.
- An empathy-feedback group who interacted with the empathy-feedback VP.
After the interaction with the VP, students in each study group interacted with SPs playing the same scenario. Based on previous literature,9,10,28 the following variables could have influenced the students’ empathy: (1) sex, (2) race, (3) medical specialty of choice, and (4) mental health experience (because the VP used in the intervention had depression). The effects of the 3 VP interventions on first-year medical students’ verbal empathy in their interactions with a real human (SP) were assessed by trained assessors and by the SPs (Fig. 2).
Outcomes Measured by Trained Assessors
- The primary outcome was the students’ verbal response to all the opportunities to show empathy presented to them by the SPs, which we refer to as the mean empathy score.
- Empathy scores by type of empathic opportunity were determined as secondary outcomes.
- All empathic opportunities excluding symptom-type opportunities.
- Symptom-only empathic opportunities, defined as symptoms of depression described by the patient, which could also be interpreted by the interviewer as empathic opportunities.
- Emotion-only empathic opportunities, defined as the patient describing himself or herself as currently feeling an emotion, for example, feeling sad.
- Challenge-only empathic opportunities, defined as the patient describing the negative effect of a physical/psychosocial problem on his or her quality of life.
- Furthermore, we studied whether the interaction between the study group (control VP, backstory VP, and empathy-feedback VP) and the type of empathic opportunity affects the empathy score.
Empathic Communication Coding System Assessor Training and Interrater Reliability
Here we explain the instrument used to measure empathy and how it was used in this study. The ECCS was developed to code empathic opportunities, defined as an explicit, clear, and direct statement of emotion, progress, or challenge by the patient.9,29 As described elsewhere,22 we added the “symptom” empathic opportunity category to capture statements describing symptoms of depression, which could also be interpreted as empathic opportunities (Table 1). The ECCS also codes physicians’ verbal responses to these opportunities, ranging from level 6 (the highest level of empathy, in which the clinician makes a statement that he or she shares the patient’s emotion) to level 0, denial of the patient’s perspective (the clinician ignores the empathic opportunity), as described in Table 1. The assessors underwent extensive training in ECCS coding and achieved a final interrater reliability of 0.81, measured by intraclass correlation.30
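Interrater reliability of this kind is commonly reported as a two-way random-effects intraclass correlation. The sketch below is a hypothetical illustration (not the authors' code) of ICC(2,1), computed from an ANOVA decomposition of a subjects-by-raters matrix of ECCS scores:

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    `scores` is a list of rows (subjects), each a list of ratings (one per rater)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)   # subjects
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)   # raters
    sse = sum((scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    mse = sse / ((n - 1) * (k - 1))                                # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Two raters in perfect agreement on 4 subjects -> ICC = 1.0
perfect = icc_2_1([[1, 1], [2, 2], [3, 3], [4, 4]])
# A constant offset between raters lowers the absolute-agreement ICC
offset = icc_2_1([[1, 2], [2, 3], [3, 4]])
```

The absolute-agreement form is the stricter choice here because a rater who systematically scores one ECCS level higher should be penalized, not treated as interchangeable.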
Data Collection and Coding During the Study
Empathy coding was performed using ECCS, both in the intervention stage of the study, when the assessors coded the students’ empathic responses during the empathy-feedback VP interaction (Fig. 1), and in the assessment stage, when the assessors coded the primary study outcome, students’ empathic responses in the interactions with real humans (in our case, SPs). Each live SP-student interaction was videotaped and transcribed by a study investigator. Another investigator verified random transcripts for accuracy. We extracted all student responses to the predetermined empathic opportunities within the deidentified SP interaction transcripts. Measures were taken to label the transcripts in each study group (backstory VP, empathy-feedback VP, and control VP) such that the source of the transcripts was not identifiable to the assessors.
Outcomes Measured by SPs
Outcomes measured by SPs included the following:
- Standardized patient communication checklist and the Medical Student Interview Performance Questionnaire (MSIPQ) mean rapport score.
- The students’ history taking skills, using the SP symptom checklist.
Standardized Patient Communication Checklist and Symptom Checklist
The SP With Depression
Standardized patients are real humans who portray clinical scenarios. Ryan Higgins31 is an SP case portraying a 25-year-old soldier who discloses symptoms of major depression and alcohol abuse32 after the death of his sister, with whom he was very close. Three experienced SPs from Georgia Regents University’s Clinical Skills Center were trained and tested by the authors and center staff before portraying Ryan Higgins in the study. Our SPs were slightly older than the patient whose role they played and were experienced in playing a variety of young male roles in our Clinical Skills Center. As part of the training, we informed the SPs that a real-life possibility of suicide contagion exists when playing such a role (albeit in younger individuals33) and that resources were available in case they were to develop such thoughts. After the completion of the study, none of the SPs reported feeling disturbed by playing the character. The SPs were blinded to the students’ study group assignment and evaluated the student interactions using 2 instruments as follows:
- The SP “communication checklist” is a 14-item list (7 items are rated as “yes” or “no” and the others as “agree” or “disagree”) capturing the medical students’ professional appearance, behavior, empathy, and rapport and is routinely used for medical student–SP interactions in Georgia Regents University’s Clinical Skills Center. In addition, we used the 5-item “Rapport Subscale” of the MSIPQ,34 a reliable instrument (Cronbach α = 0.79) that predicts patient satisfaction. The subscale items evaluate whether the student (1) seems stiff and unnatural, (2) appears warm and caring, (3) seems nervous and confused, (4) seems comfortable talking to the SP, or (5) talks down to the SP, on a 5-point scale (disagree, somewhat disagree, neutral, somewhat agree, agree, where disagree = 1 and agree = 5). Items 1, 3, and 5 were reverse scored. The maximum rapport score obtained by summing the items was 25.
- The SP “symptom checklist” is a 20-item questionnaire (rated as “yes” or “no”), indicating whether the medical students asked the SPs about symptoms of depression, alcohol use, as well as medical, social, and family history32 (presented in Table 1, Supplemental Data Content, http://links.lww.com/SIH/A261).
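Scoring of the rapport subscale above can be sketched as follows. This is an illustrative reading of the description, not the authors' scoring code: the negatively worded items 1, 3, and 5 are reverse scored as 6 minus the raw rating before the five items are summed.

```python
def msipq_rapport(raw_ratings):
    """Sum the 5 MSIPQ rapport items (each rated 1-5), reverse-scoring
    the negatively worded items 1, 3, and 5."""
    reverse_items = {0, 2, 4}  # 0-based positions of items 1, 3, and 5
    return sum(6 - r if i in reverse_items else r
               for i, r in enumerate(raw_ratings))

# eg, raw ratings: stiff=2, warm=5, nervous=1, comfortable=4, talks_down=3
score = msipq_rapport([2, 5, 1, 4, 3])
```

With this convention, higher summed scores always indicate better rapport, regardless of item wording direction.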
Student Satisfaction Outcome
An additional way to assess feasibility of the intervention was to obtain students’ satisfaction with the VP through a VP satisfaction survey completed online by students at the end of the VP interaction. The VP satisfaction survey had 31 items and was implemented online using Survey Monkey.35 Six items were adapted from the Maastricht Assessment of Simulated Patients36 (eg, the Maastricht Assessment of Simulated Patients item “SP appears to withhold information unnecessarily” became “The VP appeared to withhold information”). Other items inquired about VP-specific features (eg, the transcript), familiarity with mental illness,28 demographic data, students’ perception of the VP’s IQ, and the medical specialty they planned to pursue. The familiarity with mental illness28 component had 5 questions rated as yes = 1 and no = 0.
The student empathic response data from the deidentified SP interactions, the SP communication and symptom checklists, and the VP satisfaction survey data were stored on a research drive accessible only to study investigators. The research coordinators distributed the data on student empathic response to be coded by the trained, reliable assessors, who were blinded to the study groups.
All statistical analyses were performed using SAS 9.4, and statistical significance was assessed using an α level of 0.05 unless otherwise noted. Descriptive statistics were calculated on all variables overall and by group (empathy-feedback VP, backstory VP, control VP) for demographics, outcomes assessed by trained assessors, and by outcomes assessed by SPs.
Differences Among the 3 Groups for Demographic Variables
Differences among the 3 groups for demographic variables were examined using χ2 tests or 1-way analysis of variance (ANOVA). If differences were found between groups with the 1-way ANOVA models, a Tukey-Kramer multiple comparison procedure was used to examine post hoc pairwise differences among the 3 groups.
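The 1-way ANOVA used for these group comparisons reduces to an F statistic formed from between-group and within-group mean squares. A minimal stdlib-only sketch (the study itself performed these analyses in SAS 9.4):

```python
def one_way_anova_f(groups):
    """F statistic for a 1-way ANOVA: between-group MS / within-group MS."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand = sum(all_values) / n
    # Between-group sum of squares: group sizes times squared mean deviations
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: deviations from each group's own mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)   # df = k - 1
    ms_within = ss_within / (n - k)     # df = n - k
    return ms_between / ms_within

# Three hypothetical groups of scores (not study data)
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The F statistic is then compared against the F distribution with (k − 1, n − k) degrees of freedom; when it is significant, the Tukey-Kramer procedure localizes which pairwise group differences drive it.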
Outcomes Evaluated by Trained Assessors
We compared the outcomes coded by trained assessors (number of empathic opportunities, empathy scores overall, and empathy scores by type of empathy opportunity) between groups using 1-way ANOVA (number of empathic opportunities) or analysis of covariance (ANCOVA, empathy scores overall and by type) controlling for the number of opportunities elicited by each student during the SP interaction. To examine whether sex, race, specialty, mental health experience score, and the MSIPQ rapport score were significant covariates for empathy scores, ANCOVA was used. To examine whether there were differences in empathy scores between groups (empathy-feedback VP, backstory VP, control VP) by opportunity type (challenge, emotion, and symptom), a repeated-measures mixed model was used with main effects of group, opportunity type, and the interaction of group and opportunity type. An unstructured correlation structure between opportunity types was assumed. Subject nested within group was considered a random effect, and the repeated measure for each subject was the opportunity type. A Tukey-Kramer multiple comparison procedure was used to control for the multiple post hoc pairwise comparisons.
Outcomes Assessed by SPs
To examine differences in outcomes assessed by SPs (communication checklists, MSIPQ, and history-taking skills using the symptom checklist), 1-way ANOVA was used. Again, if the 1-way ANOVA showed that differences existed between groups, a Tukey-Kramer multiple comparison procedure was used to examine post hoc pairwise differences. We examined each checklist item to guide future research and development and not to make overarching conclusions or inferences. Thus, no adjustment to the overall α level was made for these independent checklist items.
Effect sizes (Cramér’s V for χ2 tests and η for 2-sample t tests), test statistics, and P values are presented. Effect sizes measured with Cramér’s V for χ2 tests were interpreted as small if 0.10, medium if 0.30, and large if 0.50.
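Cramér's V follows directly from the χ2 statistic of a contingency table. This minimal sketch (illustrative, not the authors' SAS code) computes both for a table of group-by-response counts:

```python
import math

def cramers_v(table):
    """Cramér's V for an r x c contingency table of counts."""
    n = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    # Pearson chi-square: sum of (observed - expected)^2 / expected
    chi2 = sum((table[i][j] - row_sums[i] * col_sums[j] / n) ** 2
               / (row_sums[i] * col_sums[j] / n)
               for i in range(len(table)) for j in range(len(table[0])))
    # Normalize by n times (min(rows, cols) - 1) so V lies in [0, 1]
    return math.sqrt(chi2 / (n * (min(len(table), len(table[0])) - 1)))

# eg, 2 groups x 2 responses ("yes"/"no"); counts are hypothetical
v = cramers_v([[10, 20], [20, 10]])
```

This example yields V ≈ 0.33, a medium association by the thresholds stated above.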
Figure 2 shows the study flow. All students completed all elements of the study. All available data were used in the analysis. The VP student satisfaction survey (31 items in total) and the SP checklist (34 items in total) were not set up for mandatory responses. The students and SPs occasionally left responses blank because of lack of recollection or inattention; thus, some data on these instruments were missing.
Descriptive statistics overall and by group as well as the results of the χ2, Fisher exact or t test for differences between groups for all variables are given in Table 2. The majority of the students were male and white, with a mean (SD) age of 23 (1.7) years. The students attributed a mean IQ of 103 to the VP. Only one student indicated they were of Hispanic ethnicity. There were no statistically significant differences in demographics between groups.
Number of Empathic Opportunities Elicited by Students in the SP Interactions
The students in the empathy-feedback VP group elicited significantly more empathic opportunities than students in the backstory VP and control VP groups (P = 0.0005) (Table 3). Because the number of opportunities elicited by students in each group was significantly different, we controlled for the number of opportunities in all subsequent SP analyses.
Students’ Empathy Toward the SP Overall and by Type of Empathic Opportunity (Emotion, Symptom, or Challenge)
Empathy Toward the SP Overall (Primary Outcome)
The students in the empathy-feedback VP group [mean (SD), 2.91 (0.16)] showed higher overall empathy (P = 0.0277) in the SP interaction than students in the backstory [mean (SD), 2.20 (0.22)] and control [mean (SD), 2.27 (0.21)] groups, but only the difference between the empathy-feedback VP and backstory VP groups was statistically significant. When empathic opportunities related to depressive symptoms were eliminated from the analysis (ie, when emotion and challenge opportunities were analyzed together), students in the empathy-feedback VP group [mean (SD), 3.12 (0.87)] showed significantly higher empathy (P = 0.0095) than those in the backstory group [mean (SD), 2.46 (0.58)]. When empathy scores were analyzed only for symptom opportunities, the empathy-feedback VP group [mean (SD), 2.62 (0.82)] showed significantly higher empathy (P = 0.0117) than the backstory [mean (SD), 2.05 (0.92)] and control [mean (SD), 2.01 (0.70)] groups (Table 3). Race, sex, specialty, mental health experience scores, and MSIPQ rapport scores (data not shown) were not significant covariates, and the models were reduced to the ANCOVA controlling for the number of opportunities.
Empathy Toward the SP by Type of Empathic Opportunity (Secondary Outcome)
We found a statistically significant interaction between study group and opportunity type in the results of the repeated-measures mixed models examining differences in students’ empathy in the SP interaction (F4,80 = 3.56, P = 0.0110) (Fig. 3). Within the empathy-feedback VP group and within the control VP group, the empathy scores were not significantly different between opportunity types (Fig. 3). The students in the backstory VP group responded with significantly lower empathy to emotion-type [mean (SD), 0.76 (0.38)] than to challenge-type [mean (SD), 2.70 (0.27)] (Tukey-Kramer adjusted P < 0.0001) and symptom-type [mean (SD), 2.07 (0.27)] (Tukey-Kramer adjusted P = 0.0375) opportunities. Within challenge opportunity, the empathy-feedback group [mean (SD), 3.37 (0.19)] responded with higher mean empathy score than did the backstory [mean (SD), 2.70 (0.27)] and control [mean (SD), 2.65 (0.27)] groups, although the difference was not significant (Fig. 3). Within symptom opportunity, the empathy-feedback group [mean (SD), 2.71 (0.22)] responded with higher mean empathy score than did the backstory [mean (SD), 2.07 (0.27)] and control [mean (SD), 2.00 (0.29)] groups, although the difference was not significant (Fig. 3). Within emotion-type opportunities, the students in the backstory group [mean (SD), 0.76 (0.38)] showed significantly lower empathy than those in the empathy-feedback [mean (SD), 2.48 (0.26)] (Tukey-Kramer adjusted P = 0.0025) and the control [mean (SD), 2.39 (0.34)] groups (Tukey-Kramer adjusted P = 0.0214) (Fig. 3).
Standardized Patients’ Rating of Students’ Communication and History-Taking Skills
Standardized Patient Communication Checklist
There were no significant differences in most of the communication checklist items or in the overall MSIPQ rapport scale scores between groups. However, 2 variables on the SP communication checklist and 1 variable within the MSIPQ rapport subscale did show statistically significant differences between groups. Students in the control group were less likely to have “offered encouraging, supportive, and/or empathetic statements” (control, 58%; backstory, 94%; empathy-feedback VP, 100%; P < 0.0001; effect size, 0.53), were less likely to have “developed a good rapport” with the SP (control, 76%; backstory, 95%; empathy-feedback VP, 100%; P = 0.0048; effect size, 0.37), and were less likely to appear warm and caring (control, 35%; backstory, 66%; empathy-feedback VP, 80%; P = 0.0157; effect size, 0.38) than those in the backstory and empathy-feedback VP groups.
Standardized Patient Symptom Checklist
There were no significant differences between groups in students’ history-taking skills (see Table 1, Supplemental Digital Content 1, http://links.lww.com/SIH/A261, showing the descriptive statistics for depression SP Ryan Higgins symptom checklist overall and by group).
Students’ Satisfaction With the VP Interaction
Table 2 (Supplemental Digital Content 2, http://links.lww.com/SIH/A262) shows the descriptive statistics for students’ satisfaction with the VP interaction.
On a 5-point Likert-type scale (poor, fair, average, good, and excellent where poor = 1 and excellent = 5), the students’ satisfaction with the VP ranged between 2.4 and 4.4 for various survey items. For example, students’ satisfaction with the overall VP interaction was rated with a mean (SD) of 3.30 (1.0) (survey question “How do you rate the overall interaction?”). For complete data on students’ satisfaction with the VP, please see Table 2 (Supplemental Digital Content 2, http://links.lww.com/SIH/A262). The following comment exemplifies the students’ views: “Sometimes, I felt that I had discovered something, like concentration difficulties, but the program didn’t record it. However, I think it is a nice way to practice interviewing patients. I felt less anxious about making mistakes and had more time to gather my thoughts.”
We show that a human-assisted intervention capable of offering immediate feedback on empathy after the VP interaction is associated with increased overall empathy, as demonstrated by students (1) eliciting significantly more empathic opportunities in an SP interaction and (2) showing higher empathy in response to these opportunities compared with the backstory group, although these changes were statistically significant only in some cases. This finding mirrors the results of live interventions,3,4 which consistently show that interview feedback increases empathy. The students in the backstory VP group did not differ from the control VP group in the empathy ratings by our trained assessors. Among the outcomes rated by SPs, the MSIPQ rapport score was not significantly different between groups. However, on communication checklist items that specifically refer to empathy, the ability to offer empathetic statements, forming rapport, and appearing warm and caring, the students in both the empathy-feedback VP and backstory VP groups scored significantly higher than the students in the control VP group. In this study, the first-year medical students who interacted with the empathy-feedback VP responded with a mean empathy score of 3.37 to the challenge-type opportunities offered by SPs (rated on a scale from 0 to 6), suggesting that the students acknowledged and sometimes asked follow-up questions when encountering a challenge-type empathic opportunity presented by the SP. This response is similar to the mean empathy in real clinician-patient interactions, coded with the same instrument.9,29,37 The cost of creating the backstory VP was minimal. The empathy-feedback VP was supported at minimal cost as well; the costs included training of the assessors and their presence for the duration of the study (approximately 14 days). The study data are currently being analyzed to train a natural language classifier.
This classifier will identify unique features that we anticipate will allow future systems to automatically rate empathetic statements typed by the user on the ECCS scale. Once this step is accomplished, we envision creating VP scenarios that include not only a set of symptoms and a history timeline but also empathic challenges that are characteristic of real-world patient encounters. Virtual patients already allow content standardization, repetitive practice, and the opportunity to practice critical scenario interviews safely. Adding an empathy dimension to VPs will make this technology better suited for integration into health care professions’ curricula.
We demonstrated earlier that medical students can show empathy in an interaction with a VP.20–22 In this study, we focused on the verbal manifestation of empathy. We acknowledge that verbal empathy is only one piece of the puzzle and that, beyond expert rating, we do not know how it relates to the complex process that shapes the patient-physician relationship. In addition, it is not known whether the students’ performance in an SP interview is predictive of clinical behavior because we did not assess the students’ empathy further along in their medical training. We performed the study at only one institution with a small sample of students. We did not explore the students’ satisfaction with the SP interaction, although such an assessment could further validate our teaching intervention. Not having used a self-rated measure of empathy with separate dimensions addressing the cognitive and emotional components of empathy38 at the onset of the study leaves us without a correlation among the expert-rated, patient-rated, and self-rated evaluations of empathy. In addition, as suggested in the student feedback, the addition of voice recognition for the VP would be desirable. Finally, as in all cases where multiple independent outcomes are measured, some of the differences found could have been attributable to random variation.
Conclusions and Future Directions
An enhancement that offers immediate feedback on empathy in a VP interaction increases students’ empathy in encounters with real people, as coded by trained assessors, whereas the backstory VP does not show the same effect. The empathy-feedback VP seems promising as an educational tool that can impact students’ empathy as rated by trained assessors and SPs. Virtual patients could represent a tool to teach and reinforce empathic communication, along a continuum of other interventions for empathy education. We offered, with relatively low cost, proof of concept that an empathy-feedback VP can change medical students’ behavior in an SP interview. Work is currently being performed to allow future systems to automatically code empathetic statements typed by the VP user on the ECCS scale.
References
1. Hojat M, Louis DZ, Markham FW, Wender R, Rabinowitz C, Gonnella JS. Physicians' empathy and outcomes for diabetic patients. Acad Med 2011;86:359–364.
2. Rakel D, Barrett B, Zhang Z, et al. Perception of empathy in the therapeutic encounter: effects on the common cold. Patient Educ Couns 2011;85:390–397.
3. Stepien KA, Baernstein A. Educating for empathy. A review. J Gen Intern Med 2006;21:524–530.
4. Batt-Rawden SA, Chisolm MS, Anton B, Flickinger TE. Teaching empathy to medical students: an updated, systematic review. Acad Med 2013;88:1171–1177.
5. Charon R. The patient-physician relationship. Narrative medicine. A model for empathy, reflection, profession and trust. JAMA 2001;286(15):1897–1902.
6. Chen D, Lew R, Hershman W, Orlander J. A cross-sectional measurement of medical student empathy. J Gen Intern Med 2007;22(10):1434–1438.
7. Newton BW, Savidge MA, Barber L, et al. Differences in medical students' empathy. Acad Med 2000;75:1215.
8. Hojat M, Vergare MJ, Maxwell K, et al. The devil is in the third year: a longitudinal study of erosion of empathy in medical school. Acad Med 2009;84:1182–1191.
9. Bylund CL, Makoul G. Empathic communication and gender in the physician-patient encounter. Patient Educ Couns 2002;48:207–216.
10. Hojat M, Gonnella JS, Nasca TJ, Mangione S, Veloski JJ, Magee M. The Jefferson Scale of Physician Empathy: further psychometric data and differences by gender and specialty at item level. Acad Med 2002;77(Suppl 10):558–560.
11. DiGioia A, Greenhouse P. Patient and family shadowing: creating urgency for change. J Nurs Adm 2011;41:23–28.
12. DiGioia A 3rd, Lorenz H, Greenhouse PK, Bertoty DA, Rocks SD. A patient-centered model to improve metrics without cost increase: viewing all care through the eyes of patients and families. J Nurs Adm 2010;40(12):540–546.
13. Henry-Tillman R, Deloney LA, Savidge M, Graham CJ, Klimberg VS. The medical student as patient navigator as an approach to teaching empathy. Am J Surg 2002;183:659–662.
14. Wilkes M, Milgrom E, Hoffman JR. Towards more empathic medical students: a medical student hospitalization experience. Med Educ 2002;36:528–533.
15. Moulton CA, Tabak D, Kneebone R, Nestel D, MacRae H, LeBlanc VR. Teaching communication skills using the integrated procedural performance instrument (IPPI): a randomized controlled trial. Am J Surg 2009;197(1):113–118.
16. Stevens A, Hernandez J, Johnsen K, et al. The use of virtual patients to teach medical students history taking and communication skills. Am J Surg 2006;191:806–811.
17. Shah H, Rossen B, Lok B, Londino D, Lind SD, Foster A. Interactive virtual-patient scenarios: an evolving tool in psychiatric education. Acad Psychiatry 2012;36:146–150.
18. Foster A, Chaudhary N, Murphy J, Lok B, Waller J, Buckley PF. The use of simulation to teach suicide risk assessment to health profession trainees-rationale, methodology, and a proof of concept demonstration with a virtual patient. Acad Psychiatry 2015;39(6):620–629.
20. Kleinsmith A, Rivera-Gutierrez D, Finney G, Cendan J, Lok B. Understanding empathy training with virtual patients. Comput Human Behav 2015;52:151–158.
21. Deladisma AM, Cohen M, Stevens A, et al. Do medical students respond empathetically to a virtual patient? Am J Surg 2007;193:756–760.
22. Foster A, Harms J, Ange B, et al. Empathic communication in medical students' interactions with mental health virtual patient scenarios: a descriptive study using the Empathic Communication Coding System. Austin J Psychiatry Behav Sci 2014;1(3):6.
23. Borish M, Cordar A, Foster A, Kim T, Murphy J, Lok B. Utilizing real-time human-assisted virtual humans to increase real-world interaction empathy. Kansei Engineering & Emotion Research (KEER'14) 2014:15.
24. McClendon JL, Mack NA, Hodges LF. The use of paraphrase identification in the retrieval of appropriate responses for script based conversational agents. Proceedings of the Twenty-Seventh International Florida Artificial Intelligence Research Society Conference. 2014.
25. Kittur A, Chi EH, Suh B. Crowdsourcing user studies with Mechanical Turk. New York, NY: Proceedings of the SIGCHI conference on human factors in computing systems, ACM; 2008:453–456.
26. Rossen B, Lind S, Lok B. Human-centered distributed conversational modeling: efficient modeling of robust virtual human conversations. In: Intelligent Virtual Agents. Berlin, Germany: Springer; 2009:474–481.
27. Cordar A, Borish M, Foster A, Lok B. Building virtual humans with back stories: training interpersonal communication skills in medical students. In: Bickmore T, et al., eds. Intelligent Virtual Agents 2014, LNAI 8637. Switzerland: Springer International Publishing; 2014;144–153.
28. Holmes EP, Corrigan PW, Williams P, Canar J, Kubiak M. Changing attitudes about schizophrenia. Schizophr Bull 1999;25(3):447–456.
29. Bylund CL, Makoul G. Examining empathy in medical encounters: an observational study using the Empathic Communication Coding System. Health Commun 2005;18(2):123–140.
30. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–174.
31. Goldenberg M, Hamaoka D, Money K, Flanagan Risdal A, Darby E. Standardized patient case: a troubled soldier. MedEdPORTAL 2011. Available at: www.mededportal.org/publication/8597
32. DSM-IV-TR. Diagnostic and Statistical Manual of Mental Disorders, Text Revision. Arlington, VA: American Psychiatric Association; 2000.
33. Fallucco EM, Hanson MD, Glowinski AL. Teaching pediatric residents to assess adolescent suicide risk with a standardized patient module. Pediatrics 2010;125(5):953–959.
34. Black AE, Church M. Assessing medical student effectiveness from the psychiatric patient’s perspective: The Medical Student Interviewing Performance Questionnaire. Med Educ 1998;32(5):472–478.
36. Wind LA, Van Dalen J, Muijtjens AM, et al. Assessing simulated patients in an educational setting: the MaSP (Maastricht Assessment of Simulated Patients). Med Educ 2004;38(1):39–44.
37. Goodchild CE, Skinner TE, Parkin T. The value of empathy in dietetic consultations. A pilot study to investigate its effect on satisfaction, autonomy and agreement. J Hum Nutr Diet 2005;18:181–185.
38. Davis MH. Measuring individual differences in empathy: evidence for a multidimensional approach. J Pers Soc Psychol 1983;44(1):113–126.