
A Method for Measuring the Effectiveness of Simulation-Based Team Training for Improving Communication Skills

Blum, Richard H. MD, MSE*∥; Raemer, Daniel B. PhD†∥; Carroll, John S. PhD‡∥; Dufresne, Ronald L.§∥; Cooper, Jeffrey B. PhD†∥

doi: 10.1213/01.ANE.0000148058.64834.80
Technology, Computing, and Simulation: Research Report

Team behavior and coordination, particularly communication or team information-sharing, are critical for optimizing team performance, yet research in medicine generally provides no accepted method for measuring team information-sharing. In a controlled simulator setting, we developed a technique for placing clinical information (probes) with members of a team of trainees participating in a 1-day Anesthesia Crisis Resource Management course and later tested the teams for knowledge of the probes as an indicator of overall team information-sharing. Despite the low level of team information-sharing, we demonstrated construct validity of the probe methodology by the correlation of measured change in team information-sharing from beginning to end of training with self-rated change. There was no statistical difference in “group sharing” from beginning to end of training, despite trainees’ survey responses that the course would be useful for their education and practice.

IMPLICATIONS: In a controlled simulator setting, a method for measuring communication among team members was developed using information probes placed in a clinical scenario with trainees participating in an Anesthesia Crisis Resource Management course. Knowledge of the probes was used as a measure of team information-sharing.

*Department of Anesthesia, Perioperative and Pain Medicine, Children’s Hospital Boston; †Department of Anesthesia and Critical Care, Massachusetts General Hospital; ‡Sloan School of Management, Massachusetts Institute of Technology; §The Carroll School of Management, Boston College, Boston; and ∥Center for Medical Simulation, Cambridge, Massachusetts

This work was supported by a grant from the Anesthesia Patient Safety Foundation.

Accepted for publication October 4, 2004.

Address correspondence and reprint requests to Richard H. Blum, MD, MSE, Department of Anesthesia, Perioperative and Pain Medicine, Children’s Hospital Boston, 300 Longwood Ave., Bader 3, Boston, MA 02115. Address e-mail to

Despite improvements in the safety of medical practice, human errors and system failures continue to have a substantial role in adverse outcomes in health care (1–3). Team performance has been identified as an important focus for safety and quality improvement, illustrated by efforts to train operating room (OR) teams in behaviors found useful in airplane cockpit crews (4–6). There is considerable evidence that team communication or information-sharing is one of the most important team behaviors for developing common situation assessment, overcoming groupthink, stimulating creative problem solving, learning from experiential feedback, encouraging full participation, and achieving enhanced performance (7–9).

Rigorous methods to assess communication in high performing teams in acute settings are difficult to apply because direct observation and/or videotape coding are time consuming, logistically complex, and require training and certification of observers/coders to achieve acceptable inter-rater reliability. In prior studies, the degree of inter-rater reliability typically has not been very high and it becomes even more problematic when behavioral skills, as compared with technical skills, are considered; however, some studies had better success (10–13). Our goal was to develop a new measurement technique and test its validity as applied to team communication during critical events in a highly realistic simulated OR setting.

OR teams exemplify highly interactive cross-functional groups whose performance is primarily determined by team composition and communication. Team members must have technical knowledge and skills as individuals, but also must work together by sharing information effectively (14–16). Edmondson (17) showed OR teams learned a new surgical technique more rapidly if there was a climate of openness that encouraged a free exchange of information; team leaders were essential in establishing that climate by allowing others to feel safe in bringing up information. Similarly, a study using a team-training curriculum showed information-sharing to be one of the major determinants of team performance (7).

Team performance is reduced when members fail to share unique information with other members of the team (18,19). Stasser (18) showed that team members who have some common and unique information tend to spend time discussing common information, fail to share enough unique information, and therefore make poor decisions. Real-world studies of high-level political decisions (20) and airline crashes (21,22) demonstrate that groups make poor decisions when members fail to speak assertively.

We studied teams of anesthesiology faculty or residents participating in a 1-day training program in a simulated OR setting. Individual team members were given “probes” or pieces of specific, potentially important information for patient management. The focus of the study was “team information-sharing,” defined as the extent to which team members conveyed probes to other team members and measured by the number of team members who reported knowing the probe on a post-scenario written questionnaire. The definition and measures of team information-sharing are sufficiently general to apply to a broad range of medical as well as nonmedical teams.

Because there is no accepted criterion for measuring team information-sharing, construct validity was assessed by demonstrating the relationship of the new measures to theoretically relevant variables (see Hypotheses 2 and 3 below). The hypotheses of this study were:

  1. There will be a low rate of information-sharing among team members during simulated critical events. We have observed this repeatedly in courses that are designed to improve such behaviors during critical events but it has never been quantified.
  2. Any change in measured team information-sharing from the first scenario of the day to the last will correlate with team members’ self-reported change.
  3. There will be increased information-sharing from the first scenario of the day to the last, as intended by the training.


Ethical approval was obtained from the IRBs of the institutions affiliated with this study before the enrollment of any subjects. Informed consent was obtained from all subjects in both the pilot and experimental teams, who were drawn from faculty, fellows, and residents (second or third year of training) of the four hospitals assigned to attend regularly scheduled Anesthesia Crisis Resource Management (ACRM) courses. Because there are many continuing simulation programs, quantifying and meaningfully comparing prior simulation experience was not considered practical and was therefore not attempted. All participants in a given course composed a temporary team, resulting in teams of 3–5 members (mean = 3.9, sd = 0.70). Teams were composed of trainees from all four affiliated hospitals. There were 22 pilot teams studied over an 8-mo period to develop the methodology and 10 experimental teams studied over a 4-mo period (7 faculty and 3 resident/fellow courses). The subjects were not given any information about the purpose or methods of the study; they were told only that we were looking at general aspects of team behaviors.



The training used a realistic simulation-based ACRM curriculum developed by Gaba et al. (4–6) and derived from Crew Resource Management training for aviation crews. Experiments were conducted during regularly scheduled ACRM courses, using a computer-controlled mannequin with human-like features and physiologic functions in a highly realistic, simulated medical environment. The anesthesiology teams performed a set of tasks that require actions essentially identical to those used during actual patient care. The remainder of the clinical personnel (surgeon, scrub nurse, technician, radiologist, etc.) were role-playing staff experienced in the domain. Scenarios involved responding to challenging clinical events. Each team received three or four training scenarios throughout the day, each followed by debriefing emphasizing crisis resource management concepts including effective communication among team members. The evolution of the anesthesia teams was natural and no attempt was made to assign specific roles or tasks to the trainees.


Pilot Teams

Twenty-two pilot team trials were conducted to develop and refine the methodology. In this developmental stage, we overcame several challenges including definition of the “team,” probe development, and appropriate wording of our questionnaires. Probes were developed by the research staff, reviewed by multiple domain experts associated with our center, then tested to identify ones that a majority of the trainees rated as important for optimal patient care. Iterative changes were made until a majority agreed that the information was important. Finally, we tested various versions and formats of post-scenario questions and found that the questions had to be very specific to ascertain reliably whether and how information had been shared.


Experimental Teams

Teams were randomly assigned to receive one of two test scenarios, A or B, during the first session of the day and the other scenario during the final session of the day. Five teams received A first, and five received B first. The test scenarios each had four probes to be given to team members.

Scenario A was a respiratory arrest of a complex surgical patient who was 2 days post-motor vehicle collision and required a splenectomy and a cervical spine fusion. The patient had developed a fever and was brought to the computed tomography (CT) scanner to assess whether an abdominal abscess had developed. The patient’s head and neck were stabilized in a halo fixation device. The entire anesthesia team (ACRM trainees) was called to the CT scanner to manage a respiratory arrest of unknown etiology. The four probes for this scenario were that the patient was receiving a nebulizer treatment for respiratory distress, was human immunodeficiency virus positive, was on a morphine infusion, and had a steering wheel mark on his chest.

Scenario B was a trauma patient who had been involved in a motor vehicle collision and returned to the hospital after leaving earlier in the day against medical advice. The initial patient workup showed a possible splenic fracture. The patient was taken to the OR for exploratory laparotomy after reevaluation in the emergency room. As team members arrived, the patient was in the OR with pain and distress. The four probes were that the patient had received 4–5 L of crystalloid in the emergency room, had a shadow on the chest radiograph, had a positive cocaine toxicology test, and had been given cefoxitin en route to the OR.



Team members wore and were identified by randomly assigned color-coded nametags. During each scenario, different role-playing staff (e.g., surgeon or nurse) placed the probes by isolating one of the team members and telling or demonstrating the predefined information. To maximize our ability to track the information-sharing reliably, every effort was made to convey the information privately without giving other team members an occasion to overhear the probe. Immediately after the scenario, probe-placers documented whether they believed they were able to place the probe effectively and to whom. Each trainee was asked to complete a questionnaire about their knowledge of the four probes, the source of their information, and whether they believed this information was important for optimal care of the patient (yes or no). Probe awareness was measured by specific questions (e.g., “Were you aware that the patient had a steering wheel mark on his chest?”) followed by a multiple-choice question about how they obtained the information (unaware; told or demonstrated by anesthesia team member, role-playing simulation staff, or unsure; self-discovered; and other source).

Immediately after completion of the questionnaires, trainees’ responses were reviewed in an open discussion led by the investigators. If for any reason the trainees believed their answers did not reflect the actual events, they made changes and notations on the questionnaire.

During debriefing of the scenario, a trained faculty or staff facilitator guided a self-discovery process with the aim of educating trainees about the importance of information-sharing. This included review of the questionnaire illustrating how sharing of probe information may have aided patient management and review of communication factors such as creating an open atmosphere, communicating directly and succinctly, and using repeat back (closed-loop) communication.

After the first scenario, a lecture on ACRM principles was presented, as is routine for these courses (6). During subsequent scenarios and preceding the final scenario, instructors facilitated discussions of ACRM principles, reviewed videotapes of the scenario with the team, and discussed methods to improve performance. There were no probes introduced into the nonexperimental scenarios.

During the final scenario of the day, four probes were placed. The role-playing staff members and the individual trainees completed questionnaires as in the first scenario.

At the completion of the course, trainees completed a questionnaire to assess their perception of changes in information-sharing from the first to the last scenario, and whether the communication intervention was useful to their education and future practice. Three neutrally worded questions used a 7-point response scale from significantly improved through significantly deteriorated (3 = significantly improved; 2 = improved; 1 = somewhat improved; 0 = no change; −1 = somewhat deteriorated; −2 = deteriorated; −3 = significantly deteriorated). The average score from all the team members’ responses was used as the self-perceived change score.
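The scoring described above can be illustrated with a short Python sketch. It is not the authors' analysis code: the names `RESPONSE_SCALE` and `self_perceived_change` are hypothetical, the example responses are invented for illustration, and the sketch assumes one response per team member to the question about change in information-sharing.

```python
# Numeric codes for the 7-point response scale described in the text.
RESPONSE_SCALE = {
    "significantly improved": 3,
    "improved": 2,
    "somewhat improved": 1,
    "no change": 0,
    "somewhat deteriorated": -1,
    "deteriorated": -2,
    "significantly deteriorated": -3,
}


def self_perceived_change(responses):
    """Average the coded responses of all members of one team."""
    if not responses:
        raise ValueError("at least one response is required")
    scores = [RESPONSE_SCALE[r] for r in responses]
    return sum(scores) / len(scores)


# Hypothetical four-member team (illustrative responses, not study data).
example_team = [
    "improved",
    "somewhat improved",
    "no change",
    "significantly improved",
]
print(self_perceived_change(example_team))  # (2 + 1 + 0 + 3) / 4 = 1.5
```

Because the scale is symmetric around 0, a positive team average indicates perceived improvement and a negative average indicates perceived deterioration.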


Success of Probe Placement

Because of the complex team responses and behaviors surrounding the teams’ care of the acute cases, it was important that we study only those probes successfully placed with only one anesthesia team member. The questionnaire data at times had conflicting information about the flow of information. The rules for successful probe placement were:

  1. The probe placer was confident that the probe was placed with one team member and had not been overheard, and either
     a. the team member was confident they received the information from the probe placer, or
     b. the team member was unsure from whom they received it, but no other team member reported overhearing the probe.
  2. The probe placer was confident that the probe was given to one team member but unsure whether it had been overheard, and that team member was sure they had received the information from the probe placer, but no other team member reported overhearing the probe.

If two team members reported receiving the probe information from a probe placer, even if the probe placer was confident it was not overheard, the probe was considered not properly placed (possibly overheard).
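One way to read these placement rules is as a small decision function. The Python sketch below is an interpretation under stated assumptions, not the study's actual procedure: the function name, the argument names, and the encoding of the recipient's answer as `"placer"`, `"unsure"`, or another source are all hypothetical.

```python
def probe_properly_placed(
    placer_sure_placed,
    placer_sure_not_overheard,
    recipient_source,
    other_members_reporting,
):
    """Apply the placement rules to one probe.

    placer_sure_placed: placer was confident the probe reached one member.
    placer_sure_not_overheard: placer was confident nobody overheard it.
    recipient_source: what the intended recipient reported as the source
        ("placer", "unsure", or any other label).
    other_members_reporting: number of other team members who reported
        receiving or overhearing the probe from the probe placer.
    """
    if not placer_sure_placed:
        return False
    # A second member reporting the placer as source overrides the
    # placer's confidence: the probe counts as possibly overheard.
    if other_members_reporting > 0:
        return False
    if placer_sure_not_overheard:
        # Rule 1: recipient is sure of the placer, or merely unsure.
        return recipient_source in ("placer", "unsure")
    # Rule 2: placer unsure about overhearing, so the recipient must be sure.
    return recipient_source == "placer"


print(probe_properly_placed(True, True, "unsure", 0))   # True (rule 1)
print(probe_properly_placed(True, False, "unsure", 0))  # False (rule 2 not met)
print(probe_properly_placed(True, True, "placer", 1))   # False (possibly overheard)
```

The check for other members comes first because the text treats a second report of receiving the probe as disqualifying regardless of how confident the probe placer was.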



To measure whether probe information was shared with other team members, we defined Group Sharing as the percentage of probe information shared relative to the maximum possible information-sharing, counting only those probes that had been properly placed. For example, if 3 of the 4 probes were properly placed in a team of 4 members, the maximum amount of sharing was 9 (each probe could be shared with up to 3 others, times 3 probes).
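The Group Sharing calculation can be sketched in Python as follows. This is an illustrative reading of the definition, not the authors' analysis code; the function name `group_sharing` and the input layout (for each properly placed probe, the number of other team members who reported knowing it) are assumptions, and the counts in the example are invented.

```python
def group_sharing(team_size, shared_counts):
    """Return Group Sharing as a percentage, or None if nothing can be scored.

    team_size: number of trainees on the team.
    shared_counts: one entry per properly placed probe, giving how many
        team members other than the original recipient knew the probe.
    """
    if team_size < 2:
        raise ValueError("a team needs at least two members")
    if not shared_counts:
        return None  # no properly placed probes in this scenario
    max_per_probe = team_size - 1
    if any(c < 0 or c > max_per_probe for c in shared_counts):
        raise ValueError("each count must be between 0 and team_size - 1")
    max_sharing = max_per_probe * len(shared_counts)
    return 100.0 * sum(shared_counts) / max_sharing


# Worked example from the text: 4 team members, 3 properly placed probes,
# so the maximum possible sharing is 3 * 3 = 9.
# The per-probe counts below are hypothetical.
print(group_sharing(4, [1, 0, 2]))  # 3 of 9 possible shares, about 33.3
```

Returning `None` when no probe was properly placed keeps such scenarios out of team averages instead of counting them as zero sharing, which is one possible convention and not something the text specifies.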



Of the 80 probe placements (10 teams, 2 scenarios per team, 4 probes per scenario), 46 were successful (56%, see Table 1 for team data). Twenty-six (33%) were overheard or possibly overheard, 6 (8%) were not placed or possibly not placed, and 2 (3%) were uncertain. There were no statistical differences for successful probe placement for scenario, order, or subject type using table analysis (P values all > 0.2). Very rarely did trainees change their mind after discussion as to whether they had been told or demonstrated probe information, had discovered the information themselves, or did not know the information (3 of 176 responses or 1.7%).

Table 1


All 8 probes were rated as important for optimal care by a majority of participants (average 85%, sd 13%, range 58%–100%). Only the human immunodeficiency virus-positive probe in the CT scanner scenario was rated as important by fewer than 75% of participants.

For the Group Sharing measure, the average team information-sharing was 27% (Table 1). There was no statistical difference between the first (28%) and the last scenario (26%), or between the Trauma scenario (Scenario B, 34%) and the CT scenario (Scenario A, 20%), although the Trauma scenario advantage approached significance (comparing the average team Trauma–CT scenario score in Table 1 against 0, t = 1.93, df = 9, P < 0.10). In 15 of the 20 scenarios (2 per team for 10 teams), at least 1 probe was shared.
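The scenario comparison above tests per-team Trauma minus CT differences against 0. A minimal Python sketch of that kind of one-sample t statistic follows; the function name `one_sample_t` is hypothetical and the example differences are invented, because the per-team values in Table 1 are not reproduced here.

```python
import math


def one_sample_t(differences, mu0=0.0):
    """Return (t, df) for testing the mean of `differences` against mu0."""
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences")
    mean = sum(differences) / n
    # Sample variance with the n - 1 denominator.
    variance = sum((d - mean) ** 2 for d in differences) / (n - 1)
    if variance == 0:
        raise ValueError("differences have zero variance")
    t = (mean - mu0) / math.sqrt(variance / n)
    return t, n - 1


# Hypothetical per-team differences in Group Sharing (percentage points).
example_differences = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df = one_sample_t(example_differences)
print(round(t_stat, 3), df)
```

With 10 teams the test has 9 degrees of freedom, as reported. In standard t tables the two-tailed critical values for df = 9 are 1.833 at the 0.10 level and 2.262 at the 0.05 level, so t = 1.93 falls between them, consistent with the reported P < 0.10.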

Trainees evaluated their team’s communication performance comparing the first to the last scenario, reporting an average improvement score of 1.76 (between “somewhat improved” and “improved,” sd 0.94). They reported that the communication training exercises after the first and last scenario “were helpful to my understanding of communication skills” (average improvement score of 2.25, sd 0.59) and “will help improve my communication in my actual clinical setting” (average improvement score of 2.21, sd 0.54). Self-ratings of improvement during the day correlated significantly with measured change in group sharing (r = 0.65, P < 0.05; Fig. 1).
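The correlation between self-rated and measured change can be illustrated with a plain Pearson coefficient. The Python sketch below is a hedged illustration, not the authors' analysis: the function names are hypothetical, the example data are invented, and the significance check assumes the correlation was computed across the 10 experimental teams.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length >= 3")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Hypothetical team-level data: self-rated change vs. measured change.
self_rated = [1.0, 1.5, 2.0, 2.5, 3.0]
measured = [-10.0, 0.0, 5.0, 20.0, 15.0]
print(round(pearson_r(self_rated, measured), 3))

# Check on the reported value, assuming n = 10 teams.
print(round(t_from_r(0.65, 10), 2))  # about 2.42
```

Under the n = 10 assumption, r = 0.65 corresponds to t of about 2.42 with 8 degrees of freedom, which exceeds the two-tailed 0.05 critical value of 2.306 and agrees with the reported P < 0.05.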

Figure 1




We demonstrated the plausibility of measuring team information-sharing by inserting and tracing probes in simulation-based training sessions. Although it was possible to insert the probes without revealing the focus on information-sharing or the experimental methodology, it was difficult to design probes that were salient yet not obvious and to insert them without their being overheard (approximately one-third were possibly overheard).

Through a written questionnaire, we were able to measure the degree to which information was shared among team members in a simulated critical event. Trainees frequently did not remember how or from whom they received the information probe and sometimes their recollection was logically inconsistent (e.g., a participant claimed to have been given information by another trainee who, in turn, denied knowledge of the information and had never been given the information directly). Although memory for the source of information was poor, as has been found in other studies of witness memory (23,24), information recall itself was generally reliable.

Interestingly, salient clinical information given to individuals was only shared with 27% of others, consistent with research on team communication showing team members are much more likely to share redundant, common information than to share unique information (18,19,25). This is an important finding about teamwork in health care settings, supporting Hypothesis 1, and argues for development of effective techniques to improve this critical behavior. It may be that reluctance to share information was a product of the unfamiliar simulated environment.

After training, trainees perceived improvement in information-sharing but there was no change as measured by differences in probe transmission rates from the first to last scenarios, failing to support Hypothesis 3. We cannot determine if this was caused by insensitivity of the information probe method, or an inadequacy of the training to elicit an immediate behavioral change in participants. However, differences in measured probe transmission rates did correlate with self-rated improvement in information-sharing, a demonstration of construct validity supporting Hypothesis 2. Along with its face validity, this argues that probe transmission rates measure team information-sharing.

The fact that teams did not increase information-sharing with training is not a completely unexpected finding. Complex behavioral skills, such as team communication, may be difficult to change. Improving team communication may require more time for reflection, training at work, or more intensive simulator training.

Future research should be aimed at improving this methodology and at continued measurement of its validity and reliability. Given that information transfer was low, future investigators should consider the use of more obvious probes to maximize information-sharing. Potential ways to measure construct validity include: 1. Improving the intervention, either in the one-day training model or by attempting a more complex longitudinal intervention using some combination of the simulated environment, the real clinical setting, and other training techniques such as web-based educational programs; 2. Comparing intact teams that are known to differ in teamwork quality; and 3. Comparing the probe measure with other measures that have yet to be completely validated, such as videotape or real-time observation by expert raters. An additional area of research, suggested by the extremely positive response to this methodology on the post-course questionnaire, is how to use the probe methodology as a training tool to improve team information-sharing.

The use of planted probes to test for information-sharing seems to be a potentially viable research tool for assessing team communication and performance, although it is in need of further development. We intend to further develop training methods to improve medical team information-sharing skills that are essential for optimal team performance and to continue to assess and improve our ability to measure changes in team performance.

The authors acknowledge the significant effort and support of the staff and faculty of the Center for Medical Simulation for this study.



1. Kohn L, Corrigan J, Donaldson M. To err is human: building a safer health system. Washington, DC: National Academy Press, 2000.
2. Leape LL, Brennan TA, Laird N, et al. The nature of adverse events in hospitalized patients: results of the Harvard Medical Practice Study II. N Engl J Med 1991;324:377–84.
3. Starfield B. Is US health really the best in the world? JAMA 2000;284:483–5.
4. Howard SK, Gaba DM, Fish KJ, et al. Anesthesia crisis resource management training: teaching anesthesiologists to handle critical incidents. Aviat Space Environ Med 1992;63:763–70.
5. Gaba DM, Fish KJ, Howard SK. Crisis management in anesthesiology. New York: Churchill Livingstone, 1994.
6. Holzman RS, Cooper JB, Gaba DM, et al. Anesthesia crisis resource management: real-life simulation training in operating room crises. J Clin Anesth 1995;7:675–87.
7. Morey JC, Simon R, Jay GD, et al. Error reduction and performance improvement in the emergency department through formal teamwork training: evaluation results of the MedTeams project. Health Serv Res 2002;37:1553–81.
8. Edmondson A, Bohmer R. Disrupted routines: team learning and new technology implementation in hospitals. Adm Sci Q 2001;46:685–716.
9. Hackman JR. Groups that work (and those that don’t). San Francisco: Jossey-Bass, 1989.
10. Gaba DM, Howard SK, Flanagan B, et al. Assessment of clinical performance during simulated crises using both technical and behavioral ratings. Anesthesiology 1998;89:8–18.
11. Devitt JH, Kurrek MM, Cohen MM, Cleave-Hogg D. The validity of performance assessments using simulation. Anesthesiology 2001;95:36–42.
12. Forrest FC, Taylor MA, Postlethwaite K, Aspinall R. Use of a high-fidelity simulator to develop testing of the technical performance of novice anaesthetists. Br J Anaesth 2002;88:338–44.
13. Schwid HA, Rooke GA, Carline J, et al. Evaluation of anesthesia residents using mannequin-based simulation: a multi-institutional study. Anesthesiology 2002;97:1434–44.
14. Helmreich RL, Schaefer H. Team performance in the operating room. In: Bogner M, ed. Human error in medicine. Hillsdale, NJ: Lawrence Erlbaum Associates Publishers, 1994:225–53.
15. Bowers CA, Braun CC, Kline PB. Communication and team situational awareness. In: Gilson RD, Garland DJ, Koonce JM, eds. Situational awareness in complex systems: proceedings of a CAHFA conference. Daytona Beach, FL: Embry-Riddle Aeronautical University Press, 1994:305–11.
16. Weick KE, Sutcliffe KM, Obstfeld D. Organizing for high reliability: processes of collective mindfulness. Res Org Behav 1999;21:81–123.
17. Edmondson A. Psychological safety and learning behavior in work teams. Adm Sci Q 1999;44:350–83.
18. Stasser G. Pooling of unshared information during group discussion. In: Worchel S, Wood W, Simpson JA, eds. Group process and productivity. Newbury Park, CA: Sage, 1992:48–67.
19. Wittenbaum GM, Stasser G. Management of information in small groups. In: Nye JL, Brower AM, eds. What’s social about social cognition. Newbury Park, CA: Sage, 1996:3–28.
20. Janis IL. Groupthink: psychological studies of policy decisions and fiascoes. 2nd ed. Boston: Houghton Mifflin, 1982.
21. Helmreich RL, Foushee HC. Why crew resource management? Empirical and theoretical bases of human factors training in aviation. In: Weiner EL, Kanki BG, Helmreich RL, eds. Cockpit resource management. New York: Academic Press, 1993:3–46.
22. International Civil Aviation Organization. Human factors management and organization. In: Human factors digest. Montreal, Canada, 1993.
23. Wells GL, Loftus EF. Eyewitness memory for people and events. In: Goldstein AM, ed. Handbook of psychology. Vol 11. Forensic psychology. New York: John Wiley & Sons, 2002:149–60.
24. Read JD. Understanding bystander misidentifications. In: Ross DF, Read JD, Toglia MP, eds. Adult eyewitness testimony: current trends in development. New York: Cambridge University Press, 1994:56–79.
25. Larson JR, Foster-Fishman PG, Keys CB. Discussion of shared and unshared information in decision-making groups. J Pers Soc Psychol 1994;67:446–61.
© 2005 International Anesthesia Research Society