Over the past decade, advanced medical simulation in anesthesia has developed to a point where it is now “poised to become ubiquitous” in teaching curricula (1) and in team crisis resource management (2,3).
Another possible application for full-scale simulation is in the evaluation and accreditation of anesthesia trainees and graduates. Full-scale simulation has been used to assess medical student performance in anesthesia-related scenarios (4), to detect gaps in medical student knowledge in anesthesia (5,6), to observe and quantify technical performance of novice anesthesiologists (7), and in the process of evaluating physicians with lapsed medical skills (8). Other investigators have conducted studies aimed at development of valid, reliable, and realistic simulation-based evaluation methods (9–13). One such multi-institutional study was performed in 10 centers and involved 99 anesthesia residents (14). Another aspect of evaluation is the development of a behavioral rating system (15) by researchers interested not only in commonly assessed technical skills but also in behavioral performance assessment.
Despite the efforts involved in the development of valid and reliable simulation-based evaluation tools, a recent review concluded that development in this area has not yet progressed enough to justify its use in formal, summative evaluation of competence in anesthesia (16). Moreover, a recent international survey found that only 7%–14% of simulation centers are using these methods for evaluation of competence. The suggested reasons for this apparent under-utilization include the lack of research in this area, and the lack of standardized, valid, and reliable tests (17). An accompanying editorial to that article, entitled “Anesthesiology Simulators: Networking Is the Key,” expressed the belief that communication and collaboration among centers involved in simulator programs is of paramount importance to the future of this technology (18).
The combined effort of the anesthesia simulation community is needed to overcome the obstacles of using simulation for testing and evaluation. To lay the groundwork for collaboration among >150 anesthesia simulation centers worldwide, it would be helpful, given the differences in language, education, anesthesia practice, and health care delivery systems, to delineate the results of simulation-based evaluation in different countries.
We present the adaptation of simulation-based evaluation tools, developed in the United States (US) and published by Schwid et al. (14), in academic Israeli anesthesia departments, highlighting issues relevant to the possible sharing of scenarios and evaluation tools in different countries. The hypothesis of this study was that simulation-based evaluation tools used and published elsewhere are applicable to the Israeli setting.
After informed consent allowing videotaping of performance, 31 anesthesia residents participated in the training conducted at the Israeli Center for Medical Simulation (M.S.R.) located at the Sheba Medical Center, Tel Hashomer, Israel. Participants were at least 6 mo into their residency training but had not passed the National mid-residency examination. Before starting the simulator session, residents were instructed to manage the patient as they would in the operating room (OR) and to verbalize all observations, possible problems, and treatments administered.
Four simulation scenarios and grading forms used in this study were developed and used in a prior study by Schwid et al. (14). The human patient simulator from METI (Sarasota, FL), located in a fully equipped simulated OR, was used as the simulation platform in the present study.
Scenario 1: Esophageal Intubation
The anesthesia residents allowed a paramedic to intubate a patient’s trachea after induction of general anesthesia. According to a preconfigured standard script, the paramedic performed the laryngoscopy, reported visualization of the vocal cords, and then proceeded to intubate the esophagus while affirming that the endotracheal tube was placed correctly. Then, physiologic signs of esophageal intubation, including lack of breath sounds upon auscultation, increased airway pressure, absent exhaled carbon dioxide on the capnograph, and eventually decreased arterial oxygen saturation, were presented by the anesthesia simulator. The grading criteria for this scenario included the correct diagnosis of the problem as reported by the resident and the time taken to reestablish ventilation.
Scenario 2: Anaphylactic Reaction
A few minutes after tracheal intubation, the surgeon requested the administration of an antibiotic. At the same time, the anesthesia resident administered a muscle relaxant for surgical relaxation. An anaphylactic reaction was triggered, with an increase in heart rate to 120 bpm and a decrease in systolic blood pressure to 50–60 mm Hg, which was refractory to treatment with ephedrine and phenylephrine. The participant was informed that a rash was present if he or she inquired. The participant was given 15 min to diagnose and treat the anaphylactic reaction, and grading focused mainly on making the correct diagnosis and on appropriate and timely administration of IV fluids and epinephrine.
Scenario 3: Chronic Obstructive Pulmonary Disease (COPD) Exacerbation
After the previous scenario, the simulator was reset, and the participant induced anesthesia for a second patient with preexisting COPD. Shortly after intubation, bronchospasm was triggered with high peak airway pressure, decreased tidal volume, carbon dioxide retention, and decreased arterial oxygen saturation. Grading criteria were: consideration of differential diagnosis for bronchospasm and difficult ventilation, appropriate administration of bronchodilators and of an increased concentration of inhaled anesthetic. The bronchospasm lasted 15 min or until the resident administered bronchodilators twice.
Scenario 4: Myocardial Ischemia
A few minutes after resolution of the bronchospasm, ST depression on electrocardiogram, tachycardia, and hypotension developed. Grading criteria for the resident’s performance were: IV administration of vasopressors and fluids to increase arterial blood pressure, decreasing the inhaled anesthetic and administration of narcotics, use of β-blockers to decrease heart rate, and appropriate titration of nitroglycerin. The participant was graded for selection of therapeutic drugs, doses administered, and timing of administration.
The simulation training sessions were recorded on digital video. Three cameras (PELCO, US), one of them a PTZ (pan-tilt-zoom) camera, connected to a digital recording system (DARIM, Korea), were used. A four-quadrant screen that included two separate views of the participants and the mannequin and one view of the patient’s vital signs monitoring system was used. Performance was assessed by two senior anesthesiologists who reviewed the videotapes separately and independently, using two scoring systems described by Schwid et al. (14). The first “Long Form” grading system had 108 assessment points, with many points determined by the participant’s verbalization of observations. The second “Short Form” grading system had 40 points, and was based mainly on therapeutic actions that would directly benefit the patient, with no points for observations or differential diagnosis (both forms are presented in Appendix 1). It should be noted that, with the Short Form, it was possible to earn multiple points for a single action depending on the timing of the action. The reviewers were also asked to report on common mistakes during management of the simulated critical incidents.
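The cumulative structure of the Short Form can be illustrated with a minimal sketch. It assumes, based on Short Form items 7–9 in Appendix 1 (time to increase fluids <12, <8, and <5 min), that one point is earned for every time threshold that the action beats. The function name and the use of `None` for an action that was never performed are hypothetical conventions introduced only for this illustration.

```python
# Thresholds (in minutes) mirror Short Form items 7-9: increase fluids
# <12 min, <8 min, <5 min. Names below are illustrative, not from the study.
FLUID_THRESHOLDS_MIN = (12, 8, 5)


def cumulative_timing_points(elapsed_min, thresholds=FLUID_THRESHOLDS_MIN):
    """Return one point for every time threshold the action beat.

    elapsed_min is None when the action was never performed.
    """
    if elapsed_min is None:
        return 0
    return sum(1 for limit in thresholds if elapsed_min < limit)


if __name__ == "__main__":
    for minutes in (4, 6, 10, 13, None):
        print(minutes, cumulative_timing_points(minutes))
```

Under these assumptions, a single early action satisfies several checklist items at once, which is why one intervention can contribute more than one point to the Short Form total.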
Subjective Assessment by Participants
After the simulation session, the participants globally rated the realism of the scenarios on a graded scale (4 = very realistic, 3 = somewhat realistic, 2 = somewhat unrealistic, 1 = very unrealistic).
Evaluation of the Grading System by Senior Anesthesiologists
To assess the relevance of the checklists developed for the US multi-institutional study to local medical practice, 15 senior anesthesiologists from 2 different hospitals in Israel were asked to rate each of the checklist items used for evaluation on a 1–5 scale (1 = not relevant at all; 5 = very relevant). The reviewers were encouraged to add items to the checklist if needed. After the first session, the mean results were presented to the 15 senior anesthesiologists for a second opinion. In this second review, checklist items that achieved a score of 1 or 2 were excluded, and new items achieving a score of 4 or 5 were included in a revised version of the grading system used for a second reliability assessment of the simulator scenarios.
Interrater reliability was measured using Pearson correlation for the two evaluators and the average score for the two raters was used for the remainder of the statistics. Reliability of the simulator evaluation was assessed by the Cronbach α statistic for internal consistency.
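The two statistics named above can be sketched as follows. This is a minimal illustration using the textbook formulas for the Pearson correlation coefficient and the Cronbach α statistic, not the software used in the study. The data are invented for the example, and the choice of which units (individual checklist items or scenario subscores) serve as "items" for the α calculation is an assumption, because it is not specified in the text.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two raters' scores for the same subjects."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cronbach_alpha(item_scores):
    """Cronbach alpha; item_scores[i][j] is subject i's score on item j."""
    k = len(item_scores[0])
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Invented scores for illustration only (not data from the study).
    rater_a = [70, 82, 65, 90, 58]
    rater_b = [72, 80, 66, 88, 60]
    # Average of the two raters, as used for the remaining statistics.
    averaged = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
    items = [[1, 1, 0, 1], [1, 0, 0, 1], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 0, 0]]
    print("interrater r:", round(pearson_r(rater_a, rater_b), 3))
    print("averaged scores:", averaged)
    print("alpha:", round(cronbach_alpha(items), 3))
```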
All participants performed the four study scenarios. Twenty-five of the 31 participants (80%) globally rated the scenarios as very realistic (grade 4 on a 1–4 scale); the other 6 participants rated them as somewhat realistic (grade 3 on a 1–4 scale).
Subjects scored from 37 to 95 (70 ± 12) of 108 possible points with the Long Form (data presented as percentage of the maximal score in Table 1). Short Form scores ranged from 18 to 35 (28.2 ± 4.5) of 40 possible points (data presented as percentage of the maximal score in Table 1).
The Israeli senior anesthesiologists recommended deleting four checklist items from the assessment forms (using ketamine or lidocaine in the treatment of bronchospasm, points for establishing ventilation >5 min after esophageal intubation and points for nitroglycerine administration >8 min after the beginning of myocardial ischemia). Two new items (changes in mechanical ventilation and sending arterial blood gases for analysis in the management of bronchospasm) were recommended.
Reliability of the original simulator assessment, measured by the Cronbach α statistic for internal consistency, was 0.66 for the Long Form and 0.75 for the Short Form. The reliability after adaptation of the assessment tools according to the recommendations of the Israeli experts was 0.67 and 0.76, respectively. Interrater reliability measured by Pearson correlation was 0.91 for the Long Form and 0.96 for the Short Form (P < 0.01).
Scenario 1: Esophageal Intubation.
All 31 participants diagnosed and treated the esophageal intubation; 29 of them reestablished ventilation in <5 min and the other 2 within 3 more minutes. Only 12 participants performed laryngoscopy for diagnosis before taking the esophageal tube out. Only six participants left the esophageal tube in place until another tube was placed into the trachea, or pulled the tube while cricoid pressure was applied. Only five participants placed an oro/nasogastric tube to evacuate the stomach.
Scenario 2: Anaphylactic Reaction.
Five participants (16%) did not make the correct diagnosis. Four participants maintained that the tachycardia and hypotension were induced by preoperative fluid deficit and one participant misinterpreted sinus tachycardia as supraventricular tachycardia and attempted to apply electrical cardioversion (data on some common mistakes are presented in Table 2).
In this scenario, treatment of hypotension was essential, regardless of whether the diagnosis of anaphylaxis was made. All participants but one increased fluid administration. Most of them (71%) used a pressurized infusion device and started a second IV line (64%). In addition to appropriate administration of IV fluids during anaphylaxis-induced hypotension, 90% of participants decreased the inhaled anesthetic and 84% administered epinephrine. One participant administered a 1-mg IV epinephrine bolus, a potentially arrhythmogenic dose.
Scenario 3: COPD Exacerbation.
All participants diagnosed bronchospasm correctly, but not all of them presented a comprehensive differential diagnosis. During patient assessment, 16 (52%) of the participants did not try manual, bag ventilation and 22 (71%) did not pass a suction catheter or perform fiberoptic bronchoscopy through the endotracheal tube. All participants administered a bronchodilator and most of them (90%) increased the concentration of the inhaled anesthetic to promote bronchodilation (in the multi-institutional study, 92% administered a bronchodilator and 52% increased the concentration of the inhaled anesthetic). Only 10 participants (33%) gave a bolus of a neuromuscular blocking drug, and no participant administered ketamine or lidocaine. Changes in mechanical ventilation setting were performed by 26 participants (84%), and 6 participants (19%) asked for arterial blood gases.
Scenario 4: Myocardial Ischemia.
In the assessment of the patient during this scenario, none of the 31 participants examined the skin. Errors in the treatment of myocardial ischemia included omission of analgesia (52%), omission of administration of fluids (48%), omission of vasopressors (16%), omission of a β1 blocker to slow the heart rate (23%), and omission of nitroglycerin (13%) (Table 3). The initial doses of nitroglycerin as well as the titration to a target dose were correct in most of the cases.
The aim of the present study, conducted in Israel, was to assess the feasibility of using simulation-based evaluation tools developed in the US, despite differences in anesthesia practice and health care delivery systems in the two countries. The high scores for plausibility given to the scenarios by the Israeli residents participating in the study, and the high rate of agreement with the checklist items by Israeli senior anesthesiologists, suggest that these evaluation tools can be successfully applied in the two different settings. The reliability of the original US assessment tool in the Israeli study population was demonstrated by an internal consistency of 0.66 for the Long Form and 0.75 for the Short Form, as measured by the Cronbach α statistic. These findings are similar to the values found in the original US population, with internal consistencies of 0.72–0.76 for the Long Form and 0.71–0.75 for the Short Form. The interrater reliability of both the Israeli and the US raters was high, even though the Israeli raters knew the subjects and the US raters did not, because the US residents were evaluated by raters at a different institution. This high interrater reliability might have been related to the fact that the evaluation consists of a predefined Yes/No checklist.
Although data on mistakes made during training in the present study are presented in comparison to the data from the previous US study, our goal was not to compare the performance of Israeli and US residents. Such a comparison is difficult because postgraduate training year (CB, CA 1–3), although representing a well-defined progression of knowledge and skill acquisition in the US, is not easily equated to the training of Israeli residents. Most Israeli residents are older and more experienced than their US counterparts because many were previously trained and certified as medical doctors in the former Soviet Union, and some were in anesthesia practice for several years before moving to Israel. Because of significant differences between the practice of medicine in Israeli and Eastern European systems, they are required to repeat their training. The differences in the medical background and experience of participants may help explain the higher scores of Israeli junior residents in the present study in comparison to the US residents in both Long and Short Form assessment scores. This may also account for the lower incidence of mistakes among Israeli residents.
Some items in the scenarios emphasize the differences in anesthesia practice between the US departments involved in the multicenter study and the departments participating in the Israeli study. Although no difference between the two studies in the response to esophageal intubation was found, only six Israeli participants left the esophageal tube in place until another endotracheal tube was correctly placed into the trachea, or removed the tube while cricoid pressure was applied, to prevent aspiration.
During the treatment of COPD exacerbation, none of the Israeli participants gave ketamine or lidocaine, and most participants changed the variables of mechanical ventilation—an action not requested at all in the original US assessment forms. The disagreements in treatment protocols were highlighted by the evaluation of the grading system by senior Israeli anesthesiologists. Some of the 15 evaluators maintained that airway protection after esophageal intubation is not indicated and most supported the exclusion of lidocaine or ketamine treatment from the checklists.
Other common mistakes made by the Israeli residents were not accepted by the Israeli senior anesthesiologists as standard of practice. Not trying manual bag ventilation and not passing a suction catheter or performing fiberoptic bronchoscopy through the endotracheal tube in the assessment of COPD exacerbation might represent systemic problems in the training of anesthesiology residents and not a true difference in anesthesia practice.
Based on the subjective opinion of the participants and agreement among experts on performance assessment checklist items, we were able to demonstrate that simulation-based scenarios can be shared between simulation-based anesthesia training centers in two different countries. In our study, the reliability of the Long Form assessment tool was slightly lower than in the original US study, and adaptation using local experts’ opinions failed to improve it. Because of differences in training systems and the lack of routine scored trainee evaluation, we were unable to assess construct validity of the scenarios or to compare the scores achieved in the simulation-based evaluation with departmental evaluation tools. In addition, evaluation outcome differences between the two populations of residents cannot be directly compared because of the confounding differences in practice patterns and level of resident training and experience.
Further prospective evaluation of scenarios should be part of a joint effort of the medical simulation community. Such evaluation should emphasize not only the evaluation of technical skills but also communication or leadership skills (19).
Long Form Scoring System
Case 1: Esophageal Intubation Followed by Anaphylaxis
Paramedic student performs esophageal intubation.
- - Lack of CO2 communicated
- - O2 saturation communicated
- - Breath sounds auscultated
- - Stomach auscultated
- - Laryngoscopy for diagnosis
- - Notify team that must reintubate
Elapsed time to reestablish ventilation: <2 min, 8 pts; 2–5 min, 5 pts; 5–8 min, 3 pts; >8 min, 0 pts—according to the Israeli experts, 0 pts were given if ventilation was reestablished after 5–8 min.
- - Esophageal tube left in place until endotracheal tube placed, or tube pulled and cricoid pressure used
- - Gastric tube placed to evacuate stomach
A few minutes after esophageal intubation is corrected, surgeon asks for antibiotic and complains about relaxation. Anaphylaxis without bronchospasm is triggered, heart rate (HR) increases to 120, and arterial blood pressure (BP) decreases to 50–60 systolic.
- - Resident communicates tachycardia
- - BP is rechecked before treating HR
- - Checks breath sounds
- - Airway pressure is communicated
- - Checks skin color—tell that flushed
- - O2 saturation communicated
- - Trendelenburg
- - Pressor other than epinephrine (epi)
- - Notifies surgeon that there is a problem
Elapsed time to turn off anesthetic: <5 min, 3 pts; 5–8 min, 2 pts; 8–12 min, 0 pts; >12 min or not done, −2 pts.
- - 100% O2
- Elapsed time to increase fluids: <5 min, 5 pts; 5–8 min, 3 pts; 8–12 min, 0 pts; >12 min or not done, −4 pts.
- - Pressure bag (2 pts)
- - Asks for second IV line (2 pts)
- Elapsed time to administer epi: <5 min, 5 pts; 5–8 min, 3 pts; 8–12 min, 0 pts; >12 min, −4 pts; no epi, −5 pts.
- Initial epi dose: <20 μg, 1 pt; 20–200 μg, 3 pts; 201–500 μg, 2 pts; 501–999 μg or none, 0 pts; 1000 μg or more, −5 pts.
- - Second bolus of epi
- - Calls for help
- - Informs surgeon that possible anaphylaxis
- - Blood gas ordered
- - H1 blocker
- - H2 blocker
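The banded scoring used in the Long Form can be sketched as follows for the epinephrine timing item and the dose line that follows it (read here as the dose of the first epinephrine bolus, which is an assumption). The printed bands share the 8-min boundary, so this sketch assigns each shared boundary to the later, lower-scoring band; that convention, the function names, and the use of `None` for "no epinephrine given" are hypothetical choices, not part of the published form.

```python
def epi_timing_points(elapsed_min):
    """Points for elapsed time to administer epinephrine (Long Form, Case 1).

    Printed bands: <5 min, 5 pts; 5-8 min, 3 pts; 8-12 min, 0 pts;
    >12 min, -4 pts; no epi, -5 pts. Assumption: exactly 8 min falls
    in the 8-12 min band.
    """
    if elapsed_min is None:  # epinephrine never given
        return -5
    if elapsed_min < 5:
        return 5
    if elapsed_min < 8:
        return 3
    if elapsed_min <= 12:
        return 0
    return -4


def epi_dose_points(dose_ug):
    """Points for the epinephrine dose in micrograms (None = none given).

    Printed bands: <20, 1 pt; 20-200, 3 pts; 201-500, 2 pts;
    501-999 or none, 0 pts; 1000 or more, -5 pts.
    """
    if dose_ug is None:
        return 0
    if dose_ug >= 1000:
        return -5
    if dose_ug > 500:
        return 0
    if dose_ug > 200:
        return 2
    if dose_ug >= 20:
        return 3
    return 1


if __name__ == "__main__":
    # A 100-ug bolus given 4 min after onset earns both maximum bands.
    print(epi_timing_points(4) + epi_dose_points(100))
```

Under these assumptions, a 1-mg (1000-μg) IV bolus such as the one reported for one participant in Scenario 2 would fall in the −5 dose band regardless of how quickly it was given.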
Case 2: Bronchospasm Followed by Myocardial Ischemia
- - Change in airway pressure communicated
- - Change in tidal volume communicated
- - Change in capnogram communicated
- - Listen to breath sounds
- - O2 saturation is communicated
- - Try bag ventilation
- - Notify team of problem
- - Check depth of endotracheal tube (ETT)
- - Look for ETT kink
- - Pass suction catheter or fiberoptic bronchoscopy (3 pts)
- - None of the diagnostic maneuvers (−3 pts)
- - Increase inhaled anesthetic (2 pts)
- - Decrease inhaled anesthetic (−2 pts)
- - Inhaler-appropriate drug (2 pts)
- - Inhaler-appropriate dose (2 pts)
- - Inhaler-used spacer correctly
- - Administer neuromuscular blockade (NMB)
- - Ketamine—excluded by the Israeli experts
- - Lidocaine—excluded by the Israeli experts
- - Changes in mechanical ventilation—included by the Israeli experts
- - Arterial blood gases measurement—included by the Israeli experts
- - Inhaler repeated appropriately
A few minutes after bronchospasm is corrected, the patient develops ST segment depression, frequent premature ventricular contractions (PVCs), hypotension, and tachycardia.
- - ST changes communicated
- - PVCs communicated
- - Tachycardia communicated
- - Hypotension communicated
- - Checks breath sounds
- - Checks airway pressure
- - Checks skin color
- - O2 saturation is communicated
Elapsed time to administer pressor: <8 min, 5 pts; 8–12 min, 3 pts; >12 min, 0 pts; no pressor, −4 pts.
- - Pressor-phenylephrine (3 pts) or ephedrine (2 pts) or dopamine (2 pts)
- - Increases fluids
- - Decreases inhalation agent
Elapsed time to administer nitroglycerin (NTG): <8 min, 5 pts; 8–12 min, 3 pts; >12 min, 0 pts; NTG not used, −4 pts—according to the Israeli experts, 0 pts were given if NTG was administered after 8–12 min.
Start at 0.25–0.5 μg·kg⁻¹·min⁻¹ (+2 pts); titrate up to 1–2 μg·kg⁻¹·min⁻¹ (+2 pts)
- Start at >1 μg·kg⁻¹·min⁻¹ (−2 pts)
- Provide adequate analgesia: morphine-fentanyl (3 pts)
- Slow HR: esmolol (2 pts) or other β blocker (1 pt)
- - Treat PVCs: lidocaine
Short Form Scoring System
Case 1: Esophageal Intubation Followed by Anaphylaxis
- 1. Time to reestablish ventilation <8 min—according to the Israeli experts, no points are given
- 2. Time to reestablish ventilation <5 min
- 3. Time to reestablish ventilation <2 min
- 4. Airway protected during reintubation
- 5. Stomach emptied after reintubation
- 6. Trendelenburg
- 7. Time to increase fluids <12 min
- 8. Time to increase fluids <8 min
- 9. Time to increase fluids <5 min
- 10. Use pressure bag for fluids
- 11. Asks for second IV line
- 12. Time to administer epi <12 min
- 13. Time to administer epi <8 min
- 14. Time to administer epi <5 min
- 15. Initial epi administered <500 μg
- 16. 20 μg < (initial epi administered) < 200 μg
- 17. 50 μg < (initial epi administered) < 200 μg
- 18. Second dose of epi administered
- 19. Calls for help
- 20. Informs surgeon that possible anaphylaxis
- 21. H1 blocker
- 22. H2 blocker
Case 2: Bronchospasm Followed by Myocardial Ischemia
- 23. Use appropriate inhaler
- 24. Use inhaler circuit adaptor correctly
- 25. Inhaler administration repeated
- 26. Deepen inhaled anesthetic for bronchospasm
- 27. Administer ketamine (excluded by the Israeli experts)
- 28. Administer lidocaine (excluded by the Israeli experts)
- Changes in mechanical ventilation (included by the Israeli experts)
- Arterial blood gases measurement (included by the Israeli experts)
Time ST segment changes begin:
- 29. Increase fluid administration rate
- 30. Decrease inhaled anesthetic
- 31. Administer pressor <12 min
- 32. Administer pressor <8 min
- 33. Pressor administered was phenylephrine
- 34. Administer NTG <12 min—according to the Israeli experts, no points are given
- 35. Administer NTG <8 min
- 36. Start NTG at 0.25–0.5 μg·kg⁻¹·min⁻¹
- 37. Titrate NTG up to 1–2 μg·kg⁻¹·min⁻¹
- 38. Provide adequate analgesia: morphine-fentanyl
- 39. Slow HR with esmolol or metoprolol
- 40. Administer lidocaine for PVCs
1. Seropian MA. General concepts in full scale simulation: getting started. Anesth Analg 2003;97:1695–705.
2. Blum RH, Raemer DB, Carroll JS, et al. Crisis resource management training for an anaesthesia faculty: a new approach to continuing education. Med Educ 2004;38:45–55.
3. Weller J, Wilson L, Robinson B. Survey of change in practice following simulation-based training in crisis management. Anaesthesia 2003;58:471–3.
4. Morgan PJ, Cleave-Hogg D. Evaluation of medical students’ performance using the anaesthesia simulator. Med Educ 2000;34:42–5.
5. Morgan PJ, Cleave-Hogg D, DeSousa S, Tarshis J. Identification of gaps in the achievement of undergraduate anesthesia educational objectives using high-fidelity patient simulation. Anesth Analg 2003;97:1690–4.
6. Morgan PJ, Cleave-Hogg D, DeSousa S, Tarshis J. High-fidelity patient simulation: validation of performance checklists. Br J Anaesth 2004;92:388–92.
7. Forrest FC, Taylor MA, Postlethwaite K, Aspinall R. Use of a high-fidelity simulator to develop testing of the technical performance of novice anaesthetists. Br J Anaesth 2002;88:338–44.
8. Rosenblatt MA, Abrams KJ. New York State Society of Anesthesiologists, Inc.; Committee on Continuing Medical Education and Remediation; Remediation Sub-Committee. The use of a human patient simulator in the evaluation of and development of a remedial prescription for an anesthesiologist with lapsed medical skills. Anesth Analg 2002;94:149–53.
9. Devitt JH, Kurrek MM, Cohen MM, Cleave-Hogg D. The validity of performance assessments using simulation. Anesthesiology 2001;95:36–42.
10. Weller JM, Bloch M, Young S, et al. Evaluation of high fidelity patient simulator in assessment of performance of anaesthetists. Br J Anaesth 2003;90:43–7.
11. Morgan PJ, Cleave-Hogg DM, Guest CB, Herold J. Validity and reliability of undergraduate performance assessments in an anesthesia simulator. Can J Anaesth 2001;48:225–33.
12. Devitt JH, Kurrek MM, Cohen MM, et al. Testing internal consistency and construct validity during evaluation of performance in a patient simulator. Anesth Analg 1998;86:1160–4.
13. Murray DJ, Boulet JR, Kras JF, et al. Acute care skills in anesthesia practice: a simulation based resident performance assessment. Anesthesiology 2004;101:1084–95.
14. Schwid HA, Rooke GA, Carline J, et al. Evaluation of anesthesia residents using mannequin-based simulation: a multiinstitutional study. Anesthesiology 2002;97:1434–44.
15. Gaba DM, Howard SK, Flanagan B, et al. Assessment of clinical performance during simulated crises using both technical and behavioral ratings. Anesthesiology 1998;89:8–18.
16. Wong AK. Full scale computer simulators in anesthesia training and evaluation. Can J Anaesth 2004;51:455–64.
17. Morgan PJ, Cleave-Hogg D. A worldwide survey of the use of simulation in anesthesia. Can J Anaesth 2002;49:659–62.
18. Girard M, Drolet P. Anesthesiology simulators: networking is the key. Can J Anaesth 2002;49:647–9.
19. Gaba DM. What makes a “good” anesthesiologist? Anesthesiology 2004;101:1061–2.
Footnote: M.S.R. are the initials of the simulation center in Hebrew; the initials mean “a message.”