Septic shock is the leading cause of mortality in the intensive care unit (ICU) with more than half a million new cases per year in the United States; with an aging population, this figure will only increase.1 Clinical guidelines for the management of severe sepsis are structured around interventions that have reduced its morbidity.2,3 Despite attempts to standardize the treatment of severe sepsis and shock through the dissemination of such treatment guidelines by the Society of Critical Care Medicine and others,3–5 little attention has been directed to the decision making and performance of individuals or teams in the act of managing sepsis. Likewise, the degree to which different educational strategies can facilitate uptake and deployment of best practices has not been adequately examined.
Experiential learning, considered broadly, includes activities that actively engage learners in aspects of or related to the concepts and material to be learned or their application to real-world situations. In addition to technology-based simulation, experiential learning includes role playing, problem-based learning, storytelling, encounters with standardized patients, and psychomotor task training. Generally, even when experiential learning is planned as a critical element of teaching clinical care, it is combined with traditional modalities of learning—typically a lecture (in person or by video) targeting the levels in Miller’s pyramid of “knows” or “knows how.”6 The effectiveness of a lecture in embedding knowledge can be measured with a written test (knows) but that cannot determine the degree to which such knowledge becomes “deployable” (knows how).
Because we have created high-fidelity simulations of critical illness, including septic shock, to train house staff in the management of patient care emergencies7 and later developed a scoring tool to assess the quality of teamwork and medical management in these scenarios,8 we know that statistically distinguishable categories of performance can be identified.8 Although many investigators had previously attempted to use simulation to impart knowledge to learners, we turned this paradigm around to use simulation to study whether knowledge had become deployable and useful. In this study, we make use of the scoring rubric to evaluate the impact of classroom instruction—a lecture—on performance when managing simulations of septic shock.
The intensive care clinical rotation includes a recurring lecture series, 1 day of which is replaced by a 4-hour simulation session. Because the scheduling of specific lectures varies according to the availability of faculty, the lecture on sepsis preceded the simulation of septic shock approximately half of the time. Therefore, we first performed a retrospective analysis of whether receiving the classroom lecture on septic shock before the simulation of septic shock was associated with higher performance on this scenario. We then conducted a prospective investigation of whether a redesigned lecture—one that was geared more toward improving clinical performance than to discussion of physiologic concepts—would lead to better performance on subsequent simulations of septic shock.
The experiment presented here consisted of 2 phases. In phase I, data presented in our initial report on assessing the management of septic shock simulations were analyzed in terms of whether the simulation runs that followed a lecture were associated with higher management scores. Phase II analyzed performance in simulations after a redesign of the lecture to a more deliberate focus on a widely accepted algorithm of managing septic shock.2 Cards with the printed algorithm were distributed to those attending the lecture, and a quiz was administered before and after the lecture to assess comprehension of information presented in the lecture.
Setting and Participants
The human subjects committee at Stanford University School of Medicine approved the protocol; informed consent was waived for the use of videotapes made originally for debriefing and teaching purposes. Study subjects were internal medicine interns and postgraduate year 2 to 4 residents in surgery, anesthesiology, and internal medicine who were rotating for 1 month in the medical/surgical ICU. The critical care unit is a 15-bed semiclosed unit at the Veterans Affairs Palo Alto Health Care System. Structured activities consist of 2 hours of morning multidisciplinary bedside rounds, afternoon sign-out rounds, and a lecture administered each weekday by the critical care faculty. Lectures are mandatory for interns and residents, and topics are those particular to intensive care, including management of respiratory failure, shock, sepsis, and postoperative patients. The lecture series was constant in content from month to month, and the same faculty member gave the lecture on septic shock for the duration of this experiment. In addition, a 4-hour course on critical care crisis management is conducted in a dedicated simulation center each month where standardized scenarios of septic shock and respiratory failure are managed by teams of house staff and allied personnel. Although timing of lectures sometimes varied according to faculty availability, the simulation course took place on the second Wednesday of each month. We attempted to schedule the shock lecture earlier in the month to be before the simulation session, but this could be accomplished only approximately half of the time.
In phase I, simulator performance was compared between house staff who received a lecture before the simulation and those who had the simulation first. Phase II involved an educational intervention in which a 2-part lecture on the pathophysiology of shock was changed to a single-hour clinically focused discussion of septic shock that was altered in the following ways.
- Minimized discussion of the finer physiologic and biochemical details of organ dysfunction.
- Emphasized “early goal-directed therapy” for severe sepsis and septic shock, as advocated by treatment guidelines2–4; pocket cards with an algorithm for early goal-directed therapy were distributed.
- Presented an interactive problem-based learning discussion regarding severe sepsis and septic shock, where the patient’s vital signs are projected on a screen and changed according to interventions requested by trainees.9 This “hypothetical patient” was nearly identical to the scenario encountered in the simulation, but this was never stated.
In phase II, a quiz was administered to trainees both before and immediately after the lecture and was used to evaluate: (1) the acquisition of knowledge resulting from the lecture and (2) a general approximation of the knowledge that each team had before the simulation. Participants placed a self-generated identification code on both quizzes, so they could be matched for before-and-after comparisons while assuring anonymity. No attempt was made to evaluate the impact of the time interval between the lecture and simulation.
A scoring system that quantitatively rates the management of a simulated patient with septic shock was developed and validated previously.8 The scenario called for a period in which the ICU interns were the sole physicians managing the patient (first 10 minutes) and a later period where residents directed patient management. This allowed for the formulation of management scores for both interns and residents. In addition to the main hypothesis that a revised lecture would improve performance in sepsis management, we also hypothesized that the interns—being relatively new to clinical medicine and having the most to learn—might be influenced to a greater extent by an improved educational strategy.
All participants were naive to the scenario content and were asked to preserve the novelty for others and to sign confidentiality agreements. The sepsis scenario follows a standard script of signs, symptoms, and vital signs at baseline and for patient deterioration or improvement according to clinician interventions. The scenario begins with 2 interns along with a critical care nurse who are called in as first responders. Senior residents, fellows, subspecialty physicians, respiratory therapists, and pharmacists could be summoned if desired; senior physician help was invariably requested but, by design, was kept out of the simulation room until the 10-minute point. Thus, for each scenario, 2 groups of physicians were evaluated: (1) the interns present during the first 10 minutes and (2) residents who took over the management of the scenario at the 10-minute time point.
Data Sources and Analysis
Video recordings of scenarios were analyzed by 2 trained raters, and averages of the 2 scores were used for the comparisons described. Nontechnical ratings were based on skills such as leadership, communication, contingency planning, and resource use. These skills may dictate success in dynamic environments and have been described by a number of studies in the literature.7,10–15 Previous analysis demonstrated a high degree of interrater reliability for both technical and nontechnical rating systems (κ = 0.96 and κ = 0.88, respectively).8 Comparisons between “lecture first” and “simulation first” for technical and nontechnical scores for the septic shock scenarios were made by analysis of variance (ANOVA). For all tests, P < 0.05 was taken as statistically significant.
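As an illustration of the comparison described above, the following minimal sketch averages two raters' scores for each scenario and computes a one-way ANOVA F statistic for "lecture first" versus "simulation first" groups. All scores, function names, and group sizes are hypothetical and are not study data. The sketch returns only the F statistic and its degrees of freedom; a P value would still have to be read from an F distribution, for example with a statistics package.

```python
from statistics import mean


def average_raters(rater_a, rater_b):
    """Average the two raters' scores for each scenario."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


def one_way_anova_f(*groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical technical scores from two raters (not study data).
lecture_first = average_raters([8, 10, 9, 11], [9, 10, 10, 11])
simulation_first = average_raters([10, 12, 11, 12], [11, 12, 11, 13])

f_stat, df_between, df_within = one_way_anova_f(lecture_first, simulation_first)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

With only 2 groups, the F statistic equals the square of the corresponding unpaired t statistic, so either test leads to the same conclusion.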
Quizzes covering the key findings and therapy of septic shock were administered before and after the redesigned lecture in phase II; comparisons of scores used Student t test for paired samples. The association between quiz scores and simulation scores was assessed by Spearman test for correlation.
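The two tests named above can be sketched as follows, under stated assumptions: the quiz and simulation values are invented for illustration, the function names are hypothetical, and only the test statistics are computed (P values would come from the t distribution and from tables or software for Spearman's rho).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Paired-samples t statistic and degrees of freedom."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical quiz scores (out of 10) for the same 5 trainees.
pre_quiz = [4, 5, 6, 5, 7]
post_quiz = [7, 8, 7, 8, 8]
t_stat, df = paired_t(pre_quiz, post_quiz)

# Hypothetical group quiz averages and simulation scores.
quiz_avg = [5, 7, 6, 8]
sim_score = [10, 9, 12, 11]
rho = spearman_rho(quiz_avg, sim_score)
print(f"t({df}) = {t_stat:.2f}, rho = {rho:.2f}")
```

The paired test uses each trainee as his or her own control, which is why the matching identification codes on the two quizzes matter.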
Postcourse surveys covering perceived educational value of simulation, realism, likes and dislikes, as well as suggestions for improvement were administered to participants, as is customary for all courses in our simulation center.
Temporal trends were examined to assess whether simulation scores were influenced by timing within the academic year or simply by time, as may be expected if ambient or collective knowledge were to naturally increase during the study period. Scores during different time intervals within the academic year were analyzed by ANOVA and during the course of the experimental period by Spearman test for correlation.
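One way to prepare scores for the within-year comparison is to bin each simulation session into a quarter of the academic year. The sketch below assumes an academic year that begins in July, which is not stated in the text, and uses invented month and score pairs; the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Assumption for illustration: the academic year begins in July.
ACADEMIC_YEAR_START_MONTH = 7


def academic_quarter(month):
    """Map a calendar month (1-12) to a quarter (1-4) of the academic year."""
    months_into_year = (month - ACADEMIC_YEAR_START_MONTH) % 12
    return months_into_year // 3 + 1


def mean_score_by_quarter(records):
    """Group (month, score) pairs by academic quarter and average each group."""
    by_quarter = defaultdict(list)
    for month, score in records:
        by_quarter[academic_quarter(month)].append(score)
    return {q: mean(scores) for q, scores in sorted(by_quarter.items())}


# Hypothetical (calendar month, simulation score) pairs, not study data.
records = [(7, 9.0), (9, 10.0), (11, 9.5), (2, 10.5), (5, 9.0), (6, 10.0)]
quarter_means = mean_score_by_quarter(records)
print(quarter_means)
```

The resulting per-quarter groups could then be compared by ANOVA, whereas the sequence of sessions across the whole study period would be tested against score by rank correlation.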
In the present study, we analyzed intern and group scores in relation to whether a lecture on shock was presented before or after the simulation session. A total of 56 scenarios were analyzed in this experiment; 23 were from phase I. Phase II produced 33 scoreable scenarios during a 41-month period; scenarios from 8 courses scattered throughout the study period were not scoreable because technical problems, illnesses, and other unforeseen events prevented the scenario from being conducted according to experimental standards.
Averages of all scores from phases I and II were compared and showed a significantly higher score in phase II (interns 6.8 vs. 9.6 and residents 8.5 vs. 11.2; P < 0.0001 for both). Because of this difference, “lecture first” and “simulation first” scores were compared separately within each phase of the experiment. Data from the “technical” (medical management) scoring system are presented for both interns and full care teams led by subsequently arriving residents. Phase I data presented in Table 1 indicate that neither technical scores nor nontechnical scores were significantly different when the simulation session was before the lecture versus after the lecture.
When comparing scores of “lecture first” versus “simulation first” groups in phase II, we again found no statistically significant benefit of the lecture but surprisingly a higher average score for groups who completed the simulation without the previous lecture (Table 2). The modest 17% improvement in score was statistically significant for only the “resident” group (not for interns). Nontechnical scores were not significantly different in phase II (Table 2, lower row of data).
Prelecture quiz scores correlated with level of training as shown in Figure 1, where significant differences were noted between medical students and all levels of house staff. Most items on the quiz were directly relevant to the elements on our technical rating scale for simulations.
Significant improvement in postlecture quiz scores was seen for subjects at all levels of training. No significant differences existed between training-level groups when postcourse test scores were compared (dark bars in Fig. 1). All groups had average scores of approximately 7.5 of 10; thus, students showed the greatest overall improvement in scores, followed by interns and residents.
To complete the inquiry into the relationship between knowledge and clinical performance, we compared the simulation scores to the covariate of quiz scores at the time of the simulation course (this score provides a rough estimate of the subject’s knowledge about sepsis). Thus, for the group participating in the simulation before the lecture, the average of their “prelecture” quiz scores was used to estimate subject knowledge, whereas the average of “postlecture” quiz scores was used to estimate knowledge of groups receiving the lectures before simulator participation. With data paired in this manner, we found no correlation between test knowledge and simulation performance. However, caution should be exercised in interpreting this finding. This analysis looked at only groups of individuals, not the actual pairing between individuals managing the sepsis scenario and their scores. Our commitment to administering quizzes with anonymity prevented us from making the more accurate pairing between quiz and video performance.
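The group-level pairing described above can be expressed as a small rule: the knowledge estimate for a team is the average prelecture quiz score if the team ran the simulation first, and the average postlecture score if the lecture came first. The sketch below is illustrative only; the function name, labels, and scores are hypothetical.

```python
from statistics import mean


def knowledge_estimate(order, pre_scores, post_scores):
    """Estimate a group's knowledge at the time of the simulation.

    order: "simulation_first" or "lecture_first" (hypothetical labels).
    """
    if order == "simulation_first":
        # The lecture had not yet been given, so prelecture scores apply.
        return mean(pre_scores)
    if order == "lecture_first":
        # The lecture preceded the simulation, so postlecture scores apply.
        return mean(post_scores)
    raise ValueError(f"unknown order: {order!r}")


# Hypothetical quiz scores (out of 10) for one month's trainees.
pre_scores = [4, 6]
post_scores = [7, 9]

estimate_sim_first = knowledge_estimate("simulation_first", pre_scores, post_scores)
estimate_lecture_first = knowledge_estimate("lecture_first", pre_scores, post_scores)
print(estimate_sim_first, estimate_lecture_first)
```

Because the estimate is an average over the whole group, it cannot capture the knowledge of the specific individuals who managed a given scenario, which is the limitation noted in the text.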
The influence of time on the scenario scores was also assessed. Figure 2 presents the simulation scores when sorted into successive quarters of the academic year and when the scenario scores are plotted sequentially during the course of the experiment. There seems to be no significant improvement during the course of the academic year for either interns or residents. When all scores throughout the experimental period are viewed in sequence, the scores generated by interns steadily decline during the 41-month period, whereas the average resident scores were unchanged.
On surveys administered at the conclusion of the simulation workshop, a mean (SD) of 90% (0.1%) of participants recorded either “agree” or “strongly agree” to the question of whether they “felt the simulation environment and scenarios prompted realistic responses.” Responses regarding the realism of the scenarios did not differ between the groups with the highest (top 5) and lowest (bottom 5 of 23) levels of performance for either interns or resident teams. Likewise, responses regarding realism did not differ between “simulation first” and “lecture first” groups.
Simulation techniques have created exciting opportunities for the enhancement of medical education; however, their use as a tool to evaluate aspects of traditional medical education is relatively novel. This is an important goal because the ultimate purpose of providing medical knowledge to trainee clinicians is in fact to allow them to conduct patient care effectively.
Veracity of Findings
From the survey data after simulations and voluntary comments concerning the quality of lectures, it seems that the data are a fairly credible representation of participants’ behavior. We therefore believe that the results require consideration of a number of factors related to lectures and classroom instruction including quality, context, and cognitive processing. The results also prompt consideration of our own goals for such instruction and how realistic it is to expect behavioral change from these lectures.
Could the results be explained by a lecture poor in quality or content? All items present on the technical scoring checklist were discussed in the corresponding lecture and queried on the quiz. Residents have repeatedly identified the lectures as a high-quality, relevant, and valued component of the ICU rotation. However, it is interesting that no group was able to achieve an average of 10/10 on the postlecture quiz.
Could temporal trends influence results? Our expectation that collective knowledge increases over time or increases by the number of months into internship year seems to be unfounded (Fig. 2). The latter is somewhat surprising because we assumed that there was some exposure to the management concepts under question in other rotations before an intern’s ICU rotation and that such experiences were likely to accumulate during the course of the academic year.
Is it possible that the performance seen in the simulation underrates true clinical performance? This is certainly a concern of many programs offering clinical simulation exercises and is the reason we go to great lengths to familiarize the participants with the simulator, to use the same equipment used in the ICU, and to have a “facilitator nurse” who knows where supplies are, so that participants will not be frustrated by an unfamiliar environment. What we cannot replicate are the skin, motor movements, and affect of true patients. Participants can address questions about skin color and warmth, capillary refill, pupils, neck veins, and mucous membranes to the simulation operators and receive the information via an overhead speaker. Nonetheless, we grant that some may use these and other clues to form a subconscious picture of a patient and his clinical course that may not be possible with a mannequin.
Our scoring rubric presents a list of ideal actions and behaviors, but even executing them perfectly cannot guarantee a good outcome in all cases. Similarly, the sequence of actions (such as provision of fluids before vasopressors) could be important in real life, whereas our scoring system did not distinguish between order of interventions. Although we hold the belief that simulation approximates real-life behavior, it is still true that physicians, patients, and the health care system in general probably behave differently when real lives are at stake. Although we make great efforts to recreate the “motivational structure” of the real world in simulations, it is impossible to really know whether this has been achieved.
Implications of Findings
In phase II, all trainees at multiple levels demonstrated an increase in theoretical knowledge on the quiz, so the lack of transfer to the simulated patient was completely unexpected. We assumed that the lecture would have a positive impact on the technical scores of the interns. This assumption was based on the idea that interns would have had the least experience managing critical illness, and therefore, their management of sepsis would be influenced by instruction to a degree greater than that of more senior residents. Supporting this notion is the fact that quiz scores showed the greatest improvement in lower levels of training (Fig. 1). Our finding is not a unique observation. For example, Rodgers et al16 rated performance of nursing students during simulations of cardiac arrests and found no association between simulator performance and score on previously administered tests. The present study adds to the latter work by showing a similar disconnection between acquired and deployable knowledge in multiple levels of postgraduate trainees. The degree of standardization of the scenarios for all participants and consistency of content among the lecture, quiz, simulation and rating tool are unique to this study and improve the credibility of the experimental findings.
We have no clear explanation as to why apparent uptake of clinically relevant material did not manifest itself as higher simulation performance. A number of explanations exist and may be reasonable targets for further investigation. Perhaps the knowledge gained needs time to be processed, consolidated, and compared with existing mental models of disease management before becoming deployable. It is possible that the conditions presented to the trainees during the simulation were sufficiently stressful that a slower, more deliberate method of processing the new information was required but not easily accessible.17–19 As postcourse surveys and discussions do reveal that the trainees consider the simulations intense and challenging,20 this explanation seems plausible.
A contrasting explanation is that the day or two separating the lecture and the simulation was too great an interval and that more immediate practice and hands-on experience with newly learned material is a requisite for reliable incorporation into practice. For example, “just in time” and “booster” training provided immediately before simulations improves the efficacy of chest compressions in simulated pediatric arrests.21,22 Unfortunately, it is hard to predict what type of patient problem will present at a given moment, so pairing training to immediate clinical need is impractical. Nonetheless, the sequence of instruction followed by guided practice may provide superior training over separation of these components and deserves further evaluation.
We started this experiment hoping to show that the content of lectures could be altered to create improvements in clinical care and ended up questioning the value of lectures in general. Lectures, both in the traditional in-person form and in the video-based “flipped classroom” paradigm, have been thought to have significant value in conveying “knows” and “knows how” information at a rather low cost. Hundreds of learners can be in the audience in person; an unlimited number might view the lecture on video. Yet, if such activities do not result in deployable knowledge and skill, this leverage is useless.
If lectures do not work, then what does? Trainees may need to be “shown the way” to a greater degree than we are willing to admit. In any event, with resident work hour limitations in place, there is an added premium on understanding the impact of educational programs on patient care and finding the most efficient means to increase the effectiveness of physicians. Future work in this area should use focused interviews and other qualitative techniques to better understand the behaviors and cognitive processes that underlie performance at the different ends of the spectrum. Educational programs should then be shaped around the factors associated with high performance. Measurement systems, such as the one presented here, may help evaluate performance of education programs and with further refinement might predict clinical outcomes. For example, some actions may prove to have greater impact than others in managing certain clinical problems, and any rating system would need to weight these accordingly. In addition, as mentioned earlier, the sequence of actions is often important and so might need to be more closely evaluated.
This study used a standardized simulation as a surrogate for clinical performance and hypothesized that a carefully designed lecture would lead to quantitative improvements in clinical performance. Instead, we showed that performance scores of intensive care interns and residents managing a simulated case of septic shock were no better when a lecture/workshop on the same disease process preceded the simulation, even though the trainees demonstrated uptake of the lecture content. Lectures and group discussions are the de facto standard for imparting medical knowledge. Despite such status, the transfer of knowledge between lecture and bedside care has not been formally proven. Ironically, proof of efficacy is a burden placed on the proponents of simulation and immersive learning as these techniques attempt to find a niche in the instructional repertoire. It is difficult to completely discard the value of classroom instruction; however, this and future studies may prove it to be more effective as a complementary technique rather than the sine qua non of health care education.
1. Angus DC, Linde-Zwirble WT, Lidicker J, Clermont G, Carcillo J, Pinsky MR. Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care. Crit Care Med 2001; 29: 1303–1310.
2. Rivers E, Nguyen B, Havstad S, et al.; Early Goal-Directed Therapy Collaborative Group. Early goal-directed therapy in the treatment of severe sepsis and septic shock. N Engl J Med 2001; 345: 1368–1377.
3. Dellinger RP, Carlet JM, Masur H, et al.; Surviving Sepsis Campaign Management Guidelines Committee. Surviving Sepsis Campaign guidelines for management of severe sepsis and septic shock. Crit Care Med 2004; 32: 858–873.
4. Dellinger RP, Levy MM, Carlet JM, et al.; International Surviving Sepsis Campaign Guidelines Committee; American Association of Critical-Care Nurses; American College of Chest Physicians; American College of Emergency Physicians; Canadian Critical Care Society; European Society of Clinical Microbiology and Infectious Diseases; European Society of Intensive Care Medicine; European Respiratory Society; International Sepsis Forum; Japanese Association for Acute Medicine; Japanese Society of Intensive Care Medicine; Society of Critical Care Medicine; Society of Hospital Medicine; Surgical Infection Society; World Federation of Societies of Intensive and Critical Care Medicine. Surviving Sepsis Campaign: international guidelines for management of severe sepsis and septic shock: 2008. Crit Care Med 2008; 36: 296–327.
5. Dellinger RP, Levy MM, Rhodes A, et al.; Surviving Sepsis Campaign Guidelines Committee including the Pediatric Subgroup. Surviving sepsis campaign: international guidelines for management of severe sepsis and septic shock: 2012. Crit Care Med 2013; 41: 580–637.
6. Miller GE. The assessment of clinical skills/competence/performance. Acad Med 1990; 65: S63–S67.
7. Lighthall GK, Barr J, Howard SK, et al. Use of a fully simulated intensive care unit environment for critical event management training for internal medicine residents. Crit Care Med 2003; 31: 2437–2443.
8. Ottestad E, Boulet JR, Lighthall GK. Evaluating the management of septic shock using patient simulation. Crit Care Med 2007; 35: 769–775.
9. Lighthall GK, Harrison TK. A controllable patient monitor for classroom video projectors. Simul Healthc 2010; 5: 58–60.
10. Gaba DM. Improving anesthesiologists’ performance by simulating reality. Anesthesiology 1992; 76: 491–494.
11. Flin R, Maran N. Identifying and training non-technical skills for teams in acute medicine. Qual Saf Health Care 2004; 13(suppl 1): i80–i84.
12. Fletcher G, Flin R, McGeorge P, Glavin R, Maran N, Patey R. Anaesthetists’ Non-Technical Skills (ANTS): evaluation of a behavioural marker system. Br J Anaesth 2003; 90: 580–588.
13. Helmreich RL, Merritt AC, Wilhelm JA. The evolution of Crew Resource Management training in commercial aviation. Int J Aviat Psychol 1999; 9: 19–32.
14. DeVita MA, Schaefer J, Lutz J, Dongilli T, Wang H. Improving medical crisis team performance. Crit Care Med 2004; 32: S61–S65.
15. Davis D, Evans M, Jadad A, et al. The case for knowledge translation: shortening the journey from evidence to effect. BMJ 2003; 327: 33–35.
16. Rodgers DL, Bhanji F, McKee BR. Written evaluation is not a predictor for skills performance in an Advanced Cardiovascular Life Support course. Resuscitation 2010; 81: 453–456.
17. Stiegler MP, Gaba DM. Decision-making and cognitive strategies. Simul Healthc 2015; 10: 133–138.
18. Croskerry P. Clinical cognition and diagnostic error: applications of a dual process model of reasoning. Adv Health Sci Educ Theory Pract 2009; 14(suppl 1): 27–35.
19. Kahneman D. Thinking Fast and Slow. New York: Farrar, Straus and Giroux; 2011.
20. Lighthall GK, Barr J. The use of clinical simulation systems to train critical care physicians. J Intensive Care Med 2007; 22: 257–269.
21. Cheng A, Brown LL, Duff JP, et al.; International Network for Simulation-Based Pediatric Innovation, Research, & Education (INSPIRE) CPR Investigators. Improving cardiopulmonary resuscitation with a CPR feedback device and refresher simulations (CPR CARES Study): a randomized clinical trial. JAMA Pediatr 2015; 169: 137–144.
22. Sutton RM, Niles D, Meaney PA, et al. “Booster” training: evaluation of instructor-led bedside cardiopulmonary resuscitation skill training and automated corrective feedback to improve cardiopulmonary resuscitation compliance of Pediatric Basic Life Support providers during simulated cardiac arrest. Pediatr Crit Care Med 2011; 12: e116–e121.