
Empirical Investigations

Comparisons of Stress Physiology of Providers in Real-Life Resuscitations and Virtual Reality–Simulated Resuscitations

Chang, Todd P. MD, MAcM; Beshay, Youssef BS; Hollinger, Trevor BS; Sherman, Joshua M. MD

Simulation in Healthcare: The Journal of the Society for Simulation in Healthcare: April 2019 - Volume 14 - Issue 2 - p 104-112
doi: 10.1097/SIH.0000000000000356


Head-mounted display virtual reality (VR) is an immersive simulation experience providing a 360-degree virtual environment. Virtual reality has the capacity to provide a high level of audiovisual immersion because it allows the real environment to be completely occluded.1,2 Careful development of the audiovisual environment enables VR simulation to fully immerse the subject as a form of experiential learning. The immersive realism of VR environments can be used as desensitization therapy, such as for posttraumatic stress disorder.3 In healthcare training, VR can replicate scenarios that involve many stressful audiovisual stimuli.

Resuscitations are stressful healthcare events that are frequently explored in mannequin-based simulation because of their high-stakes, low-frequency nature. These scenarios include critical situations ranging from neurological emergencies to cardiopulmonary arrest and are known to be a source of significant mental load4 and stress. Much of the stress experienced by the resuscitation team leader comes from the need for rapid information processing, situational awareness, and decision-making, rather than physical tasks, which are left to other team members.4

Stress in real resuscitations has been measured both subjectively and through physiological changes, which include heart rate (HR) and biomarkers such as cortisol levels.5–7 Physiological measures serve as correlates of behavioral measures or manifestations of psychological constructs. In addition, simulations that effectively replicate a busy scenario have also been shown to induce similar physiological changes, including increases in HR, narrowing in HR variability (HRV), and increased levels of biomarkers such as amylase and cortisol.8–11 Other studies of stressful environments have even used pupillary changes as markers.12

However, studies that compare real and simulated resuscitations are rare; Dias and Scalabrini Neto6 and Daglius Dias and Scalabrini Neto13 found no difference in stress physiology markers (HR and amylase) between mannequin-based and real resuscitations led by internal medicine residents, although both HR and amylase levels rose significantly in both environments.6,13 These physiological changes are indirect evidence of psychological fidelity or realism, which is the ability of the simulation to produce the emotions and environmental stress inherent in the actual event.14–16 Although data by Dias and Scalabrini Neto6 and Daglius Dias and Scalabrini Neto13 seem promising for mannequin-based simulations, no study has demonstrated whether the audiovisual immersion of VR could cause physiological changes similar to reality. In theory, if a simulation were to perfectly replicate the real ED environment, then participants in that simulation should experience the same HR change as they would in reality (although the magnitude of individual HR responses will vary based on the presence of confounding variables such as total years of experience or specific experiences with acute situations).17

The aim of this study was to describe provider physiological stress changes within VR resuscitations developed specifically to recreate the high mental workload of leading a pediatric emergency department (ED) resuscitation. We hypothesized that the physiological measures of HR and salivary cortisol would be equivalent between VR resuscitations and real ED resuscitations when measured among pediatric emergency medicine (PEM) providers.



This pilot study occurred in two phases at a single institution. The ED phase was conducted in an urban, tertiary care pediatric ED with 82,000 visits and approximately 400 critical resuscitations annually. Data were collected from August 2016 to February 2017. The VR phase occurred in an office removed from the ED, during nonclinical hours from March 2017 to June 2017. The study was approved by the institutional review board.


Inclusion criteria were physicians either board-certified or board-eligible in PEM: attendings and fellows. Exclusion criteria included preexisting heart or adrenal conditions or pregnancy in either phase of the study.18–20 T.P.C. and J.M.S. were excluded from analysis because of their knowledge of the VR scenarios.

Virtual Reality Development

Virtual reality scenarios were developed using the Unity 3D (Unity Technologies, San Francisco, CA) platform by dedicated developers and programmers (a.i.Solve Ltd, London, England, UK; BioFlightVR, Culver City, CA) for use on the Oculus Rift Touch (Oculus from Facebook, Menlo Park, CA) head-mounted system. The two physician authors (T.P.C., J.M.S.) acted as subject matter experts. The scenarios were designed to reflect two common pediatric resuscitation scenarios (infant status epilepticus and pediatric anaphylactic shock) and situated the user at the foot of the bed as code leader. Both scenarios had significant airway, breathing, or circulation problems that matched an emergency severity index (ESI) 1 or 2 resuscitation. Hand-held controllers allowed the user to select appropriate physical examinations and treatment options. Both scenarios used branched-chain algorithms to alter the virtual patient's physiology beneficially or adversely depending on the user's actions. Avatars representing the nurse and respiratory therapist provided verbal cues that followed these algorithms. Development was completed over 7 months with multiple iterations and feedback from all team members.

A tutorial to orient users to VR and the Rift Touch controllers was also developed. Scenario difficulty for this study was maximized to require intubation and cricothyrotomy, respectively. Additional environmental distractors including monitor noises, extraneous overhead pages, and a distraught mother in the periphery were also used for this study (Figs. 1, 2). These distractors were chosen based on the authors' experience of common audiovisual stressors accompanying pediatric resuscitations that were possible to render within VR.

Figures 1 and 2. Screenshots of head-mounted VR resuscitation scenarios for pediatric anaphylactic shock.

Emergency Department Phase

Subjects were monitored for four or more 8-hour ED shifts starting between 10:00 and 16:00 to minimize diurnal cortisol variation. Subjects were fitted with a Hexoskin (Montreal, QC, Canada) suit under normal work clothes, which recorded HRs using Bluetooth and still allowed normal movement free of wires or tethered devices.21 Salivary cortisol samples were collected at three scheduled points during an ED shift: 0 hours (preshift), +4 hours (midshift), and +8 hours (after shift). Salivary samples were collected using oral swabs (Salimetrics, LLC, Carlsbad, CA) and frozen at −20°C for 24 hours, then at −80°C before transport.11,22 Dietary and caffeine habits, although they could influence physiologic parameters,23 were left unchanged so that we captured a normal shift without affecting usual provider habits. Although salivary cortisol studies often require subjects to fast and not use caffeine,24 we felt it unethical to require normal patient care while fasting or uncaffeinated. A nonclinical research assistant (RA) monitored the patient census and the subject from a different room throughout each subject's shift.

Additional physiological samples were taken during events. Events were defined as any patient triaged as ESI level 1 (any chief complaint) or level 2 (with a chief complaint of allergic reaction or seizure). These levels match the ESI levels intended in the two VR scenarios. An event was tracked only when the subject was a direct participant (eg, code leader or airway manager) in the event. An additional salivary sample was collected within 1 hour by the RA using the procedure described previously, and a paper NASA Task Load Index (NASA-TLX) form was administered to determine the perceived workload.25,26 NASA-TLX scores have been used in other medical contexts to evaluate the workload of providers in resuscitation events.27 Events occurring within 1 hour of a scheduled salivary sample replaced the scheduled sample collections. Because these ESI level 1 or 2 events were unpredictable, the number of events differed by subject.

Virtual Reality Phase

Subjects participated in a single 1-hour VR trial, scheduled between 10:00 and 15:00, to run through two VR scenarios; the order of the two scenarios was randomized. Subjects wore Hexoskins for continuous HR collection and had three scheduled salivary cortisol measurements. Before entering VR, subjects were asked to color for 10 minutes while a salivary cortisol sample was obtained. Virtual reality controls were explained by the RA using a standardized script and illustrative screenshots. The subject then entered a VR tutorial and proceeded to the first scenario. A 10-minute interlude out of VR followed, during which the subject documented their medical care, completed a paper-based NASA-TLX, and provided a salivary cortisol sample. During this interlude, ambient ED noises were played to prevent relaxation. Subjects then re-entered VR and repeated the tutorial before completing the second scenario. They then exited VR to complete further documentation, a NASA-TLX, and their final salivary cortisol sample.


The primary independent variable was modality: defined as either “ED”—during a real shift—or “VR”—during a virtual scenario. In addition, confounding variables included the following: training level, video game experience, average daily caffeine intake,23 clinical experience, and β-blocker or β-agonist use.

The primary outcome variable of HR, reported in beats per minute (bpm), was extracted using Vivosense software (Vivonetics, Newport Coast, CA). This software pulled electrocardiogram data, removed erroneous HRs due to artifacts, and then calculated an HR for every beat. Five minutes of continuous HRs were averaged during the time of salivary cortisol collection to derive a mean HR. For events, the highest 5-minute interval of continuous HR between the event and the cortisol collection was used.
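The two HR summaries described above can be illustrated with a minimal sketch. It is not the Vivosense processing itself: it assumes time-stamped HR samples as `(seconds, bpm)` pairs, and the function names, the 10-second search step, and the example data are all hypothetical.

```python
from statistics import mean

WINDOW_S = 5 * 60  # 5-minute averaging window, as described in the text


def window_mean(samples, start_s, window_s=WINDOW_S):
    """Mean HR of samples with start_s <= t < start_s + window_s (None if empty)."""
    values = [hr for t, hr in samples if start_s <= t < start_s + window_s]
    return mean(values) if values else None


def highest_window_mean(samples, event_s, collection_s, window_s=WINDOW_S, step_s=10):
    """Highest 5-minute mean HR among windows lying between event and collection.

    The 10-second step between candidate windows is an assumption made for
    this sketch; the source does not state how the windows were searched.
    """
    best = None
    start = event_s
    while start + window_s <= collection_s:
        m = window_mean(samples, start, window_s)
        if m is not None and (best is None or m > best):
            best = m
        start += step_s
    return best


# Hypothetical data: one sample per second for 20 minutes,
# 80 bpm at rest with a 110-bpm plateau from minute 5 to minute 10.
samples = [(t, 110.0 if 300 <= t < 600 else 80.0) for t in range(1200)]

resting_hr = window_mean(samples, 900)            # window at a scheduled sample
event_hr = highest_window_mean(samples, 0, 1200)  # peak window after an event
print(resting_hr, event_hr)
```

Using the highest window for events and a single window for scheduled samples mirrors the asymmetry in the text; any artifact removal would have to happen before this step.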

The primary outcome variable of salivary cortisol level was processed by Salimetrics at their laboratory in two aliquots and reported as an average in micrograms per deciliter. Salivary cortisol and HR were both measured at the same time interval while the subject was still (sitting, typing).

Secondary outcome variables included the raw (unweighted) NASA-TLX score, calculated as the mean of all six NASA-TLX subscores, each expressed numerically between 0 and 100.25,28 We also collected individual subscores in the NASA-TLX construct domains. Higher scores indicated higher mental workload.
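As a small illustration of the raw score, the sketch below averages six subscores on a 0 to 100 scale. The dictionary keys and the example ratings are hypothetical and are not study data.

```python
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Raw (unweighted) NASA-TLX: the plain mean of the six subscores."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscores: {missing}")
    for name in SUBSCALES:
        if not 0 <= ratings[name] <= 100:
            raise ValueError(f"{name} must lie between 0 and 100")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)


# Hypothetical ratings for one scenario
example = {
    "mental_demand": 70,
    "physical_demand": 10,
    "temporal_demand": 80,
    "performance": 40,
    "effort": 60,
    "frustration": 40,
}
score = raw_tlx(example)
print(score)  # 50.0
```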

Data Analysis

Sample size was fixed at the number of eligible subjects. The null hypothesis for the equivalence study was that there was a difference between the delta HR in the ED and the delta HR in VR, defined respectively as the HR difference between an event and a nonevent in the ED and the HR difference between VR and pre-VR. We used a 10% difference from a neutral HR of 80 bpm and a standard deviation extrapolated from Hartanto et al's29 study of stressful virtual scenarios, which corresponded to an 8 ± 11-bpm expected difference (Cohen's d of 0.73), to define the threshold below which responses would be considered equivalent. This was set as the quantitative boundary above which the null hypothesis could no longer be refuted. With this quantitative margin of 72 to 88 bpm, an α value of 0.05, and a power of 0.8, the required sample size was 18.30,31 A post hoc power analysis was planned.
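The arithmetic behind the equivalence margin can be checked with a short sketch. The variable names are illustrative. The sketch reproduces only the margin and the effect size, not the sample size of 18, because the exact power calculation is not specified here.

```python
neutral_hr = 80.0       # bpm, the neutral HR stated in the text
relative_margin = 0.10  # 10% difference
sd = 11.0               # bpm, standard deviation extrapolated from Hartanto et al

margin = neutral_hr * relative_margin  # expected difference in bpm
lower = neutral_hr - margin            # lower bound of the margin
upper = neutral_hr + margin            # upper bound of the margin
cohens_d = margin / sd                 # standardized effect size

print(f"margin = {margin:.0f} bpm, range = {lower:.0f} to {upper:.0f} bpm")
print(f"Cohen's d = {cohens_d:.2f}")
```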

Descriptive statistics summarized the shifts and outcome variables, using medians and interquartile ranges (IQR) to characterize pooled variables. Data for the two VR scenarios were averaged. Analyses were then split into five parts.

First, we described any trends in HRs and cortisol levels during an ED shift without events using a Friedman test.

Second, we analyzed whether a shift with an event in the ED resulted in higher HRs or cortisol levels than a shift without an event. We used a Wilcoxon signed rank test to determine a within-subject change. This difference was quantified using a Hodges-Lehmann approach with a 95% confidence interval (CI) and termed the delta HR and delta cortisol, respectively.5,32
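For readers unfamiliar with this point estimate, the following sketch computes the one-sample Hodges-Lehmann estimate for paired differences as the median of all Walsh averages. The paired values are hypothetical, and the sketch omits the CI and the signed rank P value, which the study obtained from SPSS.

```python
from statistics import median


def hodges_lehmann(diffs):
    """One-sample Hodges-Lehmann estimate: median of all Walsh averages.

    A Walsh average is (d_i + d_j) / 2 for every pair with i <= j,
    which includes each difference paired with itself.
    """
    n = len(diffs)
    if n == 0:
        raise ValueError("at least one paired difference is required")
    walsh = [(diffs[i] + diffs[j]) / 2 for i in range(n) for j in range(i, n)]
    return median(walsh)


# Hypothetical within-subject HRs (bpm): event vs nonevent
event_hr = [92.0, 101.0, 88.0, 110.0, 95.0]
nonevent_hr = [80.0, 85.0, 82.0, 90.0, 84.0]

diffs = [e - b for e, b in zip(event_hr, nonevent_hr)]
delta_hr = hodges_lehmann(diffs)
print(delta_hr)  # 13.0
```

Unlike a simple median of the differences, this estimate pairs naturally with the Wilcoxon signed rank test because both are built on the same ranking of within-subject differences.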

Third, we did a similar analysis in the VR phase, comparing mean HRs and cortisol levels during the two VR resuscitations to the pre-VR HR and cortisol level. A Wilcoxon signed rank test was also used, and delta HR and delta cortisol values for this phase were also calculated.

Fourth, we performed equivalence testing using 95% CI of the median to test our primary hypothesis of equivalence: the null hypothesis was that there would be a difference, and the alternative hypothesis was that there was not a difference in either direction, thus concluding equivalence.30 This was done by comparing the 95% CI of the median delta HR during ED phase to the 95% CI of the median delta HR in the VR phase. Overlap in the CIs of the two delta HRs would have suggested equivalence at two-tailed α of 0.05.33 Because of an expected low sample size, we also used a reverse test as described by Parkhurst,34 in which the null and alternative hypotheses were exchanged, to look for significant differences. The reverse test examined the null hypothesis of no difference between delta HRs from the ED vs VR and an alternative hypothesis that there would be a difference. This determines whether a failed equivalence was likely due to inadequate sample size (if both tests fail to disprove their null hypotheses) or if there was a statistical difference instead.34 The reverse test used a Wilcoxon signed rank between matched pairs.

The delta HRs were considered equivalent and not significantly different if the CIs demonstrated an overlap and the two-tailed P value of the Wilcoxon signed rank test was greater than 0.05. The delta HRs were considered different and not equivalent if the CIs did not overlap, and a two-tailed Wilcoxon signed rank test was significant at the 0.05 level. If neither test yielded significant results, insufficient sample size would be suspected. An identical analysis was conducted for delta cortisol levels.

Finally, differences in NASA-TLX scores and subscores between the two modalities were assessed using a Wilcoxon signed rank test. Spearman rank correlation (ρ) was used to test the relationships between NASA-TLX scores and HRs and cortisol levels in both phases. Analyses were performed on SPSS version 24 (IBM Corporation, Armonk, NY).
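A minimal sketch of the Spearman rank correlation follows, computed as the Pearson correlation of average ranks so that ties are handled. The function names and the example values are hypothetical; the study itself used SPSS, and no P value is computed here.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(vx * vy)


# Hypothetical NASA-TLX scores and cortisol levels (μg/dL) for five events
tlx = [35, 50, 42, 70, 61]
cortisol = [0.08, 0.15, 0.12, 0.21, 0.30]
rho = spearman_rho(tlx, cortisol)
print(round(rho, 2))  # 0.9
```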


Descriptive Results: Subjects and ED Shifts

Sixteen of 29 eligible subjects were enrolled in this study; Figure 3 shows the inclusion, exclusion, and enrollment numbers. Among the 16 subjects, 69 shifts were monitored for this study. Nine shifts (spread among 4 subjects, 2 of whom had complete data loss) had total missing HR data because of technical issues; no such issues were found with cortisol levels procured during those nine shifts. Fourteen of the 16 subjects experienced a cumulative total of 31 events. Thirteen events overlapped with scheduled cortisol procurement times. Table 1 and Table 2 provide further details.

Figure 3. The CONSORT diagram of enrolled subjects.
Table 1. Demographic Data for 16 Subjects
Table 2. Shift Characteristics for ED Phase (n = 69 shifts)

Trends in HRs and Cortisol Levels During an ED Shift Without Events

Both median HRs and median cortisol levels fell throughout the evening shifts when no events occurred (P < 0.001, Table 3).

Table 3. Heart Rates and Cortisol Levels: Pooled Median (IQR)

Emergency Department Phase: Events vs Nonevents

When comparing all 31 events with nonevent values, both the delta HR and delta cortisol values were significantly positive. Delta HR was +13.9 bpm (95% CI, 9.5–18.3, P < 0.001), and delta cortisol was +0.10 μg/dL (95% CI, 0.03–0.27, P = 0.006).

Virtual Reality Phase: VR Scenario vs Baseline

The delta HR value was +6.5 bpm (95% CI, 4.6–9.7, P < 0.001). This did not differ between the two randomization schemes (P = 0.8). The delta for before and after cortisol levels only trended toward significance: −0.02 μg/dL (95% CI, −0.05 to 0, P = 0.05).

Primary Hypothesis: Equivalence Testing of Delta HR and Cortisol Between ED Phase and VR Phase

For HR, the delta HR in the ED phase spanned a 95% CI of the median from 9.16 to 20.40. In the VR phase, the 95% CI of median delta HR was 2.25 to 7.90. For cortisol levels, the delta cortisol in the ED phase had a 95% CI of the median between 0.03 and 0.21. The 95% CI of median delta cortisol in the VR phase was −0.08 to −0.01. A post hoc power analysis yielded an achieved power of 0.61 using delta HR as the outcome variable.

Reverse Test: Testing of Differences Between Delta HR and Cortisol Between ED Phase and VR Phase34

The delta HRs were significantly different (P = 0.023) as were the delta cortisol levels (P = 0.004), with larger increases occurring in the ED phase. Both outcome variables were different and not equivalent between the ED phase and the VR phase.
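The decision rule stated in the Data Analysis section can be sketched as follows and applied to the reported intervals and P values. The function names and labels are illustrative. The case of overlapping CIs with a significant reverse test is not addressed by the stated rule, so labeling it as conflicting is an assumption of this sketch.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (low, high) and b = (low, high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify(ci_ed, ci_vr, p_reverse, alpha=0.05):
    """Combine the CI-overlap equivalence check with the reverse test."""
    overlap = intervals_overlap(ci_ed, ci_vr)
    significant = p_reverse < alpha  # a P value equal to alpha is not significant
    if overlap and not significant:
        return "equivalent"
    if not overlap and significant:
        return "different"
    if not overlap and not significant:
        return "insufficient sample size suspected"
    return "conflicting"  # assumption: this case is not covered by the stated rule


# 95% CIs of the median deltas and reverse-test P values reported above
hr_result = classify((9.16, 20.40), (2.25, 7.90), 0.023)
cortisol_result = classify((0.03, 0.21), (-0.08, -0.01), 0.004)
print(hr_result, cortisol_result)  # different different
```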

Secondary Outcomes – NASA-TLX

Subjects had higher NASA-TLX scores for VR scenarios than they did for real events, with a +26-point score difference (95% CI, +9.4 to +41.1, P = 0.01). Among the subscores, temporal demand, performance, and frustration scores demonstrated higher workload measurements, whereas mental demand, physical demand, and effort scores did not differ between VR and the real resuscitations. Table 4 shows the comparisons.

Table 4. Median (IQR) NASA-TLX Scores and Subscores

Heart rates during events did not correlate with NASA-TLX perceived workload (P > 0.13). However, the mean cortisol levels and delta cortisol values correlated with the mean event NASA-TLX workloads (ρ = 0.58, P = 0.04, and ρ = 0.61, P = 0.03) in the ED.

Years of clinical experience correlated positively with the summative NASA-TLX scores for both VR and ED phases (ρ = 0.53, P = 0.03, and ρ = 0.56, P = 0.05). It did not account for the differences in HR or cortisol levels or deltas (P > 0.07).

Cortisol levels strongly correlated with HRs at all measurement points for events and VR scenarios (ρ > 0.65, P < 0.006).

Additional demographic characteristics were not associated with changes in HR or cortisol level. These included sex (P > 0.2), reported caffeine intake (P > 0.1), and video game experience (P > 0.2). Video game experience was also not associated with NASA-TLX workload values within VR (P = 0.8).


Our data support the notion that resuscitation events in either modality trigger some HR increases, but we did not find equivalence of HR or cortisol level increases between VR and real resuscitations. In this pilot study, we explore the implications of these HR and cortisol findings and their relationship to psychological fidelity.

Fidelity is an important concept for medical simulation.14,16 Key to the learning process in simulation, whether mannequin-based or VR-based, is the provision of experiential learning.35,36 The realism for the context matters. Perfect fidelity is not necessary, though, and selective attention to fidelity to maximize learning can be effective37 and cost-saving for VR programming. In this study, we sought objective measures of psychological fidelity by extending physiological measurements previously performed in mannequin-based simulation to VR-based simulation training.8,11,21,38 We did not directly compare a mannequin-based simulation to a VR simulation, because these are complementary simulation modalities rather than replacements for each other.1

Heart Rates

Emergency providers in resuscitations and emergency medical procedures demonstrate increased HRs as a marker of the stress of the resuscitation. For example, emergency medicine residents increased HRs by greater than 50 bpm for resuscitations in one study,6 and the increase was associated with the number of procedures required. Our data also demonstrate significant HR increases after both real resuscitation events and VR-simulated events, with a higher delta in the real resuscitation. Our delta is not as high as that of Dias and Scalabrini Neto6 and Daglius Dias and Scalabrini Neto13 for two reasons. First, their methodology used a maximum HR rather than our HR averaged for 5 minutes, which likely accounts for their very high delta. Second, our calculation method of the delta was also unique, in that we used nonresuscitative shifts at the same times of the day as a baseline comparison, rather than a relaxed state before measurement. The device choice (a fitted Hexoskin shirt that otherwise does not restrict movement21 rather than a standard Holter monitor) may also have led to a difference. The shirt was designed for athletic activities; because physical activity could have inflated HRs, the data reflect only times when subjects were still (and typing) for 5 minutes.

That HRs went up regardless of the real or simulated event can be counted as some evidence of psychological fidelity for a stressful resuscitation. Our original hypothesis of an equivalent HR change was not substantiated; instead, the reverse test found a statistically significant difference between the delta HRs: +13.9 bpm in the ED vs +6.5 bpm in VR.34 It is possible that the VR version was still less overwhelming than the real events or that the real ED shift added stressors that were not replicable within a VR session (eg, additional patients, stressful patient and staff interactions, or a long day). An HR delta of +6.5 bpm is comparable with other literature describing HR changes in stressful VR scenarios, which describe +4- to +7-bpm changes for adult subjects undergoing a VR “blind date” or a VR “job interview.”29 Extremely high HR could represent overwhelming workload or stress, which can impede appropriate learning and reflection.39–41 On the other hand, a small but consistent rise in HR of +6.5 bpm provides promise that VR is an engaging simulation modality beyond passive learning measures.42 It is unknown what this entails for the future of VR as desensitization therapy, training simulation, or assessment, and further study on physiological stress in education is still warranted.

Cortisol Levels

Salivary cortisol levels have concordance with gold standard plasma cortisol levels.24 Within the field of resuscitation, cortisol levels have been studied in emergency medicine residents, in whom they correlate (r = 0.4) with perceived stress for an entire shift. Interestingly, no significant correlations between “near-miss” events and cortisol levels were found,5 which differs from our slight increase in cortisol levels in ED events.

Our VR simulations did not increase cortisol levels appreciably. Bong et al's11 mannequin-based simulation led to a median delta cortisol level of +0.09 μg/dL among gastroenterology fellows and attendings in a team resuscitation setting. It is possible that our PEM providers have higher experience levels with resuscitation events and that their cortisol response was thus blunted in the simulated setting. Alternatively, there may be a team communication element that led to increased cortisol levels in the gastroenterology crisis simulations. The VR in this study was meant to focus on the perspective of a resuscitation leader and did not require verbal communication.

That the VR cortisol response did not match the responses found in real resuscitation events should not be discouraging for the VR. If the VR is meant as an educational tool (eg, learning resuscitation skills), high delta cortisol levels may be deleterious. For example, delta cortisol levels in military subjects of +0.8 μg/dL are associated with poorer recall and information processing, particularly in “visuospatial declarative memory recall.”43 These deficits persisted even to the next day.43 The literature linking cortisol responses—ie, the delta cortisol values in our study—to optimal learning suggests that the cortisol level itself is unlikely to be the sole factor or marker. The magnitude of cortisol response is not tied to magnitude of learning or new memory formation.44 The optimal cortisol level for a simulation to be useful for learning is yet to be found8 and is unlikely to be a simple sliding scale.

Perceived Workload vs Physiological Markers

Of note, our NASA-TLX scores did not correlate with any objective measures, although the overall score and three subscores were substantially higher in the VR phase. We propose that the VR controls posed significant challenges among the cohort of seasoned physicians, as supported by the NASA-TLX differences, particularly frustration. Within our cohort of seasoned physicians with experience at a high-acuity pediatric ED, the ED environment and system are very familiar and within their expertise, whereas the VR controls, buttons, and maneuvering within VR are quite unfamiliar. No subjects had used head-mounted VR before this study. It is possible that rising frustration with the VR controls among the more seasoned subjects caused an increase in physiological stress that negated the decrease in stress we would otherwise expect based on their greater experience with the medical management of acute scenarios. This, in turn, led to the lack of difference we observed. We did not, however, find video game experience to affect HR, cortisol, or NASA-TLX score.

Finally, frustration within VR may have led subjects to disengage from the scenario, which could also explain the minimal change in cortisol levels. Other mannequin-based simulations have shown increases in cortisol levels, so it is worth considering in future studies why our strategy of VR did not elevate cortisol levels to the extent that other simulations did.11,13 No change in cortisol level does not mean that the VR is unusable, because the “minimum” required stress within a simulation for optimal learning is still elusive.8


The study has several limitations. Cortisol levels are prone to diurnal variation, particularly among individuals with unpatterned day-night work schedules, as well as to caffeine and dietary changes, none of which we could ethically control for during the ED shifts.23 Although we accounted for shift times to tighten the diurnal cortisol level patterns, we could not adjust for whether a night shift was done before a monitored shift.45 In addition, it is probable that caffeine intake was higher during an actual shift (ED phase) than during an off day (VR phase). Untracked caffeine intake may account for a higher delta HR in the ED than while within VR, although it should have likewise affected all other HRs during the ED phase. Though validated in other studies,21 the HRs we obtained through the Hexoskin required some manipulation to remove artifact values caused by vigorous movement, and two subjects experienced technical difficulties that required the exclusion of their HR data.

Heart rate variability, though used in other studies to characterize physiological stress, was not calculated for this study. Delta HRs seem to capture similar data as narrowed HRV because they are both influenced by increased autonomic sympathetic responses.46 However, future studies should embrace using HRV as a measurement within VR simulations.

Coloring was introduced to obtain a baseline HR and cortisol level during the VR phase but was not replicated for logistical reasons before the ED shifts. Although this may explain the higher HRs preshift, coloring did not artificially reduce the pre-VR HR sufficiently to support an equivalence hypothesis.

Experience is a known confounder for how physiological changes manifest with stress, and the different levels of ED and resuscitation experience may have affected HR or cortisol levels; for example, Joseph et al17 documented blunted HRV among attending trauma surgeons compared with residents during trauma resuscitations as a measurement of physiological stress, despite no correlation to subjectively reported stress. In addition, some ED practitioners are part-time or carry a shift load that may be light or heavy, which may influence their perception of the resuscitation beyond simple years of experience. Although these factors are likely to have influenced our raw results, we specifically chose a matched-pairs analysis to account for these confounders; that is, using each subject as his or her own control should have reduced the effect of experience on the delta HRs or delta cortisol levels. An experienced practitioner who would be predicted to have a blunted HR response in the ED should also have a blunted HR response in VR had the VR been precisely identical to the ED shift.

Because this study focused on provider physiologies, patient data such as type of resuscitation and patient outcomes were not collected. Although the ESI triage system has reliability and validity in a pediatric ED,47 it is possible that overtriaging may have occurred with our level 1 or 2 patients. Theoretically, overtriaging would have lowered the sympathetic response and would not have produced such high HR and cortisol changes in the ED.

Finally, our equivalence-based primary hypothesis did not have sufficient sample size, particularly for the HR outcome variable, for which subjects also had to be removed because of technical failures. Although this raises the question of insufficient power, a significant difference in both delta HRs and delta cortisol levels on the reverse test suggests that sample size was not the issue; rather, there was a significant difference in outcomes instead of equivalence.34

Future Directions

Because of the novelty of VR as a simulation modality, best practices for its development and use are currently unknown. It has promise with its strong audiovisual capabilities, but questions on its use for training, assessment, or desensitization remain. Future research areas include dosing and frequency, features unique to VR that enable learning transfer, and strategies for debriefing when using VR as the simulation modality. In addition, strategies to mitigate the novelty and “foreign” feel of the VR system are needed if VR is to be a viable simulation modality.


Heart rate increases occur in simulated VR resuscitations but are smaller in magnitude than those in real resuscitations. The implications of this lower physiological stress response in VR for learning and mental load could be explored further by examining the effects of focused changes to the VR simulations on stress and learning outcomes.


The authors thank Anita Schmidt, MPH, Teva Brender, BS, Jori Richman, BA, and Phung K Pham, MS, for their hard work and dedication. The authors also thank the virtual reality collaborators from BioFlightVR, a.i.Solve Ltd, and Oculus from Facebook.


1. Chang T, Pusic MV, Gerard JL. Screen-based Simulation and Virtual Reality. In: Cheng A, Grant VJ, eds. Comprehensive Healthcare Simulation - Pediatrics. 1st ed. Switzerland: Springer International Publishing; 2016:686.
2. Chang TP, Weiner D. Screen-based simulation and virtual reality for pediatric emergency medicine. Clin Pediatr Emerg Med 2016;17:224–230.
3. McLay RN, Graap K, Spira J, et al. Development and testing of virtual reality exposure therapy for post-traumatic stress disorder in active duty service members who served in Iraq and Afghanistan. Mil Med 2012;177:635–642.
4. White MR, Braund H, Howes D, et al. Getting inside the expert's head: an analysis of physician cognitive processes during trauma resuscitations. Ann Emerg Med 2018.
5. Arnetz BB, Lewalski P, Arnetz J, Breejen K, Przyklenk K. Examining self-reported and biological stress and near misses among emergency medicine residents: a single-centre cross-sectional assessment in the USA. BMJ Open 2017;7:e016479.
6. Dias RD, Scalabrini Neto A. Acute stress in residents during emergency care: a study of personal and situational factors. Stress 2017;20:241–248.
7. Bong CL, Fraser K, Oriot D. Cognitive load and stress in simulation. In: Cheng A, Grant VJ, eds. Comprehensive Healthcare Simulation - Pediatrics. 1st ed. Switzerland: Springer International Publishing; 2016:3–17.
8. Ades A, Nadkarni V, Nishisaki A. Stress in simulation-based resuscitation education: seeking objective measures to spit and swallow? Pediatr Crit Care Med 2017;18:487–489.
9. Ghazali DA, Ragot S, Breque C, et al. Randomized controlled trial of multidisciplinary team stress and performance in immersive simulation for management of infant in shock: study protocol. Scand J Trauma Resusc Emerg Med 2016;24:36.
10. Harvey A, Bandiera G, Nathens AB, LeBlanc VR. Impact of stress on resident performance in simulated trauma scenarios. J Trauma Acute Care Surg 2012;72:497–503.
11. Bong CL, Lightdale JR, Fredette ME, Weinstock P. Effects of simulation versus traditional tutorial-based training on physiologic stress levels among clinicians: a pilot study. Simul Healthc 2010;5:272–278.
12. Szulewski A, Gegenfurtner A, Howes DW, Sivilotti MLA, van Merriënboer JJG. Measuring physician cognitive load: validity evidence for a physiologic and a psychometric tool. Adv Health Sci Educ Theory Pract 2017;22:951–968.
13. Daglius Dias R, Scalabrini Neto A. Stress levels during emergency care: a comparison between reality and simulated scenarios. J Crit Care 2016;33:8–13.
14. Curtis MT, DiazGranados D, Feldman M. Judicious use of simulation technology in continuing medical education. J Contin Educ Health Prof 2012;32:255–260.
15. Rudolph JW, Simon R, Raemer DB. Which reality matters? Questions on the path to high engagement in healthcare simulation. Simul Healthc 2007;2:161–163.
16. Dieckmann P, Gaba D, Rall M. Deepening the theoretical foundations of patient simulation as social practice. Simul Healthc 2007;2:183–193.
17. Joseph B, Parvaneh S, Swartz T, et al. Stress among surgical attending physicians and trainees: a quantitative assessment during trauma activation and emergency surgeries. J Trauma Acute Care Surg 2016;81:723–728.
18. De Weerth C, Buitelaar JK. Physiological stress reactivity in human pregnancy—a review. Neurosci Biobehav Rev 2005;29:295–312.
19. Federenko IS, Wolf JM, Wüst S, et al. Parity does not alter baseline or stimulated activity of the hypothalamus-pituitary-adrenal axis in women. Dev Psychobiol 2006;48:703–711.
20. Kudielka BM, Hellhammer DH, Wüst S. Why do we respond so differently? Reviewing determinants of human salivary cortisol responses to challenge. Psychoneuroendocrinology 2009;34:2–18.
21. Villar R, Beltrame T, Hughson RL. Validation of the Hexoskin wearable vest during lying, sitting, standing, and walking activities. Appl Physiol Nutr Metab 2015;40:1019–1024.
22. Expanded Range High Sensitivity Salivary Cortisol Enzyme Immunoassay Kit Instructions. 2016. Available at: Accessed May 20, 2018.
23. Lovallo WR, Farag NH, Vincent AS, Thomas TL, Wilson MF. Cortisol responses to mental stress, exercise, and meals following caffeine intake in men and women. Pharmacol Biochem Behav 2006;83:441–447.
24. Hellhammer DH, Wüst S, Kudielka BM. Salivary cortisol as a biomarker in stress research. Psychoneuroendocrinology 2009;34:163–171.
25. Hart SG, Staveland LE. Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research. In: Hancock PA, Meshkati N, eds. Advances in Psychology: North-Holland; 1988:139–183.
26. Levin S, France DJ, Hemphill R, et al. Tracking workload in the emergency department. Hum Factors 2006;48:526–539.
27. Parsons SE, Carter EA, Waterhouse LJ, Sarcevic A, O'Connell KJ, Burd RS. Assessment of workload during pediatric trauma resuscitation. J Trauma Acute Care Surg 2012;73:1267–1272.
28. Hart SG. NASA-Task Load Index (NASA-TLX); 20 years later. Proceedings of the Human Factors and Ergonomics Society Annual Meeting. Los Angeles, CA: Sage Publications; 2006:904–908.
29. Hartanto D, Kampmann IL, Morina N, Emmelkamp PG, Neerincx MA, Brinkman WP. Controlling social stress in virtual reality environments. PLoS One 2014;9:e92804.
30. Greene CJ, Morland LA, Durkalski VL, Frueh BC. Noninferiority and equivalence designs: issues and implications for mental health research. J Trauma Stress 2008;21:433–439.
31. Walker E, Nowacki AS. Understanding equivalence and noninferiority testing. J Gen Intern Med 2011;26:192–196.
32. Hunziker S, Semmer NK, Tschan F, Schuetz P, Mueller B, Marsch S. Dynamics and association of different acute stress markers with performance during a simulated resuscitation. Resuscitation 2012;83:572–578.
33. Cumming G. Inference by eye: reading the overlap of independent confidence intervals. Stat Med 2009;28:205–220.
34. Parkhurst DF. Statistical significance tests: equivalence and reverse tests should reduce misinterpretation: equivalence tests improve the logic of significance testing when demonstrating similarity is important, and reverse tests can help show that failure to reject a null hypothesis does not support that hypothesis. Bioscience 2001;51:1051–1057.
35. Kolb D. Experiential Learning: Experience as the Source of Learning and Development. Englewood Cliffs, NJ: Prentice-Hall; 1984.
36. Issenberg SB, McGaghie WC, Hart IR, et al. Simulation technology for health care professional skills training and assessment. JAMA 1999;282:861–866.
37. Norman G, Dore K, Grierson L. The minimal relationship between simulation fidelity and transfer of learning. Med Educ 2012;46:636–647.
38. Lizotte MH, Janvier A, Latraverse V, et al. The impact of neonatal simulations on trainees' stress and performance: a parallel-group randomized trial. Pediatr Crit Care Med 2017;18:434–441.
39. Fraser K, Ma I, Teteris E, Baxter H, Wright B, McLaughlin K. Emotion, cognitive load and learning outcomes during simulation training. Med Educ 2012;46:1055–1062.
40. Fraser KL, Ayres P, Sweller J. Cognitive load theory for the design of medical simulations. Simul Healthc 2015;10:295–307.
41. Holzinger A, Kickmeier-Rust MD, Wassertheurer S, Hessinger M. Learning performance with interactive simulations in medical education: lessons learned from results of learning complex physiological models with the HAEMOdynamics SIMulator. Comput Educ 2009;52:292–301.
42. Issenberg SB, McGaghie WC, Petrusa ER, Lee Gordon D, Scalese RJ. Features and uses of high-fidelity medical simulations that lead to effective learning: a BEME systematic review. Med Teach 2005;27:10–28.
43. Taverniers J, Van Ruysseveldt J, Smeets T, von Grumbkow J. High-intensity stress elicits robust cortisol increases, and impairs working memory and visuo-spatial declarative memory in Special Forces candidates: a field experiment. Stress 2010;13:323–333.
44. Shields GS, Sazma MA, McCullough AM, Yonelinas AP. The effects of acute stress on episodic memory: a meta-analysis and integrative review. Psychol Bull 2017;143:636–675.
45. Machi MS, Staum M, Callaway CW, et al. The relationship between shift work, sleep, and cognition in career emergency physicians. Acad Emerg Med 2012;19:85–91.
46. Slamon N, Penfil SH, Nadkarni VM, Parker RM. A prospective pilot study of the biometrics of critical care practitioners during live patient care using a wearable “Smart Shirt”. J Intensive Crit Care 2018;4:10.
47. Green NA, Durani Y, Brecher D, DePiero A, Loiselle J, Attia M. Emergency Severity Index version 4: a valid and reliable tool in pediatric emergency department triage. Pediatr Emerg Care 2012;28:753–757.

Key Words: Virtual reality; resuscitation; emergency medicine; pediatric emergency medicine; hydrocortisone; simulation; stress physiology

Copyright © 2019 Society for Simulation in Healthcare