Effects of Low- Versus High-Fidelity Simulations on the Cognitive Burden and Performance of Entry-Level Paramedicine Students

A Mixed-Methods Comparison Trial Using Eye-Tracking, Continuous Heart Rate, Difficulty Rating Scales, Video Observation and Interviews

Mills, Brennen W. BSc (Hons); Carter, Owen B.-J. DPsych; Rudd, Cobie J. PhD; Claxton, Louise A. RN; Ross, Nathan P. BSc; Strobel, Natalie A. PhD

doi: 10.1097/SIH.0000000000000119
Empirical Investigations

Introduction High-fidelity simulation-based training is often avoided for early-stage students because of the assumption that while practicing newly learned skills, they are ill suited to processing multiple demands, which can lead to “cognitive overload” and poorer learning outcomes. We tested this assumption using a mixed-methods experimental design manipulating psychological immersion.

Methods Thirty-nine randomly assigned first-year paramedicine students completed low- or high-environmental fidelity simulations [low–environmental fidelity simulations (LFenS) vs. high–environmental fidelity simulation (HFenS)] involving a manikin with obstructed airway (SimMan3G). Psychological immersion and cognitive burden were determined via continuous heart rate, eye tracking, self-report questionnaire (National Aeronautics and Space Administration Task Load Index), independent observation, and postsimulation interviews. Performance was assessed by successful location of obstruction and time-to-termination.

Results Eye tracking confirmed that students attended to multiple, concurrent stimuli in HFenS and interviews consistently suggested that they experienced greater psychological immersion and cognitive burden than their LFenS counterparts. This was confirmed by significantly higher mean heart rate (P < 0.001) and National Aeronautics and Space Administration Task Load Index mental demand (P < 0.05). Although group allocation did not influence the proportion of students who ultimately revived the patient (58% vs. 30%, P < 0.10), the HFenS students did so significantly more quickly (P < 0.01). The LFenS students had low immersion resulting in greater assessment anxiety.

Conclusions High–environmental fidelity simulation engendered immersion and a sense of urgency in students, whereas LFenS created assessment anxiety and slower performance. We conclude that once early-stage students have learned the basics of a clinical skill, throwing them in the “deep end” of high-fidelity simulation creates significant additional cognitive burden but this has considerable educational merit.

From the Office of the Deputy Vice Chancellor (Strategic Partnerships) (B.W.M., O.B.J.C., C.J.R.) and School of Medical Sciences (L.A.C.), Edith Cowan University, Joondalup Western Australia; School of Medicine (N.P.R.), Deakin University, Geelong, Victoria, Australia; and School of Paediatrics and Child Health (N.S.), University of Western Australia, Crawley, Western Australia.

Reprints: Brennen Mills, BSc, Office of the Deputy Vice Chancellor (Strategic Partnerships), Edith Cowan University, 270 Joondalup Drive, Edith Cowan University, Joondalup, WA, Australia, 6027 (e-mail:

Supported by the Australian Government Department of Health grant G1001297.

The authors declare no conflict of interest.

N.P.R. is formerly affiliated with School of Medical Sciences, Edith Cowan University, and N.S. is formerly affiliated with Office of the Pro Vice Chancellor (Health Advancement), Edith Cowan University.

When educating healthcare students, the extent to which simulation-based learning environments should attempt to replicate the dynamic aspects of real-world settings is of particular interest. Typically, education experts recommend a progressive continuum from low-fidelity simulation (LFS) to high-fidelity simulation (HFS), where early-stage students learn via LFSs with minimal environmental distractions until a clinical skill is mastered, after which time students should be exposed to simulations of increasingly high fidelity with multiple concurrent stimuli that better replicate real-world demands.1,2 We are cautioned against using HFS for early-stage learners, whose inexperience makes it difficult to prioritize between multiple environmental stimuli, resulting in loss of situational awareness and cognitive overload.3 Beaubien and Baker1 exemplify this stance, stating “We implore (educators) to at least explore the use of lower-fidelity alternatives, especially during the earliest phases of…skill acquisition.”(p55) These recommendations are consistent with the challenge point framework (CPF), which predicts that optimal learning is achieved when students are provided with levels of challenge that are difficult, but achievable, within their current theoretical understanding.4 The CPF predicts that performance becomes suboptimal if the challenge is set too high, causing cognitive overload, or set too low, leading to low task engagement.4 Thus, using the CPF to interpret the progressive continuum, we are warned that entry-level students might find HFS too difficult, leading to cognitive overload and suboptimal performance.

Although intuitively appealing, the evidence supporting this progressive continuum of LFS to HFS remains equivocal. It has been consistently reported that HFS training results in high levels of student satisfaction and improved confidence.5–8 However, these are subjective measures that are known to be poor predictors of students’ actual performance, as rated independently by clinical assessors.9 Systematic reviews are consistently critical of the quality of most published research investigating fidelity in simulation-based learning for relying on single-group analyses with no comparison group, comparing groups receiving unequal dosages of learning or using inappropriate comparison groups (eg, didactic learning) rather than other forms of experiential learning.5,8,10 The aim of the study was therefore to use a range of objective measures and a rigorous methodology to test the assumptions underlying the progressive continuum of fidelity for simulation-based education of healthcare students.

To operationalize simulation “fidelity,” we used the framework by Rehmann et al,11 who describe the following 3 components: equipment, environmental, and psychological fidelity. When applied to healthcare education, we operationalized “equipment fidelity” as the functionality and responsiveness of patients, manikins, and medical instruments. We operationalized “environmental fidelity” as simulated concurrent stimuli competing for participant attention that emulate demands existing in the real world. Finally, we operationalized “psychological fidelity” as the extent to which a simulation provides minimal interruption to the natural “flow” of a clinical scenario and facilitates suspension of disbelief and participant immersion within the scenario. Previous researchers suggest that psychological fidelity is usually increased by providing high equipment and/or environmental fidelity.1,12 We used this framework to interpret the underlying rationale for the progressive continuum of fidelity in simulation-based learning, essentially that the multiple extraneous stimuli in HFS refer specifically to high environmental fidelity. From this rationale, we designed a mixed-methods, 2 comparison-group trial. Environmental fidelity was manipulated between low and high levels, whereas equipment fidelity was held constant to ensure similar physical requirements for both groups, and we attempted to use a wide variety of measures to monitor the consequent effects on psychological fidelity and cognitive burden (Fig. 1). Through this paradigm, we tested the following hypotheses, based on the assumptions of the progressive continuum, that early-stage students undertaking a simulation-based clinical task in high–environmental fidelity simulations (HFenSs) compared with low–environmental fidelity simulations (LFenSs) will:



  • H1: Experience greater psychological fidelity
  • H2: Experience greater cognitive burden
  • H3: Perform the clinical task worse



Our participant pool included all students (N = 52) enrolled in a first-year paramedicine clinical skills unit entitled “Introduction to Paramedical Practice” in 2013 at Edith Cowan University (ECU), Western Australia. Recruitment took place at lectures and by way of online postings to the faculty website. Students were asked to volunteer for the study with the offer of additional simulation-based practice. None had participated in HFSs before nor did we advise them that they might be exposed to such. In total, 39 students volunteered, representing a consent rate of 80%. The sample had a mean (range) age of 22.7 (18–39) years and was 51% female. Twenty students were randomly assigned to the LFenS condition and 19 to the HFenS. The study protocol was approved by the ECU Human Ethics Committee (#9834).




Paramedicine teaching staff at ECU identified clinical conditions that could be simulated in both LFenS and HFenS while still maximizing discrimination between students’ varying levels of clinical competency. The selected clinical scenario involved dispatch to a nightclub for an arm laceration. While the student is attending to this patient, another man collapses on the dance floor with an obstructed lower airway, is nonresponsive, is not breathing, and gradually becomes pulseless after 3 minutes. Visual assessment of the airway with the standard triple-airway maneuver reveals no obvious obstruction. No chest rise results from use of the bag-valve mask, suggesting an obstructed airway. The use of a laryngoscope reveals a bottle cap that is removable with Magill forceps, enabling full patient recovery. Students had been taught these skills 3 weeks before data collection.



Equipment fidelity was held constant for both LFenS and HFenS conditions. An advanced patient simulator SimMan 3G manikin (Laerdal, Oakleigh, Australia) served as the collapsed patient with obstructed airway in both conditions. All students entered the scenario equipped with a standardized paramedicine pack, including laryngoscope and Magill forceps, and defibrillator with electrocardiographic monitor. Pilot testing revealed that early-stage paramedicine students were unfamiliar with some features of the SimMan 3G (eg, carotid and radial pulses, cyanosis indicated by blue lights on the lip). As such, a 5-minute familiarization training protocol was developed for students to undertake before the start of each scenario.



Students in both LFenS and HFenS conditions were accompanied by a confederate playing the role of “assistant paramedic,” whose task was to take direction from the student but to volunteer no advice. Both conditions took place within the same rooms of the ECU Health Simulation Centre. For LFenS, the room was well lit, quiet, and devoid of props, other than the SimMan 3G lying in the middle of the floor. For HFenS, the room was dark but featured dynamic disco lighting, a video projection of a crowd of dancers on 1 wall, and music playing loudly on a continuous loop. In addition to the SimMan 3G lying on the floor, live actors played the roles of a highly distressed girlfriend kneeling beside the SimMan 3G, an impatient bouncer, and a cantankerous drunkard. These actors delivered 5 “scenes” that were standardized across all HFenS scenarios: (1) the distressed girlfriend frantically and repeatedly asking the student things like “Is he going to be OK?” and “What’s wrong with him?”; (2) the drunkard loudly entering from the “toilet” door and asking what was happening; (3) the drunkard trying to “help” by rummaging through the paramedic pack; (4) the bouncer pacing about impatiently asking the student “How much longer is this going to take?”; and (5) the bouncer trying to forcefully remove the drunkard, both entering into a scuffle and the bouncer pepper spraying the drunkard, resulting in screams of pain and his eventual removal from the room. Actors were required to stick closely to the standard script but were permitted sufficient creative license to preserve the natural flow of each scenario. Actors, including the confederate, could provide standardized information to students upon request (eg, asking the girlfriend “How much has he had to drink tonight?”). For LFenS, students were instructed to direct such questions to the confederate.



Psychological Fidelity


Reporting participant “immersion” in simulations has previously been achieved through self-completed questionnaires13 or qualitative enquiry.14 Although these are subjective measures, we believe that they are appropriate given that psychological fidelity is largely a subjective construct. Thus, immediately after each simulation, we held face-to-face interviews with students of approximately 10 minutes’ duration. We used a pragmatic, action-research oriented, interpretive inquiry approach15 using nonteaching staff to conduct the interviews and assured participants of anonymity both on the consent form and again at the beginning of the interviews. The interviewers explored with students how realistic they found the equipment and environmental aspects of the simulations, the extent to which they felt “immersed,” and how they felt the various aspects of the simulation affected their performance. These interviews were video recorded and transcribed, and the transcripts were then sorted and thematically analyzed using QSR NVivo (v.10) software within the conceptual framework for our study. We also attempted to triangulate our qualitative data with objective measures to infer quantitative differences in psychological fidelity.

Physiological Arousal

To obtain an objective measure of students’ arousal, continuous heart rate (HR) data were collected at 5-second intervals during the simulations using a Polar s610i watch and chest strap (Polar, Kempele, Finland). Cardiovascular reactivity has previously been demonstrated as a robust measure of arousal in sedentary simulated environments (eg, driving and flight simulators).16 However, our students were physically engaged within their simulated environment, potentially introducing confounders to the HR data. As such, we deliberately kept the requirement for physical engagement equal between the comparison groups by holding equipment fidelity constant and ensuring that our manipulation of environmental fidelity included differences in social but not physical interaction. Hence, we reasoned that any differences in HR reactivity between LFenS and HFenS could be reasonably attributed to differences in psychological fidelity alone, reflecting arousal due to participants’ mental and/or emotional reactions.
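
As an illustration of how HR reactivity could be derived from readings taken at 5-second intervals, the following is a minimal sketch in Python. It assumes a baseline window of 60 seconds immediately before the simulation starts (the window reported with the HR results). The function name, argument names, and example readings are hypothetical and are not taken from the study.

```python
from statistics import mean

SAMPLE_INTERVAL_S = 5      # HR sampled every 5 seconds (from the text)
BASELINE_WINDOW_S = 60     # assumed baseline: 60 seconds before simulation start


def hr_change_from_baseline(samples, sim_start_index):
    """Return mean simulation HR minus mean baseline HR (beats per minute).

    samples: list of HR readings, one per 5-second interval.
    sim_start_index: index of the first reading taken during the simulation.
    """
    n_baseline = BASELINE_WINDOW_S // SAMPLE_INTERVAL_S  # 12 readings
    if sim_start_index < n_baseline:
        raise ValueError("need at least 60 s of readings before the simulation")
    baseline = samples[sim_start_index - n_baseline:sim_start_index]
    simulation = samples[sim_start_index:]
    if not simulation:
        raise ValueError("no readings recorded during the simulation")
    return mean(simulation) - mean(baseline)


# Hypothetical readings for illustration only (not study data).
example = [100] * 12 + [110] * 24
change = hr_change_from_baseline(example, sim_start_index=12)
```

A positive value indicates a rise over baseline during the simulation; a negative value indicates a fall.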


Cognitive Burden


Perceived cognitive burden was determined using 2 methods. The first was via interviews as described previously (see Psychological Fidelity) wherein students were asked to describe their ability to think clearly during their simulations and how this affected their performance. The second was via the National Aeronautics and Space Administration Task Load Index (NASA-TLX), a paper-and-pencil instrument that required students to rate their perceived burden of the simulation along 20-point scales for 6 dimensions of demand including mental, physical, temporal, performance, effort, and frustration (Table 1). The NASA-TLX underwent a rigorous, 3-year development period17 and has since featured in more than 2850 studies.18 It has previously been used in studies of aviation simulation19 and perceived workloads in the health industry (eg, as shown in Refs. 20–22) and has demonstrated good retest reliability, good internal consistency, and good structural validity.23




Video footage of the simulations was reviewed to identify cases of students appearing to suffer cognitive overload, on the basis of which we formed an operational definition with the following 3 criteria: (1) verbal indications of cognitive burden such as long successions of “um”s and “err”s or statements such as “What’s wrong with my brain today?”; (2) being inactive and indecisive for 10 or more seconds; and (3) providing no instructions to the confederate or other actors during this period. Examples of students meeting these criteria were shown to a convenience sample of teaching staff who confirmed the face validity of the measure (ie, it seemed to be a good indicator of cognitive overload). On the basis of this definition, 2 judges then independently reviewed all students’ video recordings and noted episodes meeting these criteria. Identification agreement was 100%, negating the necessity to calculate an intraclass correlation coefficient.

Eye Tracking

All students wore ASL Mobile Eye eye-tracking glasses (Applied Science Laboratories, Bedford, MA; Fig. 2) to monitor the extent to which students in the HFenS were successfully manipulated into attending to the additional, concurrent stimuli provided within their condition and how these competed with the stimuli also available in LFenS. There is some contention within the literature surrounding eye fixations and attention, in that visual attention (ie, where one looks) may not necessarily reflect actual attention. Although there is no guarantee that visual attention encompasses the entirety of neurological attention, the work by Finke et al,24 who demonstrated the concurrent validity of visual attention with 4 established clinical tests of neurological attention, suggests that visual attention can be used, at the very least, as a rough guide to actual attention. We also reasoned that if it could be demonstrated that HFenS students were forced to attend to a greater variety of stimuli than LFenS students, by logical extension, it would have increased their relative cognitive burden.



To maximize objectivity and minimize bias, each student’s eye-tracking footage was coded independently by 2 raters using Studiocode (v5.8.3) software to code frame-by-frame with clearly defined criteria. Raters quantified visual fixations on specific stimuli within the simulated environment, including the manikin, confederate, and equipment for both groups plus the girlfriend, bouncer, and drunkard in HFenS. A fixation was defined as a student’s gaze remaining on a single object for more than 400 milliseconds (or 12 sequential frames at 30 Hz).25 A high degree of reliability was found between our coders with an intraclass correlation coefficient of 0.993 (F37 = 150.435; P < 0.001).
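
The fixation rule above can be sketched in code. The following Python example is only an illustration of the stated threshold (12 sequential frames at 30 Hz), not a description of how Studiocode or the raters actually worked. It assumes each video frame has already been given a single object label, and it treats a run of 12 or more identical labels as a fixation. The function names, the label strings, and the example frames are hypothetical.

```python
from itertools import groupby

FRAME_RATE_HZ = 30
MIN_FIXATION_FRAMES = 12   # 12 frames at 30 Hz = 400 ms


def find_fixations(frame_labels):
    """Collapse per-frame gaze labels into fixations.

    frame_labels: sequence of object labels, one per video frame
                  (None where gaze is on no coded object).
    Returns a list of (label, start_frame, duration_seconds) tuples for
    every run of identical labels lasting at least MIN_FIXATION_FRAMES.
    """
    fixations = []
    position = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        if label is not None and length >= MIN_FIXATION_FRAMES:
            fixations.append((label, position, length / FRAME_RATE_HZ))
        position += length
    return fixations


def proportion_of_time(fixations, label, total_frames):
    """Share of the whole recording spent fixating on one label."""
    fixated_s = sum(d for name, _, d in fixations if name == label)
    return fixated_s / (total_frames / FRAME_RATE_HZ)


# Hypothetical coding of 60 frames (2 seconds), for illustration only.
frames = ["manikin"] * 30 + ["equipment"] * 6 + [None] * 9 + ["confederate"] * 15
fixations = find_fixations(frames)
manikin_share = proportion_of_time(fixations, "manikin", len(frames))
```

In this example the 6-frame glance at the equipment is too short to count, so only the manikin and confederate runs are returned as fixations.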



We used 2 objective measures of students’ performance. The first was a simple dichotomous indicator of whether students successfully located and removed the airway obstruction (ie, yes/no). The second was a continuous measure of time-to-termination, based on the principle that greater cognitive burden would inhibit student ability to make efficient clinical decisions. Time-to-termination was calculated from when students first entered the room to treat the collapsed patient to either successful resuscitation or termination of the simulation if the patient was pulseless and the student had not detected the obstruction in the airway after 2 rounds of 5 cycles of cardiopulmonary resuscitation and attempted to commence a third round without further investigation of the airway (see Australian resuscitation protocols26,27).


Statistical Analysis

Fisher exact tests were used to conduct between-group comparisons of dichotomous variables (successful removal of obstruction, instances of cognitive overload). Independent samples t tests were used to examine between-group comparisons of HR and time-to-termination. For analyses containing multiple dependent variables (eye-tracking, NASA-TLX subscores), multivariate analyses of variance (MANOVAs) were used to compensate for the inflated risk of type 1 error.28 We used G*Power (v.3.1) to estimate that our sample size of 39 would be sufficient to detect an effect size of Cohen d = 0.8 (α = 0.05; β = 0.20), being equivalent to a difference between groups’ means (SD) of 8 (10) beats per minute for our HR measure, 1 (1.25) minute for our time-to-termination measure, and ±1 (1.2) on the 20-point NASA-TLX rating scales.
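
The equivalence between Cohen d and the raw differences quoted above follows from dividing each mean difference by its SD. A minimal Python sketch of that arithmetic is shown below. The function name and dictionary keys are illustrative, and the SD is assumed to be the common within-group SD for each measure.

```python
def cohens_d(mean_difference, sd):
    """Standardized mean difference: raw difference divided by the SD."""
    return mean_difference / sd


# (difference, SD) pairs quoted in the text for each measure.
measures = {
    "heart rate (beats per minute)": (8, 10),
    "time-to-termination (minutes)": (1, 1.25),
    "NASA-TLX rating (points)": (1, 1.2),
}
effect_sizes = {name: cohens_d(diff, sd) for name, (diff, sd) in measures.items()}
```

The HR and time-to-termination pairs both give 0.8; the NASA-TLX pair gives a slightly larger value of about 0.83.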




Our environmental manipulation clearly made an impression on the HFenS group who evidenced a high level of immersion within the simulations, for example, “Yeah, wow, that was intense!,” “My heart was pounding really fast!,” and “It was pretty, you know, like powerful having all that stuff going on in the background.” Many HFenS students described experiencing cognitive burden associated with the additional stimuli, for example, “I got distracted so easily and my mind was going a mile a minute,” “I think the fact that there were bystanders there made it a little bit difficult to focus on the systematic approach,” and “it was very, like, you know – bang here, bang there – the simulation was very realistic – I was a bit overwhelmed.” Although generally describing the simulation as stressful, many appreciated the chance to practice under realistic conditions, for example, “You have to learn to deal with distractions so they were OK” and “It was distracting but I guess when you can see that your number 1 priority is the patient you can kind of zone all that other stuff out.”

This contrasted sharply with comments from the LFenS group who described very low levels of immersion, for example, “It was quite quiet for me, it was a bit – you know – not real” and “If, for example, there was really the patient’s girlfriend there I think that would have been better.” The LFenS students also considered the simulations to be stressful because of assessment anxiety rather than cognitive burden, for example, “The fact that I was the only one there I really felt like the pressure was on me” and “Even though she (the confederate) is there saying ‘just tell me what to do,’ I kind of know she’s assessing me.” This manifested itself in students being self-conscious and primarily preoccupied with assessment rather than urgency to “save” the patient in the scenario, suggesting low psychological fidelity, for example, “I psyched myself out in this scenario because I was trying so hard to do everything perfect, which took ages” and “In our assessments, we have to do everything textbook or we get told off, no matter how long it takes.” One of the HFenS students made a complementary observation: “The realism (of the HFenS) gives you a bit of a kick in the arse, I suppose. I’ve gone through OSCEs (Objective Structured Clinical Examinations) before and a couple of them have been – not blase – but probably a bit too slow. Whereas this one here (HFenS), you’re kind of ‘F***! This guy’s on the ground – we’ve got to try and do something quick.’”


Physiological Arousal

One day of HR data were lost because of equipment failure, but fortunately this affected both groups equally (LFenS n = 7; HFenS n = 7). Moreover, this meets the criteria for “data missing completely at random” and is therefore unlikely to have biased results.29 Baseline HR data for the 60 seconds immediately before simulation commencement suggested no significant differences between LFenS and HFenS groups (t26 = 1.387; P = 0.177) but some anticipation anxiety [mean (SD) = 103.1 (16.50) beats per minute]. Upon commencement of the simulation, the mean HR of HFenS students rose significantly over baseline [mean (SD) = +11.9 (8.3) beats per minute] compared with LFenS students [mean (SD) = −2.4 (11.6) beats per minute; t25 = 3.679; P < 0.001; Fig. 3]. One-sample t tests confirmed that the average HR of HFenS students was significantly greater than their baseline (t12 = 5.191; P < 0.001) but there was no significant difference over baseline for the LFenS group (t13 = −0.785; P = 0.446).
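
As a rough check on how t statistics of this kind arise, they can be approximated from the reported means and SDs. The Python sketch below uses the textbook one-sample and pooled-variance two-sample formulas. The group sizes are not stated directly in this passage, so they are assumed from the reported degrees of freedom (t12 implies 13 HFenS students, t13 implies 14 LFenS students). The function names are illustrative.

```python
from math import sqrt


def one_sample_t(mean_change, sd, n):
    """t statistic for testing whether a mean change differs from zero."""
    return mean_change / (sd / sqrt(n))


def pooled_two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups with a pooled variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Means and SDs as reported; group sizes are ASSUMED from the reported
# degrees of freedom (t12 -> 13 HFenS students, t13 -> 14 LFenS students).
N_HFENS, N_LFENS = 13, 14
t_hfens = one_sample_t(11.9, 8.3, N_HFENS)
t_lfens = one_sample_t(-2.4, 11.6, N_LFENS)
t_between, df_between = pooled_two_sample_t(
    11.9, 8.3, N_HFENS, -2.4, 11.6, N_LFENS
)
```

Under these assumptions the sketch gives roughly 5.17, −0.77, and 3.66 on 25 degrees of freedom, close to the reported 5.191, −0.785, and 3.679. Small differences are expected because the inputs are rounded summary values.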




Perceived Difficulty

The NASA-TLX data suggested that students in both groups perceived the clinical scenario to be most mentally challenging and least physically challenging (Table 1). Of the 6 dimensions, only mental demand (“How mentally demanding did you find the task?”) significantly differed between groups, being rated greater by HFenS than LFenS students (P = 0.013).



In total, 6 students met our criteria for cognitive overload, including 2 from LFenS and 4 from HFenS (Fisher exact test, P = 0.352). The mean (SD) duration of these occurrences was 28.3 (19.2) seconds and did not differ significantly between groups (t4 = 0.189; P = 0.860). As a consequence, these students’ time-to-termination was on average 32.1 seconds longer in comparison to other students (7.7 vs. 7.1 minutes, respectively) but this difference was not statistically significant (t36 = 0.643; P = 0.524). Despite suffering apparent episodes of cognitive overload, 4 of the 6 students went on to successfully revive the patient, including 1 from LFenS and 3 from HFenS (Fisher exact test, P = 0.600). A greater proportion of students identified as suffering cognitive overload revived the patient (66.7%) than of students who did not suffer cognitive overload (39.4%), but this difference was not statistically significant (Fisher exact test, P = 0.374).


Eye Tracking

Because of hardware malfunction, eye-tracking data were lost for 1 student in the LFenS group. The data for the remaining 38 students can be seen in Table 2. Students in both conditions fixated on the manikin to a similar extent, accounting for approximately one-third of their time-to-termination. Just under another third of each group’s time was spent attending to the remaining actors and instruments. Similar proportions of time were spent fixating on the confederate in each group (∼5%). The HFenS students spent approximately another 5% attending to the other 3 actors within the room, who were not present for the LFenS group. However, rather than spending less time in fixation overall, the LFenS group spent nearly 10% more of their time fixating on instruments compared with the HFenS group (t36 = 2.218; P = 0.024). In the LFenS group, the mean (SD) duration of each instrument fixation was 10.2 (4.5) seconds compared with 6.8 (3.5) seconds for the HFenS group (t36 = 2.641; P = 0.012).





Just under half of all students (n = 17; 43%) successfully revived the patient. Fewer students in LFenS (n = 6; 30%) were successful than in HFenS (n = 11; 58%), but this difference did not achieve statistical significance (Fisher exact test, P = 0.076). The continuous measure of time-to-termination indicated that students in LFenS took a mean (SD) of 8.0 (1.8) minutes compared with students in HFenS taking 5.7 (2.5) minutes, a 2.3-minute difference that was statistically significant (t36 = 2.736; P = 0.010). These data were not skewed (−0.251) but suffered from some kurtosis (−1.107). Successful patient recovery generally triggered swifter simulation termination than failure to locate the airway obstruction (8.0 vs. 6.6 minutes; t36 = 2.371; P = 0.023). Therefore, we re-examined the average time-to-termination just for those students in each group who successfully located the airway obstruction. The mean time of the LFenS students (n = 6) was unaffected, still taking a similar mean (SD) of 7.9 (1.7) minutes compared with their unsuccessful LFenS counterparts (t17 = 0.150; P = 0.882). In comparison, successful HFenS students (n = 11) took a mean (SD) of 4.7 (2.5) minutes, being 1 minute quicker than their unsuccessful HFenS counterparts (t17 = 2.302; P = 0.034), and also remaining significantly quicker than their successful LFenS counterparts (t15 = 2.541; P = 0.023).
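
To make the Fisher exact comparison of revival counts concrete, the following Python sketch computes the test from the hypergeometric distribution using only the standard library. It uses the counts given above (11 of 19 HFenS students and 6 of 20 LFenS students revived the patient). The function names are illustrative, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention, assumed rather than taken from the text.

```python
from math import comb


def hypergeom_pmf(k, successes, group_size, total):
    """P(exactly k successes fall in a group of group_size), margins fixed."""
    return (comb(successes, k) * comb(total - successes, group_size - k)
            / comb(total, group_size))


def fisher_exact(a, b, c, d):
    """Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns (upper_tail_p, two_sided_p). The two-sided value sums every
    table that is no more probable than the observed one.
    """
    group_size, successes, total = a + b, a + c, a + b + c + d
    lo = max(0, group_size - (total - successes))
    hi = min(group_size, successes)
    pmf = {k: hypergeom_pmf(k, successes, group_size, total)
           for k in range(lo, hi + 1)}
    upper = sum(p for k, p in pmf.items() if k >= a)
    cutoff = pmf[a] * (1 + 1e-9)   # tolerance for floating-point ties
    two_sided = sum(p for p in pmf.values() if p <= cutoff)
    return upper, two_sided


# Counts from the text: HFenS 11 revived of 19; LFenS 6 revived of 20.
upper_p, two_sided_p = fisher_exact(11, 8, 6, 14)
```

For these counts the upper-tail probability is approximately 0.076, in line with the reported P value, whereas the two-sided sum under the convention above is approximately 0.11. Whether the reported value was one-sided is an inference from this arithmetic and is not stated in the text.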



The present results provide robust indications that our manipulations of environmental and psychological fidelity were successful. The interviews provided abundant examples of HFenS students being far more distracted, cognitively burdened, rushed, and immersed within the simulations compared with their LFenS counterparts. This was highly congruent with our quantitative measures: the eye-tracking data confirmed that students noticed and attended to the additional stimuli featured in the HFenS; the HR data confirmed that this was associated with HFenS students being significantly more aroused overall than LFenS students; and the NASA-TLX data confirmed that HFenS students rated the simulations as significantly more mentally demanding. The clinical scenario also seems to have been challenging but level appropriate for our sample of early-stage students, demonstrating sufficient sensitivity to discriminate between competent and less competent students, with just under half of the pooled sample successfully “reviving” the patient. All these data consistently point to successful manipulation of our sample of early-stage students, confirming the appropriateness of the experimental paradigm to test our hypotheses.

However, before drawing any firm conclusions from our results, there are several factors to consider regarding our methodology. In terms of sampling, we recruited student volunteers, introducing the possibility of self-selection bias and making it possible that we recruited only the most keen and skilled students to the study. This may limit the generalizability of our results, but because we had a high recruitment rate (80%), we do not judge this risk to be great. It also remains possible that the students in 1 group were more skilled than the other and this alone could explain any differences observed in performance. Because we had no baseline data, we are unable to refute this and acknowledge it as a possibility. However, we deem it improbable because all our students came from a single cohort and we took great pains to ensure that students were randomly assigned to minimize the likelihood of allocation bias.

In terms of our measures, we assumed HR to be an indicator of psychological arousal but there are alternative explanations for these data, such as physical exertion, albeit we were careful to ensure no differences in physical requirements for the 2 groups. We are reassured by the fact that students in both groups rated physical demand to be equally low on the NASA-TLX and, most telling during interviews, HFenS students consistently described cognitive and emotive reactions such as “intense,” “powerful,” and “overwhelming” rather than references to physical exertion. Thus, we can be reasonably confident that there were real differences between our 2 groups with regard to manipulation of psychological fidelity.

Our observation measure of cognitive overload failed to predict performance. There is the very real possibility that our method for developing the criteria, identifying episodes, and/or our underlying assumptions about cognitive overload were flawed. Yet, the measure demonstrated good face validity—students meeting the criteria seemed outwardly to be cognitively burdened to the point of inaction to the researchers, independent judges, and a convenience sample of colleagues. The measure also demonstrated excellent interrater reliability. Yet, it is possible that students’ outward appearances of cognitive overload masked a moment of triggered contemplation allowing for better eventual clinical decision making; students may have experienced self-recognition of cognitive overload and deliberately became inactive to “regroup and gather” their thoughts, which we regard as a commendable clinical behavior. This highlights our need for a more precise definition of “cognitive overload” and reassessment of our underlying assumptions about how it affects performance. One possible improvement in how cognitive overload is measured could be through the “retrospective think aloud” method, whereby participants review their own eye-tracking footage to provide verbal commentary about their cognitions at the time.25

At face value, our measure of time-to-termination was a reasonable indication of performance: the faster students resuscitated the nonbreathing patient, the better the prognosis. However, there is a case to be made that it was less a measure of performance and more a reflection of the different motivations at play for the different groups in our study. Data from the interviews suggested that high psychological fidelity engendered a realistic sense of urgency in students. In comparison, low psychological fidelity prompted high assessment anxiety, which was associated with more meticulous behavior and less importance placed upon swift treatment. This is corroborated by the eye-tracking data suggesting that LFenS students spent nearly three-quarters of a minute longer carefully studying their equipment than their HFenS counterparts. Thus, one might argue that our time-to-termination measure was confounded by the differing motivations manipulated between groups and was biased in favor of the HFenS condition.

Our other measure of performance, patient revival, is a seemingly clear indicator of performance, but the difference between groups was not statistically significant (P = 0.076). Our original power analysis to calculate sample size was based only on our continuous variables, not our dichotomous ones. A post hoc analysis of our results, based on the effect size we observed between groups for the dichotomous patient revival measure (odds ratio [OR] = 1.93), suggested that our sample size of 39 provided only 44.2% power (ie, the study was underpowered) to detect a statistically significant difference using the Fisher exact test; 88 students would have been required to achieve the traditionally acceptable minimum of 80% power.
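As a rough illustration of this kind of post hoc calculation, the sketch below computes the exact power of a two-sided Fisher exact test for two independent groups by enumerating every possible 2 × 2 outcome. It is not the procedure used in the study, which is not described here. The function names are hypothetical, and the inputs in the usage example (group sizes of 19 and 20, revival proportions of 0.58 and 0.30) are assumptions chosen only to resemble the reported design, so the result should not be expected to reproduce the figures quoted above.

```python
from math import comb


def fisher_two_sided_p(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are (revived, not revived). The P value sums
    the hypergeometric probabilities of all tables that are no more likely
    than the observed one, given fixed margins.
    """
    n1, n2 = a + b, c + d          # group sizes
    k = a + c                      # total number revived
    total = comb(n1 + n2, k)

    def prob(x):
        # Probability that x of the k revivals fall in group 1.
        return comb(n1, x) * comb(n2, k - x) / total

    p_obs = prob(a)
    lo, hi = max(0, k - n2), min(n1, k)
    # Small relative tolerance so that ties are counted despite rounding.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def exact_power(n1, n2, p1, p2, alpha=0.05):
    """Probability that the Fisher exact test rejects at level alpha when
    the true revival probabilities are p1 (group 1) and p2 (group 2)."""
    power = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            if fisher_two_sided_p(x1, n1 - x1, x2, n2 - x2) <= alpha:
                power += binom_pmf(x1, n1, p1) * binom_pmf(x2, n2, p2)
    return power


if __name__ == "__main__":
    # Assumed, illustrative inputs; not taken from the study's analysis.
    print(f"power with 19 + 20 students: {exact_power(19, 20, 0.58, 0.30):.3f}")
    print(f"power with 44 + 44 students: {exact_power(44, 44, 0.58, 0.30):.3f}")
```

Enumerating outcomes avoids simulation noise and is cheap at these sample sizes; for much larger groups, a Monte Carlo estimate or a dedicated statistics package would be the more practical choice.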

Only with such caveats in mind can we examine whether our results supported our hypotheses. On the basis of the progressive continuum of fidelity, we hypothesized that early-stage students would experience greater psychological fidelity in HFenS compared with LFenS (H1), leading to greater cognitive burden (H2) and poorer performance (H3). Our qualitative data clearly suggested that HFenS students experienced greater psychological immersion and cognitive burden than their LFenS counterparts.

Although our data supported H1 and H2, they did not support H3. For the reasons specified previously, it is problematic to suggest that the HFenS group outperformed the LFenS group. However, we are certainly comfortable concluding that entry-level students performing recently learned clinical skills performed no worse with increased environmental and psychological fidelity.

We went to great lengths to produce cognitive burden in the HFenS group and were seemingly highly successful; these were early-stage students who had never before participated in HFSs and who had learned the skills necessary to succeed in the clinical scenario only 3 weeks earlier. Conventional wisdom associated with the progressive continuum of fidelity suggests that they should have failed miserably (eg, as shown in Refs. 1, 2, and 30). Interpreted through the lens of the CPF, our results instead suggest that HFenS students were spurred on by the additional cognitive burden, which provided a level of challenge closer to optimal and resulted in better performance, whereas LFenS students were not as challenged, resulting in suboptimal performance.

Beaubien and Baker1 strongly recommend against exposing students to HFS during the “earliest stages of skill acquisition.” Our results do not necessarily contradict this position because our students had already acquired basic proficiency. However, our data suggest that their impassioned plea may overstate the case; our results indicate that the earliest stages of skill acquisition may end quite early. The first-year students in our study seemed to have already moved beyond this threshold. Thus, we believe that, provided early-stage students have learned the basics of a clinical skill, throwing them in the “deep end” soon after has great educational merit in terms of contextualization, application under duress, and self-reflection. Essentially, we believe that early-stage students possess more resilience than has previously been assumed.

1. Beaubien JM, Baker DP. The use of simulation for training teamwork skills in health care: how low can you go? Qual Saf Health Care 2004; 13(Suppl 1): i51–i56.
2. Maran NJ, Glavin RJ. Low- to high-fidelity simulation—a continuum of medical education? Med Educ 2003; 37(Suppl 1): 22–28.
3. Wright MC, Taekman JM, Endsley MR. Objective measures of situation awareness in a simulated medical environment. Qual Saf Health Care 2004; 13(Suppl 1): i65–i71.
4. Guadagnoli M, Morin MP, Dubrowski A. The application of the challenge point framework in medical education. Med Educ 2012; 46(5): 447–453.
5. Laschinger S, Medves J, Pulling C, et al. Effectiveness of simulation on health profession students’ knowledge, skills, confidence and satisfaction. Int J Evid Based Healthc 2008; 6(3): 278–302.
6. Lapkin S, Levett-Jones T, Bellchambers H, Fernandez R. Effectiveness of patient simulation manikins in teaching clinical reasoning skills to undergraduate nursing students: a systematic review. Clin Nurs Simul 2010; 6(6): e207–e222.
7. Weaver A. High-fidelity patient simulation in nursing education: an integrative review. Nurs Educ Perspect 2011; 32(1): 37–40.
8. Cant RP, Cooper SJ. Simulation-based learning in nurse education: systematic review. J Adv Nurs 2010; 66(1): 3–15.
9. Lee-Hsieh J, Kao C, Kuo C, Tseng HF. Clinical nursing competence of RN-to-BSN students in a nursing concept-based curriculum in Taiwan. J Nurs Educ 2003; 42(12): 536–545.
10. Norman J. Systematic review of the literature on simulation in nursing education. ABNF J 2012; 23(2): 24–28.
11. Rehmann A, Mitman R, Reynolds M. A Handbook of Flight Simulation Fidelity Requirements for Human Factors Research. Technical Report No. DOT/FAA/CTN95/46. Crew Systems Ergonomics Information Analysis Center: Wright-Patterson AFB, OH; 1995.
12. Oser R, Cannon-Bowers J, Salas E, Dwyer D. Enhancing human performance in technology-rich environments: guidelines for scenario-based training. Hum Tech Interact Complex Sys 1999; 9: 175–202.
13. Witmer B, Singer M. Measuring presence in virtual environments: a presence questionnaire. Presence: Teleoperators and Virtual Environments 1998; 7(3): 225–240.
14. Rutten N, van Joolingen W, van der Veen J. The learning effects of computer simulations in science education. Comput Educ 2012; 58(1): 136–153.
15. Goldkuhl G. Pragmatism vs interpretivism in qualitative information systems research. Eur J Info Sys 2012; 21(2): 135–146.
16. Jang DP, Kim IY, Nam SW, Wiederhold BK, Wiederhold MD, Kim SI. Analysis of physiological response to two virtual environments: driving and flying simulation. Cyberpsychol Behav 2002; 5(1): 11–18.
17. Hart S, Staveland L. Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. Adv Psychol 1988; 52: 139–183.
18. Hart S. NASA-Task Load Index (NASA-TLX); 20 years later. Human Factors and Ergonomics Society 50th Annual Meeting; 2006: 904.
19. Moroney W, Biers D, Eggemeier F, Mitchell J. A comparison of two scoring procedures with the NASA task load index in a simulated flight task. Dayton, Ohio USA: Aeronautics and Electronics Conference, NAECON; 1992: 734–740.
20. Agutter J, Drews F, Syroid N, et al. Evaluation of graphic cardiovascular display in a high-fidelity simulator. Anesth Analg 2003; 97(5): 1403–1413.
21. Weinger MB, Reddy SB, Slagle JM. Multiple measures of anesthesia workload during teaching and nonteaching cases. Anesth Analg 2004; 98(5): 1419–1425.
22. Young G, Zavelina L, Hooper V. Assessment of workload using NASA Task Load Index in perianesthesia nursing. J Perianesth Nurs 2008; 23(2): 102–110.
23. Xiao YM, Wang ZM, Wang MZ, Lan YJ. The appraisal of reliability and validity of subjective workload assessment technique and NASA-task load index [in Chinese]. Zhonghua Lao Dong Wei Sheng Zhi Ye Bing Za Zhi 2005; 23(3): 178–181.
24. Finke K, Bublak P, Krummenacher J, Kyllingsbaek S, Muller HJ, Schneider WX. Usability of a theory of visual attention (TVA) for parameter-based measurement of attention I: evidence from normal subjects. J Int Neuropsychol Soc 2005; 11(7): 832–842.
25. Holmqvist K, Nyström M, Andersson R, Dewhurst R, Jarodzka H, Van de Weijer J. Eye Tracking: A Comprehensive Guide to Methods and Measures. Oxford: Oxford University Press; 2011.
26. Australian Resuscitation Council. Guidelines 8, Cardiopulmonary Resuscitation. Available at: Accessed January 01, 2015.
27. Queensland Ambulance Service. Clinical Practice Guidelines—Resuscitation Emergencies. Available at: Accessed January 01, 2015.
28. Bray J, Maxwell S. Multivariate Analysis of Variance. Beverly Hills, CA: Sage; 1985.
29. Roth P. Missing data: a conceptual review for applied psychologists. Personnel Psychol 1994; 47(3): 537–560.
30. Brydges R, Carnahan H, Rose D, Rose L, Dubrowski A. Coordinating progressive levels of simulation fidelity to maximize educational benefit. Acad Med 2010; 85(5): 806–812.

Key Words: Early-stage students; Paramedicine; Simulation fidelity; Eye tracking; Performance; Cognitive burden

© 2016 Society for Simulation in Healthcare