
Review Articles

The Effectiveness of Medical Simulation in Teaching Medical Students Critical Care Medicine

A Systematic Review and Meta-Analysis

Beal, Matthew David MClinEd; Kinnear, John MMedEd; Anderson, Caroline Rachael MRes; Martin, Thomas David MRes; Wamboldt, Rachel MBBS; Hooper, Lee PhD

doi: 10.1097/SIH.0000000000000189

INTRODUCTION

There is no common medical school curriculum in acute and emergency care,1 and deficiencies in knowledge are common among medical school graduates completing their residency,2–4 who are often responsible for the early assessment and treatment of patients who are acutely ill.3 A review of training in the care of acutely ill patients found medical school training to be suboptimal and to place patients at risk.3 With the current urgent need to relieve pressure on overworked acute care specialties, improving the training and preparation of residents may go some way to addressing the shortage of skilled staff to treat patients safely.5

At some point in medical education, there is a need to refine skills on live patients. However, this must be carefully balanced against the ethical obligation to provide optimal treatment while protecting patients from harm.6 In critical care, this ethical dilemma is intensified because patients are often sedated or have reduced levels of consciousness, which limits their ability to consent to participating in this kind of education. When trainees do actively participate, the opportunity to correct poor technique is limited,7 because training is often opportunistic, with limited chance to build expertise by repeated practice. These are some reasons why “learning by doing” has become less acceptable.8

There is a growing body of evidence for the use of simulation-based medical education,3 which may go some way to mitigating the ethical tensions that arise from using patients as training tools for clinicians.6 Simulation is the process of recreating characteristics of the real world,9 allowing the trainer to carefully control the learning environment and optimize conditions for the skill being taught. This has led the General Medical Council to now recommend that medical schools should use simulation technology in the education of undergraduate medical students.10

Simulation has been shown to have a positive educational impact in a number of health professional groups,11–17 but its effectiveness for medical students is not clearly defined.18 Reviews have mainly concentrated on education after medical school, and reviews of simulation in medical students have consisted of qualitative narrative syntheses based on nonsystematic identification of the literature.19 The stage of professional development,20 as well as the varying skills being practiced, may influence the effectiveness of the teaching method employed. Cognitive load theory helps to explain how a learner's prior knowledge may affect the efficacy of simulation in medical students compared with higher-level learners: when a learning task is too complex, short-term memory can rapidly become overloaded, which inhibits learning.21

Exposure to simulation during medical school is highly variable, and no studies have investigated an ideal amount of exposure time.18 Simulation is enjoyed by medical students and faculty alike,22,23 suggesting that simulation is effective at Kirkpatrick level 1 (reaction to the learning experience). However, its effectiveness at Kirkpatrick level 2 (knowledge, skill, and attitude acquisition) compared with other teaching methods has been equivocal, with studies reporting no difference, positive effects, or negative effects.22,24,25 This is in contrast to simulation-based medical education in other professional groups and after medical school, which demonstrates moderate to large positive effects.16,26,27

Objectives

The aims of this systematic review and meta-analysis are to assess the effectiveness of simulation for teaching medical students critical care medicine, compared with other teaching methods, and to determine which type of simulation is most effective.

METHODS

The study was undertaken in accordance with a protocol written before the commencement of the review process and published on the PROSPERO database (CRD42013005105).28

Criteria for Selecting Studies for Review

All included studies were randomized controlled trials that assessed the effectiveness of simulation-based teaching compared with other teaching methods, or no teaching, in medical students.

We included studies with teaching interventions directed at critical care, intensive care, anesthetics, emergency medicine, trauma, or prehospital care; studies that used simulation-based teaching interventions, including high- and low-fidelity mannequins, standardized patients, screen-based computer simulators, and human or animal cadavers; studies that used outcomes of knowledge or skill-based performance in the care of a critically ill patient; and studies whose comparator group was a different type of simulation technology, a different type of teaching modality, or no teaching (see Table in Supplemental Digital Content 1, http://links.lww.com/SIH/A294, for the definitions used for the inclusion criteria).

We excluded studies in which participants had already graduated from medical school or were health professionals, studies with nonrandomized designs, studies of nonacute specialties, and studies that used other types of comparator group.

Search Methods for Identification of Studies

Studies were identified by systematically searching AMED, EMBASE, MEDLINE, Education Resources Information Centre (ERIC), British Education Index (BEI), and Australian Education Index (AEI) up to July 2013. The search strategy was designed for high sensitivity over precision, to ensure that no relevant studies were lost. The search broadly covered “medical students,” “simulation,” and “acute specialties” (see Table, Supplemental Digital Content 2, http://links.lww.com/SIH/A295, which demonstrates the full search strategy). The reference lists and indexed citations of all included studies were checked for further relevant studies, and authors of included studies were contacted for unpublished literature.

Abstracts of identified studies were independently screened by reviewers M.D.B. and T.D.M. against eligibility criteria. The full texts of potentially eligible studies were obtained and independently screened in full by 2 reviewers (M.D.B. and T.D.M. or R.E.W.). There were no disagreements at any stage.

Maximal data extraction was carried out independently and in parallel by 2 reviewers (M.D.B. and C.R.A. or R.E.W.) using a piloted standard format, which included methodology, participants, outcome measures, and results. The only deviation from the protocol was the introduction of duplicate data extraction for all included studies, to reduce the risk of reporter bias. Forms were checked for completeness, and discrepancies were resolved by reviewing the original article. All discrepancies involved missing information; none were methodological issues or disagreements in interpretation.

Quality Assessment

Individual study quality was assessed using the Cochrane risk of bias assessment tool,29 which assesses the risk of selection bias during random sequence generation and allocation concealment, performance bias through inadequate blinding of participants and personnel, detection bias through inadequate blinding of outcome assessment, attrition bias through incomplete outcome reporting, and reporting bias through selective outcome reporting. In addition, we considered any other biases that arose, particularly industry funding by manufacturers of simulation equipment. Authors of studies with an unclear risk of bias were contacted for missing information. The quality of evidence for each outcome was assessed using the GRADE framework, which considers 5 key elements: study design, indirectness of evidence, unexplained heterogeneity, imprecision of results, and high probability of publication bias.30

Statistical Analysis

We assessed the effectiveness of simulation using outcomes of either knowledge or clinical performance. All of the included articles used continuous outcomes to measure knowledge or clinical performance, including checklists reported as mean checklist scores. Using Cochrane's Review Manager 5.2,31 these mean scores were converted to standardized mean differences (Hedges' g), which correct for differences in measurement scales by assuming that variability in the standard deviation arises from differences in the continuous scales used to measure the outcome rather than from variability in the population. Where multiple outcomes were assessed in a study, we determined which outcome measure to include in the review using a hierarchy of outcome measures, based on Miller's hierarchy,32 developed by Kinnear, who was blinded to the data (Fig. 1).

FIGURE 1: Hierarchy of outcome measures (1 is most preferred, 11 is least preferred). *Subhierarchy for further content: (1) acute coronary syndrome, (2) stroke, (3) asthma, (4) trauma, (5) in-hospital cardiopulmonary resuscitation (CPR), (6) motorcyclist helmet removal and stiff neck, (7) infant cardiopulmonary resuscitation as first responder, (8) electrocardiogram attachment and interpretation, (9) intraosseous access, and (10) prehospital CPR with automated external defibrillator.
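As an illustration of the conversion described above, the following sketch (not the authors' code; the input values are hypothetical) computes a standardized mean difference with the Hedges' g small-sample correction from each group's mean, standard deviation, and sample size. Review Manager performs an equivalent calculation internally, and exact variance formulas differ slightly between packages.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Hedges' g) between two groups.

    m*, sd*, n* are the mean, standard deviation, and sample size of the
    intervention (1) and control (2) groups. Returns (g, variance of g).
    """
    # Pooled standard deviation across both groups
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp                       # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)      # small-sample correction factor
    g = j * d
    # One common large-sample approximation of the variance of g;
    # software packages use slightly different variants of this formula.
    var_g = j**2 * ((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2 - 2)))
    return g, var_g

# Illustrative (hypothetical) checklist scores for one trial
g, var_g = hedges_g(m1=82.0, sd1=8.0, n1=25, m2=74.0, sd2=9.0, n2=25)
print(f"g = {g:.2f}, SE = {var_g**0.5:.2f}")
```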

Studies with 2 intervention arms had both arms combined to form a single intervention group for the main analysis, using the standard techniques suggested in the Cochrane Handbook (combining the arms into a single sample size, mean, and standard deviation).29 For the purposes of subgroup analysis, both arms were examined independently. Where data were unavailable, standard deviations were imputed from P values, by calculating t values and degrees of freedom to estimate a standard error,29 or from confidence intervals (CIs) using the calculator in Cochrane's Review Manager.31 Paired analysis data from crossover trials were used where there was no evidence of a carry-over effect.29
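The arm-combining and imputation steps described above can be made concrete with a short sketch, assuming the Cochrane Handbook formulae for pooling two arms and a two-sided independent-samples t test when back-calculating a standard deviation from a P value. The numbers are hypothetical, not study data.

```python
import math
from scipy import stats

def combine_arms(n1, m1, sd1, n2, m2, sd2):
    """Combine two intervention arms into one group (Cochrane Handbook
    formulae): returns the pooled sample size, mean, and standard deviation."""
    n = n1 + n2
    m = (n1 * m1 + n2 * m2) / n
    sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
                    + n1 * n2 / n * (m1 - m2)**2) / (n - 1))
    return n, m, sd

def sd_from_p(p, mean_diff, n1, n2):
    """Impute a common per-group standard deviation from a two-sided P value
    of an independent-samples t test comparing two groups."""
    df = n1 + n2 - 2
    t = stats.t.ppf(1 - p / 2, df)           # t value matching the P value
    se = abs(mean_diff) / t                  # standard error of the difference
    return se / math.sqrt(1 / n1 + 1 / n2)   # back to a per-group SD

print(combine_arms(20, 75.0, 10.0, 22, 78.0, 9.0))
print(sd_from_p(p=0.03, mean_diff=5.0, n1=24, n2=24))
```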

For each outcome, we assessed heterogeneity using the I² statistic, where a value of more than 75% indicates considerable inconsistency.29 In the presence of heterogeneity, we obtained pooled effect estimates and 95% CIs by carrying out inverse-variance random-effects meta-analysis (for all analyses) using the DerSimonian and Laird method in Cochrane's Review Manager 5.2.31,33 This type of analysis weights studies by both within-study and between-study variance, which is helpful in the presence of high heterogeneity: outlier studies receive less weight than studies that sit around the mean.
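The pooling itself was done in Review Manager, but for clarity the following minimal sketch shows the DerSimonian and Laird random-effects calculation and the I² statistic, using made-up per-study effects rather than the review's data.

```python
import numpy as np

def dersimonian_laird(y, v):
    """Inverse-variance random-effects meta-analysis (DerSimonian & Laird).

    y: per-study standardized mean differences; v: their variances.
    Returns the pooled SMD, its 95% CI, and the I-squared statistic (%).
    """
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v                               # fixed-effect weights
    ybar = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - ybar) ** 2)           # Cochran's Q
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)             # between-study variance
    w_star = 1.0 / (v + tau2)                 # random-effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2

# Illustrative per-study SMDs and variances (not the review's data)
smd, ci, i2 = dersimonian_laird([0.4, 1.1, 0.2, 1.8], [0.05, 0.08, 0.06, 0.10])
print(f"pooled SMD = {smd:.2f}, 95% CI = {ci[0]:.2f} to {ci[1]:.2f}, I² = {i2:.0f}%")
```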

We carried out subgroup analyses to investigate the effects of time to outcome assessment, type of outcome assessment, type of simulation, type of control group, duration of simulation, and year of study. Sensitivity analyses were carried out to examine the effect of exclusion of outliers, high risk of bias, industry funding, imputed standard deviations, and crossover trials. Publication bias was examined using funnel plots.
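Figure 5 presents the review's funnel plot; the sketch below merely illustrates how such a plot (SMD against its standard error, with the vertical axis inverted so small studies sit toward the bottom) can be drawn, using invented study effects rather than the review's data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative study-level effects (SMD) and standard errors, not the review's data
smd = np.array([0.3, 0.9, 1.4, 0.6, 1.8, 0.1, 1.1])
se = np.array([0.22, 0.30, 0.35, 0.25, 0.40, 0.20, 0.28])

pooled = np.average(smd, weights=1 / se**2)   # simple fixed-effect centre line

fig, ax = plt.subplots()
ax.scatter(smd, se)
ax.axvline(pooled, linestyle="--")
# Pseudo 95% confidence limits forming the funnel
se_grid = np.linspace(0.001, se.max() * 1.1, 100)
ax.plot(pooled - 1.96 * se_grid, se_grid, linestyle=":")
ax.plot(pooled + 1.96 * se_grid, se_grid, linestyle=":")
ax.invert_yaxis()                             # small studies (large SE) at the bottom
ax.set_xlabel("Standardized mean difference")
ax.set_ylabel("Standard error of the SMD")
plt.show()
```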

Results were expressed as standardized mean differences, with 95% CIs and P values, and as percentile changes derived from Z scores, which indicate the percentile of the control group in which the average student in the simulation group would fall.
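The review does not spell out the exact percentile calculation, but the reported gains are consistent with applying the standard normal cumulative distribution function to the pooled Z statistic (for example, Z = 1.12 gives a gain of about 36.9 percentiles). The sketch below shows that calculation under this assumption.

```python
from scipy.stats import norm

def percentile_gain(z):
    """Percentile change implied by a Z score: the area of the standard
    normal distribution between the mean (50th percentile) and Z, x100."""
    return (norm.cdf(z) - 0.5) * 100

# Pooled comparisons reported in the review
for label, z in [("simulation vs. other teaching", 4.02),
                 ("simulation vs. no teaching", 1.12)]:
    print(f"{label}: gain of {percentile_gain(z):.1f} percentiles")
```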

RESULTS

From the electronic searches, we screened 356 abstracts, of which 326 were clearly not relevant, identifying 30 potentially eligible articles. The abstracts of a further 482 references and 437 citations were also screened, identifying a further 14 potentially eligible articles. A total of 44 potentially eligible articles were retrieved in full text and assessed in duplicate for inclusion in the review (Fig. 2). Two ongoing studies were also identified (see Table, Supplemental Digital Content 3, http://links.lww.com/SIH/A296, for the characteristics of excluded and ongoing studies).

FIGURE 2: Study flow diagram. RCT indicates randomized controlled trial; CCM, critical care medicine.

Description of Studies

Twenty-two articles were included in the review (Table 1), including 1325 medical students in their second year of study or higher, mainly studying at European or North American medical schools. The number of participants in each study ranged from 28 to 144, with a median of 45. Fifteen studies examined high-fidelity simulators, 5 examined low-fidelity simulators, 2 examined standardized patients, 3 examined screen-based computer simulators, and 1 examined a voice advisory mannequin. Eight studies used self-directed learning techniques in their control group [problem-based learning (PBL), case-based discussion, and self-study], 6 used didactic teaching methods (lecture, video, and seminar), 1 used clinical shadowing, and 2 used no teaching. For the purposes of the meta-analysis, studies that gave students material to cover in self-study were categorized as having a comparator teaching intervention because this was guided study. Three studies used low-fidelity simulation in their control group, thereby comparing 2 types of simulation. The median duration of intervention sessions was 2 hours (range, 5 minutes to 3 days). The numbers of studies do not consistently sum to 22 because several studies compared more than one type of simulation, used more than one outcome measure, or did not have a nonsimulation comparator group.

TABLE 1: Characteristics of Included Studies

Eleven studies used knowledge-based assessments (Kirkpatrick level 2a), including multiple choice questions (MCQs), short answer questions (SAQs), and single best answers (SBAs), whereas 15 studies used skill- and performance-based outcome measures (Kirkpatrick level 2b), including objective structured clinical examination (OSCE) scores, simulation checklists, and time to action. Eleven studies also used evaluative questionnaires to assess aspects of satisfaction, and 3 studies used self-efficacy questionnaires to assess participants' confidence (Kirkpatrick level 1). A total of 19 studies assessed students within 1 week of the intervention, and only 3 studies followed participants up after 3 months. No studies assessed transfer of learning to clinical practice (Kirkpatrick level 3) or benefit to patients (Kirkpatrick level 4).

The overall risk of bias was high in 7 included studies and unclear in the remaining 15 (Fig. 3). Most studies inadequately reported key risk of bias criteria, making it difficult to judge study quality precisely. We were particularly concerned by studies that did not describe blinding of outcome assessors or randomization procedures (see Figure in Supplemental Digital Content 4, http://links.lww.com/SIH/A297, for the full risk of bias assessment for each study).

FIGURE 3: Risk of bias graph. Review authors' judgments across all included studies.

Is Simulation Effective Compared With Other Teaching Methods?

A total of 17 studies compared simulation with other teaching modalities (Fig. 4), reporting knowledge- or skill-based performance measures after the teaching session. However, 1 study reported only median data,37 and in 1 study, participant numbers were unclear,45 so the 15 remaining studies (1000 participants) were included in the analyses. When data were pooled, simulation was significantly more effective than other teaching methods, with an effect size of 0.84 (95% CI = 0.43 to 1.24; P ≤ 0.001; Z = 4.02; I² = 89%), corresponding to a gain of 49.9 percentiles.

FIGURE 4: The effectiveness of simulation on performance or knowledge scores in medical students (higher scores represent better performance or knowledge).

The study that reported only medians showed no evidence that simulation was more effective than other teaching methods [medians, 37 vs. 38 (scale, 0–50); P = 0.263].37 The study in which participant numbers were unclear also showed no evidence that simulation was more effective than other teaching methods [SMD = −0.13 (95% CI = −0.72 to 0.47); P = 0.47].45

Sensitivity analyses excluding studies that were at high risk of bias, had imputed standard deviations, were industry funded, or were of crossover design retained a statistically significant effect. All studies were small and therefore clustered on the funnel plot, making it ineffective for assessing small-study bias (Fig. 5). Despite this, there was some suggestion of asymmetry, because studies at high risk of bias were generally smaller and had more positive effect sizes. While this suggests that small studies with less positive effects may not have appeared in the literature, sensitivity analysis removing the high-risk studies retained a statistically significant effect (Table 2) and resulted in a symmetrical funnel plot. This suggests that, even if it exists, small-study or publication bias has little influence on our overall effect estimates. We carried out subgroup analyses (Table 2) by time to outcome assessment (see Figure in Supplemental Digital Content 5, http://links.lww.com/SIH/A298, which shows that simulation was more effective when assessed at <72 hours), type of outcome assessment (see Figure in Supplemental Digital Content 6, http://links.lww.com/SIH/A299, which shows that simulation was effective for performance-based outcomes but showed no evidence of effect for knowledge-based outcomes), type of simulation (Fig. 6), duration of simulation (see Figure in Supplemental Digital Content 7, http://links.lww.com/SIH/A300, which shows that simulation was more effective when used for more than 8 hours), year of study (see Figure in Supplemental Digital Content 8, http://links.lww.com/SIH/A301, which shows that simulation was effective beyond year 4 of medical school but showed no evidence of effect before this), and type of control group (see Figure in Supplemental Digital Content 9, http://links.lww.com/SIH/A302, which shows that simulation was more effective than dependent and independent teaching techniques but showed no significant effect compared with self-study). Subgrouping did not explain the heterogeneity (Table 2).

FIGURE 5: Funnel plot assessing risk of publication bias (SMD vs. standard error of the SMD).
TABLE 2: Subgrouping and Sensitivity Analyses
FIGURE 6: The effectiveness of different types of simulation: direct and indirect analyses [right-hand side of forest plot (1) and left-hand side of forest plot (2)].

Two studies (78 participants) compared simulation with no teaching. The effect size of 3.41 (95% CI = −2.57 to 9.40; P = 0.26; Z = 1.12) was not statistically significant, corresponded to a gain of 36.9 percentiles, and showed considerable heterogeneity (I² = 98%; see Figure in Supplemental Digital Content 10, http://links.lww.com/SIH/A303, which demonstrates no evidence of effect compared with no teaching).

Which Type of Simulation Is Most Effective?

We examined studies that directly compared different types of simulation teaching (Fig. 6). Three studies (173 participants) compared high-fidelity simulation with low-fidelity simulation. One of these studies did not present mean data; the 2 studies (130 participants) included in the meta-analysis favored high-fidelity simulation, with an effect size of 1.00 (95% CI = 0.63 to 1.37; P < 0.001).38,43 The remaining study (43 participants) favored low-fidelity simulation over high-fidelity simulation, with medians (interquartile ranges) of 29 (29–30) and 26 (25–28), respectively (P = 0.03).42 One study (48 participants) compared high-fidelity simulation with standardized patients and found no evidence of a difference, with an effect size of 0.43 (95% CI = −0.14 to 1.01; P ≥ 0.05).35 One study (28 participants) compared low-fidelity simulation with screen-based computer simulators and found no evidence of a difference, with an effect size of −0.11 (95% CI = −0.85 to 0.63; P = 0.77).36

Comparisons were also made between types of simulation by subgrouping studies that compared each type of simulation with other teaching methods. Twelve studies (797 participants) used high-fidelity patient simulators [effect size, 0.90 (95% CI = 0.48 to 1.31; P < 0.001; Z = 4.25; I² = 86%)], corresponding to a gain of 50.0 percentiles. Two studies (121 participants) used screen-based computer simulators, with no evidence of an effect [effect size, −0.07 (95% CI = −1.17 to 1.04; P = 0.91; Z = 0.12)]. Two studies (87 participants) used low-fidelity simulators, with no evidence of an effect [effect size, 1.39 (95% CI = −0.95 to 3.74; P = 0.24; Z = 1.17)]. One study (46 participants) used standardized patients [effect size, 1.94 (95% CI = 1.23 to 2.65; P < 0.001; Z = 5.34)]. The results of both the direct and the indirect subgrouped comparisons were resistant to sensitivity analyses excluding studies with high risk of bias, industry funding, imputed standard deviations, or crossover design.

According to the GRADE criteria (Table 3), the quality of the evidence for simulation against other teaching methods was moderate. The GRADE assessment was downgraded twice to account for the unclear risk of bias across all studies and the unexplained inconsistency indicated by statistically significant heterogeneity. However, the GRADE assessment was upgraded once for the large and practically important effect size that was resilient to sensitivity analysis.

TABLE 3: GRADE Evidence Profile

DISCUSSION

Our review suggests that simulation-based medical education is more effective than other teaching methods for teaching critical care medicine to medical students. The effect size is large (0.84) according to Cohen,54 who categorizes effects of less than 0.2 as small, 0.2 to 0.8 as moderate, and greater than 0.8 as large. However, this interpretation should be used with caution,55 because in education even small effect sizes have been shown to be important in policy decision making.56 Using Z scores to calculate the percentile change, we observed an increase of 49.8 percentiles in the simulation group compared with the other teaching groups, meaning that the average student in the simulation group would be in the 99.8th percentile of the control group. Considering a median simulation duration of just 2 hours, we considered this a large and practically important effect.

This review builds on a growing body of evidence across a range of healthcare professions. A systematic review by Cook et al13 demonstrated that simulation is effective in postgraduate nurses for knowledge and skill acquisition, with an effect size of 1.20 (95% CI = 1.04 to 1.35) and 1.09 (95% CI = 1.03 to 1.16), respectively. A systematic review by Yuan et al20 also demonstrated that simulation is effective in other health professionals for knowledge and skill acquisition, with an effect size of 0.53 (95% CI = 0.16 to 0.90; P = 0.006) and 1.15 (95% CI = 0.78 to 1.52; P < 0.001), respectively. A systematic review by McGaghie et al16 found that simulation is effective for clinical skill acquisition across a range of medical seniorities, with an effect size of 0.71 (95% CI = 0.65 to 0.76; P < 0.001). A systematic review by Lorello et al17 found simulation to be more effective in anesthesiology training across a number of seniorities, with an effect size range of 0.60 to 1.05. The largest systematic review by Ilgen et al,26 which incorporated a range of professions and stages of development, found no evidence of an effect for simulation compared with other teaching modalities for knowledge and skills, with an effect size of 0.26 (95% CI = −0.08 to 0.60; P = 0.14) and 0.19 (95% CI = −0.10 to 1.23; P = 0.21), respectively. Although there is some evidence of inconsistency among the existing systematic reviews, this is unsurprising given the differences between participants. Our pooled effect estimates are statistically consistent with these other studies and demonstrate similar effect sizes to those of Cook et al,12,13 Yuan et al,20 and McGaghie et al.16 This is the first systematic review to describe the effectiveness of simulation for teaching critical care medicine in the medical school setting.

It is perhaps unsurprising that simulation is more effective than other teaching modalities in improving performance-related outcomes (Kirkpatrick level 2b), because it is a performance-based method of learning. However, despite adequate power (>0.99), we found no evidence that simulation was more effective than other teaching modalities in preparing students for knowledge-based assessments (Kirkpatrick level 2a). This is an important finding because simulation-based teaching is a resource- and faculty-intensive educational technique with significant cost implications. Maximizing its cost-benefit impact will require defining the optimal context in which simulation should be used, and this study helps to define that position. The finding contrasts with reviews in other trainee groups, suggesting that the type of knowledge or skills gained relates to the level of expertise of the learner.13,20 Cognitive load theory57 and the challenge point framework58 provide conceptual frameworks to help explain how simulation may affect learning differently depending on the learner's previous level of knowledge. It is therefore important to separate undergraduate from postgraduate learner cohorts when defining the effectiveness of different learning methods and different types of simulation within medical education.

Our finding may go some way to supporting the theory that simulation promotes the transition of knowledge ("knows") into reasoned action ("does"),24,44 which would help explain why we were unable to demonstrate any effect on knowledge-based outcomes. This supports the view that simulation is best used as an adjunct to other teaching methods in the undergraduate curriculum rather than as a stand-alone method. We would postulate that simulation is best placed alongside PBL and didactic teaching methods in integrated curricula, or following them in traditional domain-centered curricula.

This study also demonstrated that high-fidelity simulation and the use of standardized patients were more effective than other teaching modalities, but there was no evidence of effectiveness for low-fidelity simulation or screen-based computer simulation compared with other teaching modalities. We found that in direct comparisons, high-fidelity simulation was more effective than low-fidelity simulation, in contrast to a number of studies in other groups that showed no difference in efficacy.59 This finding was not well explained by duration of simulation exposure and is difficult to interpret because the term "fidelity" is not used consistently by all researchers and may variously refer to environmental, functional, or psychological fidelity.60

Although we demonstrated that simulation was more effective than lectures, problem-based learning, and other similar techniques when pooled, we could find no evidence of a difference when comparing simulation with independent study or no teaching. This finding is counterintuitive, in that if simulation is effective compared with other teaching methods, it would be expected to be more effective than no teaching. This analysis was, however, limited to only 2 studies, which had significantly heterogeneous results, so this result may be due to an outlier study. The study by Ali et al34 showed a large and significant effect comparing simulation with no teaching. The study by Hansel et al41 showed no evidence of effect comparing simulation with no teaching, and they postulated that the scenarios they used may have been too complex for their participant group, further supporting the view that simulation may not always be effective in this learner group.

The evidence supports the use of simulation for teaching critical care medicine to medical students. However, this review has been unable to address differences between types of simulation technology, the effect of the duration or frequency of simulation teaching (the "dose" of simulation), the optimal timing by year of study, or the retention of skills after simulation. Further work is also needed to establish the cost-effectiveness of simulation-based teaching, because equipment and operational costs are high.61

Limitations

Despite a thorough literature search using prespecified criteria and a protocol designed according to methods specified in the Cochrane Handbook,29 there are limitations to the study. Reviewers were nonblinded throughout the study, which may have biased coding and interpretation of data. However, we felt that this was unlikely given the high levels of agreement.

Most of the studies that used skill- or behavior-based outcome measures during simulated patient scenarios used the same simulators for assessment as for teaching. This may be an important source of bias because the simulation group has the advantage of being assessed on the same simulator used for training. All but 1 study carried out at least 1 orientation session on the simulator for all intervention groups. Our effect size was resilient to the removal of these studies from the meta-analysis and maintained statistical significance.

Another issue was that many requests for further information from authors of the included studies went unanswered, which meant that analysis was limited for a large number of studies. This forced us to include studies with an unclear risk of bias in the meta-analysis, when these studies may have been more appropriately rated as having low or high risk of bias with the additional information. Studies assessed to be at high risk of bias generally had larger and more positive effect sizes24,44,46,48 than those with low risk of bias. However, most of the included studies favored simulation, and our effect size was resistant to removal of these studies from the meta-analysis.

Furthermore, we identified significant heterogeneity that we were unable to explain through subgroup and sensitivity analyses, suggesting that the results are limited by the quantity and quality of the original articles identified. Despite high correlation in measured effects among studies, the responsiveness of the different outcome measures used may vary substantially between studies, which may result in significant heterogeneity and therefore a biased meta-analysis. Inconsistency is a common problem in quantitative educational research, which has led some to argue that qualitative methods are better suited to this domain.62 The use of standardized and validated outcome measures would go some way to preventing this inconsistency in future research. Despite the inconsistency in effect size, most included studies favored simulation, with only a small number favoring the control interventions.

CONCLUSIONS

This systematic review and meta-analysis provides moderate evidence that simulation is effective for teaching critical care medicine to medical students, yielding large favorable benefits over other teaching methods despite relatively short simulation sessions. High-fidelity simulation seems to be more effective than low-fidelity simulation. Simulation was particularly effective in preparing students for clinical performance-based assessments but not for knowledge-based assessments. However, whether this translates into improved performance in the authentic clinical setting is unproven.

This review is important for medical educators who are responsible for teaching acute care clinical skills to medical students and are faced with a panoply of educational techniques on the one hand and a finite budget on the other. The findings also support an educational method that may go some way to mitigating the ethical tensions that arise through teaching critical care medicine to undergraduates. Further high-quality research is needed to determine the best way to integrate simulation into undergraduate curricula and should also address the broader questions of when, how, and why simulation works.

ACKNOWLEDGMENTS

The authors thank Dr. Lesley Bowker and Professor Sam Leinster for commenting on drafts and providing valuable feedback and discussion of our findings and conclusions, and Dr. Jan Wong for her help in screening a non-English full text on our behalf.

The authors also thank the following authors of both included and excluded studies for their responses to our requests for unpublished literature and for further information: Adel Bassily-Marcus, Mount Sinai School of Medicine; Peyman Benharash and Paul Frank, UCLA; Ester Coolen, Radboud University; Rosemarie Fernandez, University of Washington School of Medicine; Mike Gilbart, University of British Columbia; James Gordon, Harvard Medical School; Bruce Lo, Eastern Virginia Medical School; Pamela Morgan, University of Toronto; Gavin Perkins, Warwick Medical School; and Raymond Ten-Eyck, Boonshoft School of Medicine.

REFERENCES

1. Shen J, Joynt GM, Critchley LA, Tan IK, Lee A. Survey of current status of intensive care teaching in English-speaking medical schools. Crit Care Med 2003;31:293–298.
2. Smith GB, Poplett N. Knowledge of aspects of acute care in trainee doctors. Postgrad Med J 2002;78:335–338.
3. Smith CM, Perkins GD, Bullock I, Bion JF. Undergraduate training in the care of the acutely ill patient: a literature review. Intensive Care Med 2007;33:901–907.
4. Jensen ML, Hesselfeldt R, Rasmussen MB, et al. Newly graduated doctors' competence in managing cardiopulmonary arrests assessed using a standardized Advanced Life Support (ALS) assessment. Resuscitation 2008;77:63–68.
5. O'Dowd A. Locums make up a fifth of doctors in emergency units at weekends. BMJ 2013;346:f1065.
6. Ziv A, Wolpe PR, Small SD, Glick S. Simulation-based medical education: an ethical imperative. Acad Med 2003;78:783–788.
7. Phillips PS, Nolan JP. Training in basic and advanced life support in UK medical schools: questionnaire survey. BMJ 2001;323:22–23.
8. Vozenilek J, Huff JS, Reznek M, Gordon JA. See one, do one, teach one: advanced technology in medical education. Acad Emerg Med 2004;11:1149–1154.
9. Beaubien JM, Baker DP. The use of simulation for training teamwork skills in health care: how low can you go? Qual Saf Health Care 2004;13:i51–i56.
10. GMC. Tomorrow's Doctors. London: General Medical Council; 2009.
11. Issenberg SB, Mcgaghie WC, Petrusa ER, Lee Gordon D, Scalese RJ. Features and uses of high-fidelity medical simulations that lead to effective learning: a BEME systematic review. Med Teach 2005;27:10–28.
12. Cook DA, Erwin PJ, Triola MM. Computerized virtual patients in health professions education: a systematic review and meta-analysis. Acad Med 2010;85:1589–1602.
13. Cook DA, Hatala R, Brydges R, et al. Technology-enhanced simulation for health professions education: a systematic review and meta-analysis. JAMA 2011;306:978–988.
14. Cooper S, Cant R, Porter J, et al. Simulation based learning in midwifery education: a systematic review. Women Birth 2012;25:64–78.
15. Lynagh M, Burton R, Sanson-Fisher R. A systematic review of medical skills laboratory training: where to from here? Med Educ 2007;41:879–887.
16. McGaghie WC, Issenberg SB, Cohen ER, Barsuk JH, Wayne DB. Does simulation-based medical education with deliberate practice yield better results than traditional clinical education? A meta-analytic comparative review of the evidence. Acad Med 2011;86:706–711.
17. Lorello GR, Cook DA, Johnson RL, Brydges R. Simulation-based training in anaesthesiology: a systematic review and meta-analysis. Br J Anaesth 2014;112:231–245.
18. Heitz C, Eyck RT, Smith M, Fitch M, Ander D. Simulation in medical student education: survey of clerkship directors in emergency medicine. West J Emerg Med 2011;12:455–460.
19. Chakravarthy B, Ter Haar E, Bhat SS, McCoy CE, Denmark TK, Lotfipour S. Simulation in medical school education: review for emergency medicine. West J Emerg Med 2011;12:461–466.
20. Yuan HB, Williams BA, Fang JB, Ye QH. A systematic review of selected evidence on improving knowledge and skills through high-fidelity simulation. Nurse Educ Today 2012;32:294–298.
21. Sweller J. Cognitive load theory, learning difficulty, and instructional design. Learn Instr 1994;4:295–312.
22. Morgan PJ, Cleave-Hogg D, Desousa S, Lam-Mcculloch J. Applying theory to practice in undergraduate education using high fidelity simulation. Med Teach 2006;28:e10–e15.
23. Morgan P, Cleave-Hogg D. A Canadian simulation experience: faculty and student opinions of a performance evaluation study. Br J Anaesth 2000;85:779–781.
24. Gordon JA, Shaffer DW, Raemer DB, Pawlowski J, Hurford W, Cooper J. A randomized controlled trial of simulation-based teaching versus traditional instruction in medicine: a pilot study among clinical medical students. Adv Health Sci Educ Theory Pract 2006;11:33–39.
25. Kim JH, Kim WO, Min KT, Yang JY, Nam YT. Learning by computer simulation does not lead to better test performance than textbook study in the diagnosis and treatment of dysrhythmias. J Clin Anesth 2002;14:395–400.
26. Ilgen JS, Sherbino J, Cook DA. Technology-enhanced simulation in emergency medicine: a systematic review and meta-analysis. Acad Emerg Med 2013;20:117–127.
27. Cook DA, Brydges R, Hamstra SJ, et al. Comparative effectiveness of technology-enhanced simulation versus other instructional methods: a systematic review and meta-analysis. Simul Healthc 2012;7:308–320.
28. Beal M, Hooper L. The effectiveness of medical simulation in teaching medical students critical care medicine: a protocol for a systematic review. PROSPERO: International prospective register of systematic reviews. 2013. CRD42013005105. Available from http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42013005105. Accessed August 17, 2016.
29. Higgins JPT, Green S. eds. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0: [updated March 2011]. The Cochrane Collaboration, 2011. Available from http://www.handbook.cochrane.org. Accessed August 17, 2016.
30. Atkins D, Best D, Briss PA, et al; GRADE Working Group. Grading quality of evidence and strength of recommendations. BMJ 2004;328:1490.
31. The Nordic Cochrane Centre. Review Manager (RevMan). Copenhagen: The Cochrane Collaboration; 2012.
32. Miller GE. The assessment of clinical skills/competence/performance. Acad Med 1990;65:S63–S67.
33. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177–188.
34. Ali J, Adam RU, Sammy I, Ali E, Williams JI. The simulated trauma patient teaching module-does it improve student performance? J Trauma 2007;62:1416–1420.
35. Ali J, Al Ahmadi K, Williams JI, Cherry RA. The standardized live patient and mechanical patient models—their roles in trauma teaching. J Trauma 2009;66:98–102.
36. Bonnetain E, Boucheix JM, Hamet M, Freysz M. Benefits of computer screen-based simulation in learning cardiac arrest procedures. Med Educ 2010;44:716–722.
37. Cavaleiro AP, Guimarães H, Calheiros F. Training neonatal skills with simulators? Acta Paediatr 2009;98:636–639.
38. Coolen EH, Draaisma JM, Hogeveen M, Antonius TA, Lommen CM, Loeffen JL. Effectiveness of high fidelity video-assisted real-time simulation: a comparison of three training methods for acute pediatric emergencies. Int J Pediatr 2012;2012:709569.
39. Curran VR, Aziz K, O'Young S, Bessell C. Evaluation of the effect of a computerized training simulator (ANAKIN) on the retention of neonatal resuscitation skills. Teach Learn Med 2004;16:157–164.
40. Gilbart MK, Hutchison CR, Cusimano MD, Regehr G. A computer-based trauma simulator for teaching trauma management skills. Am J Surg 2000;179:223–228.
41. Hansel M, Winkelmann AM, Hardt F, et al. Impact of simulator training and crew resource management training on final-year medical students' performance in sepsis resuscitation: a randomized trial. Minerva Anestesiol 2012;78:901–909.
42. Isbye DL, Høiby P, Rasmussen MB, et al. Voice advisory manikin versus instructor facilitated training in cardiopulmonary resuscitation. Resuscitation 2008;79:73–81.
43. Lo BM, Devine AS, Evans DP, et al. Comparison of traditional versus high-fidelity simulation in the retention of ACLS knowledge. Resuscitation 2011;82:1440–1443.
44. McCoy CE, Menchine M, Anderson C, Kollen R, Langdorf MI, Lotfipour S. Prospective randomized crossover study of simulation vs. didactics for teaching medical students the assessment and management of critically ill patients. J Emerg Med 2011;40:448–455.
45. Morgan PJ, Cleave-Hogg D. Comparison between medical students' experience, confidence and competence. Med Educ 2002;36:534–539.
46. Ruesseler M, Weinlich M, Müller MP, Byhahn C, Marzi I, Walcher F. Simulation training improves ability to manage medical emergencies. Emerg Med J 2010;27:734–738.
47. Schwartz LR, Fernandez R, Kouyoumjian SR, Jones KA, Compton S. A randomized comparison trial of case-based learning versus human patient simulation in medical student education. Acad Emerg Med 2007;14:130–137.
48. Steadman RH, Coates WC, Huang YM, et al. Simulation-based training is superior to problem-based learning for the acquisition of critical assessment and management skills. Crit Care Med 2006;34:151–157.
49. Tan GM, Ti LK, Tan K, Lee T. A comparison of screen-based simulation and conventional lectures for undergraduate teaching of crisis management. Anaesth Intensive Care 2008;36:565–569.
50. Ten Eyck RP, Tews M, Ballester JM. Improved medical student satisfaction and test performance with a simulation-based emergency medicine curriculum: a randomized controlled trial. Ann Emerg Med 2009;54:684–691.
51. Ten Eyck RP, Tews M, Ballester JM, Hamilton GC. Improved fourth-year medical student clinical decision-making performance as a resuscitation team leader after a simulation-based curriculum. Simul Healthc 2010;5:139–145.
52. Wenk M, Waurick R, Schotes D, et al. Simulation-based medical education is no better than problem-based discussions and induces misjudgment in self-assessment. Adv Health Sci Educ Theory Pract 2009;14:159–171.
53. Yang LY, Yang EJ, Ying LL, et al. The use of human patient simulator in enhancing medical students' understanding of crisis recognition and resuscitation. Int Med J 2010;17:209–211.
54. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale, NJ: Erlbaum; 1988.
55. Durlak JA. How to select, calculate, and interpret effect sizes. J Pediatr Psychol 2009;34:917–928.
56. Hedges LV, Hedberg EC. Intraclass correlation values for planning group-randomized trials in education. Educ Eval Policy Anal 2007;29:60–87.
57. Van Merriënboer JJ, Sweller J. Cognitive load theory in health professional education: design principles and strategies. Med Educ 2010;44:85–93.
58. Guadagnoli M, Morin M-P, Dubrowski A. The application of the challenge point framework in medical education. Med Educ 2012;46:447–453.
59. Norman G, Dore K, Grierson L. The minimal relationship between simulation fidelity and transfer of learning. Med Educ 2012;46:636–647.
60. Maran NJ, Glavin RJ. Low- to high-fidelity simulation - a continuum of medical education? Med Educ 2003;37:22–28.
61. Morgan PJ, Cleave-Hogg DM. Cost and resource implications of undergraduate simulator-based education. Can J Anaesth 2001;48:827–828.
62. Hoepfl MC. Choosing qualitative research: a primer for technology education researchers. J Tech Educ 1997;9:47–63.

APPENDIX: PRISMA CHECKLIST

Keywords:

Education; Medical; Medical education; Clinical education; Medical students; Student doctors; Trainee doctors; Medical school; Undergraduate doctors; Undergraduate medical education; Undergraduate education; Simulation; Simulator; Medical simulation; Patient simulation; Meta-analysis; Review; Systematic review; Critical care; Intensive care; Life support; Advanced Life Support; Resuscitation


© 2017 Society for Simulation in Healthcare