
Review Articles

Efficacy of Serious Games in Healthcare Professions Education

A Systematic Review and Meta-analysis

Maheu-Cadotte, Marc-André RN, PhD(c); Cossette, Sylvie RN, PhD; Dubé, Véronique RN, PhD; Fontaine, Guillaume RN, MSN, PhD(c); Lavallée, Andréane RN, PhD(c); Lavoie, Patrick RN, PhD; Mailhot, Tanya RN, PhD; Deschênes, Marie-France RN, MSN, PhD(c)

Simulation in Healthcare: The Journal of the Society for Simulation in Healthcare: June 2021 - Volume 16 - Issue 3 - p 199-212
doi: 10.1097/SIH.0000000000000512

INTRODUCTION

Serious games (SGs) are interactive and entertaining software designed primarily with an educational purpose. Popularized in the early 2000s, SGs quickly became integrated into healthcare professions education, as their entertainment value showed the potential to engage learners and support their learning process.1 Serious games are thought to fulfill the needs of adult learners, such as autonomy, control, and a sense of achievement.2,3 Moreover, authors report that learners are receptive to the visual and interactive aspects of SGs, traditionally associated with video games.4 Thus, their use by healthcare educators is expected to rise in both initial and continuing education.5

Learning in SGs typically occurs through a gameplay experience that combines challenges with various playful design elements, which can be seen as features or building blocks of SGs (eg, scoring system, content unlocking, integration of a storyline).6 Challenges allow learners to be actively involved in a decision-making process for which they can receive immediate feedback and see the results of their decisions.7 For example, an SG can challenge learners to provide the correct management plan for a virtual patient. Points can be awarded, and learners can unlock a new game level if they provide the correct management plan. As such, SGs are often associated with a constructivist learning perspective where the learning progression is fueled by an interaction cycle between learners and the SG and where the reception of feedback allows learners to reflect on new or better ways to take on a challenge.6 Thus, one of the main objectives in designing SGs is to ensure that they support learners' engagement to take on the various challenges that are expected to lead to significant learning outcomes.8

Engagement can be defined as a bidimensional concept: a behavioral dimension (ie, the extent of the learners' involvement while taking on the challenges; eg, the total amount of time invested by learners in the SG) and an experiential dimension (ie, the subjective experience of the learners while taking on the challenges; eg, learners' affect while using the SG).9 Properly integrating the challenges with the playful design elements during SG design can help ensure that learners remain engaged with the challenges they take on.10

Systematic reviews on the use of SGs in the healthcare professions report that their efficacy in supporting engagement and improving learning outcomes varies greatly.1,11–15 However, the reasons why SGs produce heterogeneous results have been left unexplored. Heterogeneity is often the product of combining studies with different research designs, populations, intervention designs, comparators, or outcomes.16 Previous reviews of SGs combined the findings of quasi-experimental studies with experimental ones, which could have biased the reported results through the lack of a control group or randomization.12,13 Other reviews combined studies evaluating SGs with ones evaluating gamification interventions (ie, the application of gaming elements to nongaming contexts) or commercial off-the-shelf games (ie, games designed primarily for entertainment but used for an educational purpose), which could have induced heterogeneity in the reported results because of the different design of each of these interventions.14,15

Thus, in this systematic review, we focused on identifying, appraising, and synthesizing the results of experimental studies evaluating the efficacy of SGs on engagement and learning outcomes in healthcare education. Because the development of an SG can be expensive and time consuming,14 findings from this review will provide guidance to educators regarding the design and adoption of SGs and to researchers in the conduct of future work.

METHODS

Protocol and Registration

This systematic review was based on the Cochrane Handbook for Systematic Reviews of Interventions.16 We report this systematic review according to the Preferred Reporting Items for Systematic review and Meta-Analysis (PRISMA) standards.17 We prospectively registered (#CRD42017077424) and published the detailed review protocol.18

Eligibility Criteria

We included randomized controlled trials (RCTs), cluster RCTs, and crossover RCTs published in English or in French from January 1, 2005, to April 24, 2019. An SG had to be assessed, as a standalone intervention or as part of a multicomponent intervention, among healthcare professionals or students, from any level of education, either in an initial or a continuing education setting. For the purpose of this review, we defined SGs as interactive and entertaining software with a primary educational purpose that engage learners through challenges.19–22 All types of comparator interventions were considered for inclusion. Studies had to report at least 1 measure of a learning outcome or 1 measure of engagement, whether behavioral (ie, the duration of the educational intervention usage) or experiential (ie, self-reported measures of learners' experience in using the educational intervention). Learning outcomes were defined according to Kirkpatrick's model.23 We considered all short- and long-term measures of knowledge acquisition, skill development (subdivided into confidence in skills, cognitive skills, and procedural skills), attitude and behavior change, as well as clinical outcomes in healthcare system users.

Information Sources and Search

A librarian searched 6 bibliographical databases using key words and MeSH terms related to the following: SGs [eg, serious game(s), game-based learning/training, applied game(s)], healthcare professionals or students [eg, physician(s), clinician(s), trainee(s)], and effects on engagement and learning outcomes (eg, efficacy, skills development, knowledge acquisition). These bibliographical databases were as follows: Cumulative Index of Nursing and Allied Health (EBSCO), EMBASE (OVID), ERIC (ProQuest), PsycINFO (APA PsycNET), PubMed (NCBI), and Web of Science – SCI and SSCI (ISI – Thomson Scientific). We performed an initial search in these databases on December 13, 2017, and we updated our search on April 24, 2019 (see Text, Supplementary Digital Content 1, https://links.lww.com/SIH/A572, where all search strategies are reported). To find additional articles, hand searching was performed in scientific journals specialized in SGs (Games for Health Journal, Games, G|A|M|E The Italian Journal of Game Studies, International Journal of Computer Games Technology, International Journal of Serious Games, and JMIR Serious Games), in previous systematic reviews,13,24 and in the reference lists of identified studies.

Identified references were imported and managed in EndNote (Version X8; Clarivate Analytics). We screened all references independently and in pairs, and all disagreements were resolved through discussion with a third author.

Data Extraction Process

We performed the data extraction process by using the Effective Practice and Organization of Care template.25 The extraction form was piloted by all review authors involved in this step using a single article. Authors then met to discuss any issues encountered while using the form. As no significant disparity was found between forms during the piloting phase, one author performed the initial data extraction and another validated it.

Data Items

For descriptive purposes, we extracted the following items: study aim; study design; population; attrition rate; name of SGs evaluated; theoretical framework used for the SGs development; cost and duration of the SGs development; clinical topics addressed; methods of delivery of the comparator intervention (ie, classroom learning, written material, e-learning, another SG, simulation/virtual simulation); duration and frequency of use of the interventions; unit of measurement; time points measured; instruments; and validity and reliability of the instruments.

For quantitative synthesis purposes, we extracted the following items: sample size; outcome data; and risk of bias data.

Assessment of Risk of Bias

Two authors independently assessed the risk of bias of each included study using the Cochrane Collaboration's tool for assessing risk of bias,26 and all disagreements were resolved with the help of a third author. A high risk of bias diminishes the reliability of the study results. The following aspects are considered during assessment: random sequence generation, allocation concealment, measurement of study group characteristics and baseline outcomes, incomplete outcome data, blinding, contamination, and selective outcome reporting. For each criterion, we judged studies at “low risk,” “high risk,” or “unclear risk” of bias. We considered studies at high risk of bias if they were judged at high or unclear risk of bias on any of these 3 criteria: randomization sequence generation, allocation concealment, or blinding of assessors to participants' group assignment, as these criteria are likely to significantly bias the results.27

Assessment of Selective Reporting of Outcomes

We compared the outcomes reported in the articles with the outcomes reported in the research protocol or, if no protocol was available, with the trial prospective registration form. If the trial was not prospectively registered, we compared the outcomes presented in the methods section with the ones reported in the results section.

Assessment of Reporting Biases

We constructed a funnel plot in RevMan 5.3 (The Cochrane Collaboration, Copenhagen, Denmark) and visually inspected it to assess reporting biases (eg, due to publication, language, or citation biases) at the body of literature level. We considered an asymmetrical funnel plot at visual inspection as an indicator of reporting biases.
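
To make the mechanics of this check concrete, the following is a minimal Python sketch of how a funnel plot of this kind can be constructed and read; the effect sizes and standard errors are hypothetical placeholders, not data from the included studies, and the plotting choices approximate rather than reproduce RevMan's output.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical study-level effect sizes (SMDs) and their standard errors
smd = np.array([0.10, 0.35, -0.20, 0.48, 0.05, 0.62, -0.05])
se = np.array([0.10, 0.18, 0.15, 0.30, 0.08, 0.35, 0.12])

# Fixed-effect pooled estimate serves as the center line of the funnel
pooled = np.average(smd, weights=1 / se**2)

fig, ax = plt.subplots()
ax.scatter(smd, se)
ax.axvline(pooled, linestyle="--")
# Pseudo 95% confidence limits forming the funnel: pooled estimate +/- 1.96 * SE
se_grid = np.linspace(0.001, se.max(), 50)
ax.plot(pooled - 1.96 * se_grid, se_grid, linestyle=":")
ax.plot(pooled + 1.96 * se_grid, se_grid, linestyle=":")
ax.set_xlabel("Standardized mean difference")
ax.set_ylabel("Standard error")
ax.invert_yaxis()  # convention: most precise studies (smallest SE) at the top
plt.show()  # visual asymmetry around the center line suggests reporting biases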

Data Synthesis

Efficacy of SGs in Supporting Behavioral and Experiential Engagement

To evaluate the efficacy of SGs in supporting behavioral engagement, we used meta-analytical methods to compare the duration of SG use with the duration of comparator intervention use. All meta-analyses in this systematic review were performed in RevMan 5.3 (The Cochrane Collaboration, Copenhagen, Denmark) using an inverse variance approach with random-effects models to combine continuous data. At least 2 studies had to contribute to a single meta-analysis for it to be conducted.28 No minimal number of participants was required. All results are expressed with 95% confidence intervals (CIs), and statistical significance was defined at a 2-sided α level of 0.05.
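
As a concrete illustration of this pooling approach, here is a minimal Python sketch of an inverse-variance random-effects meta-analysis using the DerSimonian-Laird estimator of between-study variance; the mean differences and standard errors passed in are hypothetical, and the sketch illustrates the general method rather than RevMan's exact implementation.

import numpy as np

def random_effects(effects, ses):
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    w = 1 / ses**2                                 # inverse-variance (fixed-effect) weights
    theta_f = np.sum(w * effects) / np.sum(w)      # fixed-effect pooled estimate
    q = np.sum(w * (effects - theta_f) ** 2)       # Cochran's Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # DerSimonian-Laird between-study variance
    w_re = 1 / (ses**2 + tau2)                     # random-effects weights
    theta_re = np.sum(w_re * effects) / np.sum(w_re)
    se_re = np.sqrt(1 / np.sum(w_re))
    return theta_re, (theta_re - 1.96 * se_re, theta_re + 1.96 * se_re)  # estimate, 95% CI

# Hypothetical study-level mean differences (minutes) and standard errors
md, ci = random_effects([12.0, 35.5, -4.2], [6.0, 10.0, 8.5])
print(f"Pooled MD = {md:.2f} minutes, 95% CI = ({ci[0]:.2f} to {ci[1]:.2f})")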

Regarding the efficacy of SGs in supporting behavioral engagement, the result is expressed as a mean difference (in minutes). Moreover, we narratively compared the expected frequency and duration of SG use, as reported by study authors, with the observed ones.

Regarding experiential engagement, this concept encompasses many aspects of the learners' experience with SGs, and we expected that authors would measure a diverse set of these aspects. As such, we used an analysis approach in which we let these aspects emerge from the data (rather than from prespecified categories). This allowed us both to identify all aspects of experiential engagement evaluated in included studies and to compare between studies the results obtained for each identified aspect. First, we extracted the items composing each instrument used to assess and compare learners' experiential engagement between groups. Second, 2 authors independently analyzed and categorized all items into aspects that were refined iteratively through the data analysis process. Third, the propositions of the 2 authors were contrasted to reach a consensus on the aspects of experiential engagement that were measured. The efficacy of SGs on experiential engagement was then evaluated for each aspect.

Efficacy of SGs in Improving Learning Outcomes

We conducted meta-analyses to evaluate the efficacy of SGs compared with any other educational intervention in improving learning outcomes. Meta-analyses included all studies with enough data to compute a standardized mean difference (SMD) regarding at least 1 outcome (ie, posttest means, medians, or odds ratios; standard deviations, first and third quartiles, standard errors, P values, or t values; number of participants in each group).
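
As an illustration of how an SMD can be derived from such posttest summary statistics, the following minimal Python sketch computes Hedges' g with a small-sample correction; the group means, standard deviations, and sample sizes are hypothetical, not taken from any included study.

import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    # Pooled standard deviation across the two groups
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sd_pooled            # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)      # Hedges' small-sample correction factor
    g = j * d
    # Common approximation of the standard error of g
    se = math.sqrt((n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2)))
    return g, se

# Hypothetical posttest scores: SG group vs. comparator group
g, se = hedges_g(m1=78.0, sd1=10.0, n1=45, m2=74.0, sd2=12.0, n2=46)
print(f"SMD (Hedges' g) = {g:.2f}, SE = {se:.2f}")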

We also performed meta-analyses of studies evaluating the efficacy of SGs versus passive comparators. However, the emphasis was kept on studies with active education comparators, as educators usually seek the best educational intervention rather than evidence that an educational intervention is better than nothing. As such, the results of these analyses are reported online rather than here (see Figures, Supplementary Digital Content 2, https://links.lww.com/SIH/A573, where these meta-analyses are presented).

Subgroup and Sensitivity Analyses

Statistical heterogeneity was assessed using the I2 statistic; a value greater than 50% was considered a high level of heterogeneity for all meta-analyses. We explored statistical heterogeneity by performing subgroup and sensitivity analyses. Subgroup analyses were conducted regarding the study population (ie, healthcare professionals vs. healthcare students), the comparator intervention (ie, classroom learning, written material, e-learning, simulation or virtual simulation, or a nonactive comparator), and the publication year. For this last subgroup analysis, we prospectively retained 2014 as a cutoff year (ie, before or in 2014 vs. after 2014) because the New Media Consortium declared that year that SGs would be widely developed and evaluated by educational institutions within the next 2 to 3 years.29 Sensitivity analyses were performed to restrict meta-analyses to studies with larger sample sizes; smaller studies tend to be associated with larger standard errors and different intervention effects, which can introduce statistical heterogeneity.16,30,31 We considered a study “small” if its sample size fell under the first quartile of the distribution of all study sample sizes included in a single meta-analysis. Across meta-analyses, the median threshold below which a study sample size was considered “small” was 46 participants (range = 28–74).
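
For reference, the I2 statistic is derived from Cochran's Q as I2 = max[0, (Q - df)/Q] x 100%. The following minimal Python sketch computes it from hypothetical study-level effect estimates and standard errors; it illustrates the definition, not RevMan's exact computation.

import numpy as np

def i_squared(effects, ses):
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    w = 1 / ses**2                              # inverse-variance weights
    theta = np.sum(w * effects) / np.sum(w)     # fixed-effect pooled estimate
    q = np.sum(w * (effects - theta) ** 2)      # Cochran's Q
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Hypothetical SMDs and standard errors from 4 studies
print(f"I2 = {i_squared([0.1, 0.8, -0.3, 0.5], [0.12, 0.15, 0.10, 0.20]):.0f}%")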

Assessment of the Overall Quality of the Evidence

The overall quality of the evidence regarding the efficacy of SGs on each outcome was assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach.32 The GRADE approach formalizes the evaluation of the overall quality of evidence and the formulation of recommendations. Quality of evidence depends on risk of bias, inconsistency, imprecision, and indirectness in the results of the studies. For each outcome, there are 4 levels of quality of evidence (very low, low, moderate, high), which represent our confidence in the pooled SMDs (ie, the findings of this review). Two authors independently assessed the quality of the evidence, and all disagreements were resolved through consensus.

RESULTS

Descriptive Results of Included Studies

From a pool of 3173 unique references, 37 studies were included in the systematic review and 29 studies (78%; all percentages presented are out of the 37 studies included) provided enough data to be included in a meta-analysis (Fig. 1). Descriptive data regarding included studies are reported in Table 1 (see Text, Supplementary Digital Content 3, https://links.lww.com/SIH/A574, and 4, https://links.lww.com/SIH/A575, the lists of included studies and of excluded studies at the full-text assessment stage are reported).

FIGURE 1: The PRISMA flow diagram.
TABLE 1 - Key Information of Included Studies
First Author (Year), Country | Study Design | Study Participants | Outcomes*

Compared with classroom learning
Courtier et al33 (2016), United States | Two-group cluster RCT | 48 fourth-year medical students | Experiential engagement; Knowledge
Diehl et al34 (2017), Brazil | Two-group RCT | 170 primary care physicians | Attitudes†; Behaviors†; Cognitive skills†‡; Experiential engagement
Hannig et al35 (2013), Germany | Two-group RCT | 55 second-year dental students | Confidence
Knight et al36 (2010), United Kingdom | Two-group RCT | 91 various healthcare professionals (eg, medical doctors, nurses, paramedics) | Cognitive skills‡; Procedural skills

Compared with written material
Boeker et al37 (2013), Germany | Two-group RCT | 145 third-year medical students | Experiential engagement; Knowledge‡
Polivka et al38 (2019), United States | Two-group RCT | 74 various healthcare professionals and students | Cognitive skills
Rondon et al39 (2013), Brazil | Two-group RCT | 29 second-year speech-language and hearing science students | Knowledge†

Compared with e-learning
Adjedj et al40 (2017), France | Two-group randomized crossover trial | 68 medical students | Experiential engagement‡
Berger et al41 (2018), Switzerland | Two-group RCT | 117 second-year pharmacy students | Attitude; Confidence; Experiential engagement; Knowledge
Buijs-Spanjers et al42 (2018), the Netherlands | Three-group RCT | 176 third-year medical students | Attitude; Cognitive skills‡; Experiential engagement‡
Dankbaar et al43 (2017), the Netherlands | Two-group RCT | 90 fourth-year medical students | Behavioral engagement‡; Behaviors; Confidence; Experiential engagement; Knowledge
de Sena et al44 (2019), Brazil | Two-group RCT | 45 first-year medical students | Behavioral engagement‡; Knowledge; Procedural skills
Drummond et al45 (2017), France | Two-group RCT | 82 second-year medical students | Procedural skills†
Gauthier et al46 (2015), Canada | Two-group RCT | 46 first-year medical anatomy students | Behavioral engagement; Knowledge
Kerfoot et al47 (2014), United States | Two-group RCT | 111 physicians, nurse practitioners, and physician assistants | Behavioral engagement; Clinical outcome in patients‡; Knowledge‡
Mohan et al48 (2017), United States | Four-group RCT | 368 emergency medicine physicians | Behavioral engagement; Cognitive skills†‡; Experiential engagement
Scales et al49 (2016), United States | Two-group RCT | 422 resident physicians from various training specialties | Knowledge
Sward et al50 (2008), United States | Two-group RCT | 100 third-year medical students | Experiential engagement; Knowledge†

Compared with another SG
Buijs-Spanjers et al51 (2019), the Netherlands | Two-group RCT | 159 third-year medical students | Attitudes; Cognitive skills; Experiential engagement
Haubruck et al52 (2018), Germany | Two-group RCT | 95 third- to sixth-year medical students | Experiential engagement‡; Procedural skills‡
Kerfoot and Baker53 (2012), United States | Two-group RCT | 1470 urologists from various countries | Knowledge‡

Compared with simulation or virtual simulation
Chee et al54 (2019), Singapore | Two-group RCT | 46 registered nurses | Confidence‡; Procedural skills‡
Chien et al55 (2013), United States | Two-group RCT | 14 medical students | Procedural skills
Katz et al56 (2017), United States | Two-group RCT | 44 residents on liver transplant rotation | Procedural skills‡

Compared with multiple interventions
Brull et al57 (2017), United States | Three-group RCT | 115 newly graduated nurses at an urban community teaching hospital (compared with classroom learning and e-learning) | Knowledge‡
Dankbaar et al58 (2016), the Netherlands | Three-group RCT | 79 fourth-year medical students (compared with e-learning and no intervention) | Behavioral engagement; Cognitive skills; Experiential engagement
Mohan et al59 (2018), United States | Four-group RCT | 320 emergency medicine physicians (compared with e-learning, SG, and no intervention) | Behavioral engagement; Cognitive skills‡; Experiential engagement

Compared with no intervention
Boada et al60 (2015), Spain | Two-group RCT | 109 second-year nursing students | Procedural skills‡
Cook et al61 (2012), United Kingdom | Two-group RCT | 34 third-year nursing students | Procedural skills
Del Blanco et al62 (2017), Spain | Two-group RCT | 132 second- and third-year nursing and medicine students | Behaviors‡; Confidence
Foss et al63 (2014), Norway | Two-group RCT | 201 first- and second-year undergraduate nursing students | Cognitive skills
Graafland et al64 (2017), the Netherlands | Two-group RCT | 31 first- or second-year residents in general surgical training | Cognitive skills
Harrington et al65 (2018), Ireland | Two-group RCT | 20 first- to third-year medical students | Procedural skills‡
Lagro et al66 (2014), the Netherlands | Two-group RCT | 145 fifth-year medical students | Attitude; Knowledge
Li et al67 (2015), China | Two-group RCT | 97 freshman medical students | Procedural skills‡
Tan et al68 (2017), Singapore | Two-group cluster RCT | 111 second-year nursing students | Confidence‡; Knowledge‡; Procedural skills
Van Nuland et al69 (2014), Canada | Three-group crossover RCT | 67 kinesiology students | Knowledge‡

*Comparator refers to an intervention solely received by the control group (ie, an intervention that is not shared with the experimental group).
†An outcome that was also measured at a follow-up period.
‡An outcome for which there was a statistically significant difference favoring the experimental group.

The median publication year was 2017. Twenty-eight studies (76%) were conducted exclusively among healthcare students, 8 studies (22%) exclusively among healthcare professionals, and 1 study (3%) among both healthcare professionals and students. Regarding the professions, most studies were conducted in the medical profession (n = 24, 65%). The median sample size was 91 participants [interquartile range (IQR) = 99]. The median attrition rates were 6.61% (IQR = 20.89) at posttest assessment and 20% (IQR = 25.28) at a follow-up period (ie, between 6 weeks and 6 months postintervention). E-learning interventions (n = 14, 38%) were the most frequent type of comparator intervention.

Three studies (8%) compared SGs with one another. One study showed that deliberately poor decision making in an SG, compared with “normal” play, did not influence the improvement of cognitive skills [SMD = 0.00 (95% CI = −0.31 to 0.31)].51 Another study showed that more frequent but lighter sessions of SG usage led to higher knowledge acquisition than fewer but more intensive sessions [SMD = 0.43 (95% CI = 0.30 to 0.56)].53 The last study evaluated 2 similar SGs: the experimental group received an SG with educational content aligned with the learning objectives, whereas the control group received an SG with similar but irrelevant educational content, to avoid compensatory equalization in this group.52 The group that received the SG aligned with the learning objectives had significantly higher procedural skills compared with the control group [SMD = 1.30 (95% CI = 0.85 to 1.74)].

Otherwise, in 10 studies (27%), the control group received no intervention or no intervention other than one shared with the experimental group (eg, both experimental and control groups shared the same classroom-learning activity).

Risk of Bias, Selective Reporting of Outcomes in Included Studies, and Reporting Biases

Seven studies (19%) were judged at low risk of bias. The risk of bias graph is presented in Figure 2 (also see Figure, Supplementary Digital Content 5, https://links.lww.com/SIH/A576, the risk of bias summary for each study is presented). Other studies were judged at high risk of bias, mainly because of reporting or methodological issues at study level regarding the randomization sequence generation (n = 17, 46%) and the allocation concealment (n = 29, 78%). Only 6 studies (16%) were prospectively registered or had published a protocol before the publication of the results.

FIGURE 2: Risk of bias graph.

A funnel plot was constructed for the “Knowledge” outcome (see Figure, Supplementary Digital Content 6, https://links.lww.com/SIH/A577, the funnel plot is presented). Visual inspection of the funnel plot revealed no serious reporting biases at the body of literature level.

Description of the SGs

Most SGs were exclusively available on a computer (n = 24, 65%), 7 (19%) were offered exclusively on a portable or handheld device, 4 (11%) were available on more than 1 platform, and 2 (5%) were available on an unspecified platform. Clinical topics were diverse; cardiac resuscitation (n = 5, 14%), triaging (n = 3, 8%), and anatomy (n = 3, 8%) were the most frequent. Approximately half of the included studies reported the expected frequency of SG use (n = 19, 51%) or its duration (n = 23, 62%). In these studies, the median expected frequency of usage was 1 session (IQR = 3) and the median expected duration of usage was 60 minutes (IQR = 150). Data related to the cost and time of development of the SG were not reported in any study (also see Table, Supplemental Digital Content 7, https://links.lww.com/SIH/A578, where key information of the SGs assessed is presented).

Ten studies (27%) cited the theoretical framework that guided the design or the development of the SG; no theoretical framework was cited more than once across the 37 included studies. Two studies (5%) reported the use of a game-based learning theory to guide the design of the SG. The most frequent challenges in SGs included the assessment and/or management of a virtual patient presenting a health condition (n = 17, 46%) and answering questions on a clinical topic (n = 10, 27%).

Efficacy of SGs in Supporting Behavioral and Experiential Engagement

Our confidence in the results of all meta-analyses is presented in Table 2. Our confidence ranges from very low to low for almost all meta-analyses conducted, mostly because of serious risks of bias in included studies and inconsistencies and imprecisions in results.

TABLE 2: Summary of Our Certainty in the Quantitative Evidence Using the GRADE Approach

Five studies (14%) were included in a meta-analysis (Fig. 3) to evaluate the efficacy of SGs on behavioral engagement (ie, minutes spent with the interventions). A nonstatistically significant result favoring SGs was found [mean difference (in minutes) = 23.21 (95% CI = −1.25 to 47.66, I2 = 91%)]. Heterogeneity remained high (>50%) when conducting planned subgroup and sensitivity analyses (see Figures, Supplementary Digital Content 8, https://links.lww.com/SIH/A579, subgroup and sensitivity analysis graphs are presented).

FIGURE 3: Meta-analysis of the efficacy of SGs on supporting behavioral engagement. IV, inverse of the variance; SD, standard deviation.

Five studies (14%) reported enough data to allow a comparison between the expected behavioral engagement in the SG and the actual one in terms of time spent using the SG. In 3 studies, the actual time spent was shorter than expected33,58,65; in one, it was longer,48 and in one, it was as expected.59

Experiential engagement was contrasted between groups in 11 studies (30%). Aspects of experiential engagement that were assessed are the following: perceived learning efficacy (n = 9, 24%), enjoyment (n = 7, 19%), satisfaction (n = 6, 16%), usability (n = 5, 14%), appropriateness (n = 4, 11%), focus (n = 4, 11%), fidelity (n = 4, 11%), difficulty (n = 2, 5%), perceived learning efficiency (n = 2, 5%), and stress (n = 1, 3%). Results regarding the efficacy of SGs on each aspect of experiential engagement are reported in Table 3. Results were highly heterogeneous overall; SGs were rarely regarded as systematically superior to other educational interventions for any of the aspects identified.

TABLE 3 - Participants' Self-reported Assessment Regarding Aspects of Experiential Engagement
Comparators: classroom learning, written material, and e-learning. Contributing studies: Courtier et al33 (2016); Diehl et al34 (2017); Haubruck et al52 (2018); Boeker et al37 (2013); Adjedj et al40 (2017); Berger et al41 (2018); Dankbaar et al58 (2016); Dankbaar et al43 (2017); Mohan et al59 (2018); Mohan et al48 (2017); Sward et al50 (2008).

Appropriateness (the intervention format and its content are judged appropriate and credible for learning): +*, +*
Difficulty (the level of knowledge and skills required to progress in the intervention is adequate to learners' expertise): +, −*
Enjoyment (the intervention is enjoyable or pleasant for learners): −*, +*, +*, −*, +*
Fidelity (the intervention is representative of reality, as learners perceive it): +*, +*, −*
Focus (the intervention allows learners to concentrate on the content presented): +, +*, +*
Learning efficacy (the intervention allows learners to feel that their learning has progressed): −*, +, +*, +*, +*, +, +*, +, +
Learning efficiency (learners perceive positively the ratio of time and effort invested in the intervention versus their learning progression): −*, +
Satisfaction (the intervention fulfills learners' overall expectations and needs): +*, +*, +
Stress (the intervention is perceived as excessively demanding by learners)
Usability (the intervention is perceived as easy to use by learners): +, −*, +, −*, −*

*This difference reached statistical significance.
+, The SG was rated as superior to the comparator intervention; ○, no difference was reported; −, the SG was rated as inferior to the comparator intervention.

Efficacy of SGs in Improving Learning Outcomes

Knowledge

Fifteen studies (41%) assessed participants' acquisition of knowledge, and 11 studies (30%) were included in a meta-analysis (Fig. 4). We observed a negligible and nonstatistically significant SMD of 0.16 (95% CI = −0.20 to 0.52, I2 = 86%) in favor of SGs. The number of studies included in this meta-analysis allowed for subgroup and sensitivity analyses. Statistical heterogeneity remained high (I2 ≥ 50%) in all subgroup analyses. However, a statistically significant difference (P = 0.006) was found between the pooled SMD of studies conducted with healthcare students [SMD = −0.19 (95% CI = −0.66 to 0.28), I2 = 85%] and the pooled SMD of studies conducted with healthcare professionals [SMD = 0.80 (95% CI = 0.27 to 1.32), I2 = 84%]. When removing the studies falling under the first quartile in terms of sample size (fewer than 48 participants), statistical heterogeneity also remained high; however, the result became statistically significant [SMD = 0.48 (95% CI = 0.19 to 0.78), I2 = 77%]. Two studies (5%) assessed participants' retention of knowledge, one after a follow-up period of 6 weeks and the other after 6 months. At 6 weeks, a negligible and nonstatistically significant difference between an SG and written material was found [SMD = 0.05 (95% CI = −0.74 to 0.83)].39 At 6 months, a negligible and nonstatistically significant difference between an SG and an e-learning intervention was found [SMD = −0.14 (95% CI = −0.60 to 0.32)].50

FIGURE 4: Meta-analysis of the efficacy of SGs on knowledge acquisition. IV, inverse of the variance; SD, standard deviation.

Skills

Ten studies (27%) assessed participants' cognitive skills, and 5 (11%) of them were included in a meta-analysis (Fig. 5). We observed a negligible and nonstatistically significant SMD of 0.08 (95% CI = −0.73 to 0.89, I2 = 95%) in favor of SGs. Heterogeneity remained high (>50%) when exploring the potential effect of study population and publication year. Two studies included a follow-up period. At 3 months, a small and nonstatistically significant difference [SMD = 0.23 (95% CI = −0.11 to 0.57)] favoring the SG compared with a classroom learning intervention was found, and at 6 months, a small and statistically significant difference [SMD = 0.46 (95% CI = 0.32 to 0.68)] favoring the SG compared with an e-learning intervention was reported.34,48

FIGURE 5: Meta-analysis of the efficacy of SGs on improving cognitive skills. IV, inverse of the variance; SD, standard deviation.

Twelve studies (32%) assessed participants' procedural skills, and 4 studies (11%) were included in this meta-analysis (Fig. 6). We observed a negligible and nonstatistically significant SMD of 0.05 (95% CI = −0.78 to 0.87, I2 = 88%). Heterogeneity remained high (>50%) when exploring the potential effect of the comparator intervention, study population, and publication year. One study (3%) included a 4-month follow-up period; a negligible and nonstatistically significant difference [SMD = −0.16 (95% CI = −0.62 to 0.29)] favoring the e-learning intervention was found.45

FIGURE 6: Meta-analysis of the efficacy of SGs on improving procedural skills. IV, inverse of the variance; SD, standard deviation.

Six studies (16%) assessed participants' confidence in their skills, and 4 (11%) of them were included in a meta-analysis (Fig. 7). We observed a small and statistically significant SMD of 0.27 (95% CI = 0.01 to 0.53, I2 = 0%) in favor of SGs. Nonsignificant differences between groups were found while exploring the potential effect of study population, publication year, and the comparator intervention.

FIGURE 7: Meta-analysis of the efficacy of SGs on improving confidence in skills. IV, inverse of the variance; SD, standard deviation.

Attitude

Five studies (14%) assessed participants' attitude. Three studies (8%) were included in this meta-analysis (Fig. 8). We observed a negligible and nonstatistically significant SMD of −0.09 (95% CI = −0.38 to 0.20, I2 = 47%) in favor of comparator educational interventions. The low number of included studies in the main meta-analysis precluded us from conducting other subgroup or sensitivity analyses.

FIGURE 8: Meta-analysis of the efficacy of SGs on attitude change. IV, inverse of the variance; SD, standard deviation.

Behavior

Three studies (8%) assessed the perception of behavior change in practice. Two studies (5%) were included in this meta-analysis (Fig. 9). We observed a negligible and nonstatistically significant SMD of 0.20 (95% CI = −0.11 to 0.51, I2 = 0%) in favor of SGs.

FIGURE 9: Meta-analysis of the efficacy of SGs on behavior change. IV, inverse of the variance; SD, standard deviation.

Clinical Outcomes

Only 1 study (3%) included the assessment of a clinical outcome in healthcare system users (ie, the number of days to blood pressure target in patients cared for by the healthcare providers participating in the study); the authors reported a statistically significant difference (P = 0.018) favoring the SG (median = 142 days) compared with another e-learning intervention (median = 148 days).47

DISCUSSION

This systematic review examined the efficacy of SGs in healthcare professions education. Most studies were published in the last 3 years, were conducted among students in the medical profession, and compared an SG with another e-learning intervention. We found negligible and nonstatistically significant differences between SGs and other educational interventions regarding their effects on knowledge acquisition, cognitive and procedural skill development in a test setting, behavior change in clinical practice, and support of engagement during the learning activities. In addition, heterogeneous results were found regarding the efficacy of SGs in supporting any of the identified aspects of experiential engagement. This systematic review adds to previous reviews on SGs in healthcare professions education by synthesizing the latest evidence of their efficacy, by evaluating the assumption that SGs are more engaging than other educational interventions, by quantifying their efficacy, and by exploring various sources of heterogeneity through meta-analytic methods.

Educators should be aware of the limited evidence supporting the engaging nature of SGs with healthcare professionals and students. Mixed findings regarding engagement are surprising considering that the decision to use an SG is often motivated by its potential to improve learners' engagement.34,58,64 The concept of engagement has a large scope that makes it almost an umbrella term for multiple emotional or cognitive states and numerous behaviors.9 As such, we remained inclusive of all measures used by authors that were linked to either the behavioral or the experiential dimension of engagement while using the intervention. However, these findings are limited by the small number of studies reporting engagement outcomes and by the lack of information regarding the validity or reliability of the assessment tools used in half of the studies. Authors should consider assessing learners' engagement across educational interventions using validated and reliable assessment tools, such as the evaluation questionnaire developed by Dankbaar et al43 or the usability questionnaire developed by Zaharias and Poylymenakou.70

Nonsignificant differences between SGs and other educational interventions were found for most learning outcomes. Our findings are in line with those reported in previous reviews regarding the efficacy of SGs in improving learning outcomes.12–14 Authors of these previous reviews underlined the mixed efficacy of SGs in improving learning outcomes compared with other educational interventions. Our meta-analyses showed that, for most learning outcomes and regardless of the comparator educational intervention, the overall body of evidence did not support the claim that SGs are significantly more effective. The lack of a theoretical framework to support SG design could explain these results, as most authors did not explicate a theoretical framework for the design of their SGs and only 2 explicitly referred to a game-based learning theory.6,20,47,68 Designing an SG through a theoretical lens holds the potential to greatly improve learners' engagement and learning outcomes.71 Theoretical works should be undertaken and synthesized to explain the mechanisms through which SGs are expected to lead to learning outcomes.

Furthermore, authors of recent RCTs underlined the ongoing difficulty in identifying empirical data to support their design choices.43,57 We had initially planned in the published protocol to evaluate the individual impact of SG design elements on engagement and learning outcomes.18 Unfortunately, scarce data prevented us from doing so. Few included studies compared different versions of an SG between one another, which is essential to isolate the impact of individual design choices.47,51,52,72 Future studies should focus on evaluating the efficacy of different versions of an SG on engagement and learning outcomes.

Regarding the long-term retention of learning outcomes, only 5 studies included a follow-up period, and 3 of them reported nonsignificant differences between groups. It should be noted that the median expected frequency and duration of usage correspond to a single 60-minute session and that learners in some studies used the SGs less than expected.33,58 It could be hypothesized that this duration of SG use is not sufficient to bring greater long-term changes in learning outcomes compared with other educational interventions. Future studies should consider assessing participants' long-term retention of learning outcomes, as there is insufficient evidence to support SG efficacy in the long term compared with other educational interventions.

As the development of an SG can be a resource-intensive endeavor and as some SGs are commercialized after their evaluation, researchers should consider prospectively registering their trial or publishing their research protocol to improve transparency in the reporting of their results and to avoid any suspicion of potential conflicts of interest.73 This would facilitate the evaluation of selective reporting of outcomes in result articles. Moreover, most studies were judged at high risk of bias because the reporting of the randomization sequence generation and of the allocation concealment was unclear. Future studies should report all elements necessary for their assessment. The adoption of reporting guidelines, such as the Consolidated Standards of Reporting Trials statement, by journals and their use by researchers could greatly improve the reporting of these trials.74

Strengths of this systematic review include the prospective publication of the protocol18 and the reporting of the results according to the PRISMA guidelines, enhancing the transparency of the research process.75 Furthermore, the data extraction process was piloted, and all data extraction forms were validated by a second review author. Limitations of this review include the selection of RCTs only and their relatively low number. Following the Cochrane guidance, we restricted this review to RCTs to minimize threats to internal validity, as we were aware that the efficacy of SGs had already been evaluated in multiple RCTs.28 Another limitation is a potential language bias, as only studies published in English and in French were considered. However, visual inspection of the funnel plot did not reveal a significant language bias or other types of reporting biases. Furthermore, as the nature of what constitutes an SG and how it differs from interventions such as virtual simulations is still a matter of debate,76 we remained inclusive in our definition of SGs. To address potential ambiguities regarding the nature of study interventions, we screened all references independently and in pairs, and all disagreements were resolved through discussion with a third author. Still, we recognize our inclusive definition of SGs as a potential limitation of our work.

Compared with other educational interventions, SGs led to neither statistically better behavioral engagement, knowledge acquisition, cognitive and procedural skills development, attitude change, nor behavior change. Only a small but statistically significant SMD was found in favor of SGs for improving confidence in skills. In addition, heterogeneous results were found regarding the efficacy of SGs in supporting any of the identified aspects of experiential engagement. Our findings are limited by high or unclear risk of bias across studies, inconsistencies in the directions of effect, and imprecision of study results. As such, our confidence ranges from very low to low regarding the results of almost all meta-analyses conducted. We recommend that authors base their SG design choices on a theoretical framework and that they report their results according to the Consolidated Standards of Reporting Trials statement. Moreover, future research should focus on assessing whether changes in healthcare professionals' clinical practice occur after SG training, clinical outcomes in patients under the care of healthcare professionals who used SGs, and long-term retention of learning outcomes.

REFERENCES

1. Akl EA, Pretorius RW, Sackett K, et al. The effect of educational games on medical students' learning outcomes: a systematic review: BEME Guide No 14. Med Teach 2010;32(1):16–27.
2. Baranowski T, Buday R, Thompson DI, Baranowski J. Playing for real: video games and stories for health-related behavior change. Am J Prev Med 2008;34(1):74–82.
3. Knowles MS. L'apprenant adulte: vers un nouvel art de la formation. Paris, France: Editions d'Organisation; 1995.
4. Yoder SL, Terhorst R 2nd. “Beam me up, scotty”: designing the future of nursing professional development. J Contin Educ Nurs 2012;43(10):456–462.
5. Wynter L, Burgess A, Kalman E, Heron JE, Bleasel J. Medical students: what educational resources are they using? BMC Med Educ 2019;19(1):36.
6. Kiili K. Digital game-based learning: towards an experiential gaming model. Internet High Educ 2005;8(1):13–24.
7. Tettegah S, McCreery M, Blumberg F. Toward a framework for learning and digital games research. Educ Psychol 2016;50(4):253–257.
8. De Freitas S. Learning in Immersive Worlds: A Review of Game-Based learning. Bristol, United Kingdom: Joint Information Systems Committee; 2006.
9. Perski O, Blandford A, West R, Michie S. Conceptualising engagement with digital behaviour change interventions: a systematic review using principles from critical interpretive synthesis. Transl Behav Med 2017;7(2):254–267.
10. Nevin CR, Westfall AO, Rodriguez JM, et al. Gamification as a tool for enhancing graduate medical education. Postgrad Med J 2014;90(1070):685–693.
11. Gorbanev I, Agudelo-Londono S, Gonzalez RA, et al. A systematic review of serious games in medical education: quality of evidence and pedagogical strategy. Med Educ Online 2018;23(1):1438718.
12. Ricciardi F, De Paolis LT. A comprehensive review of serious games in health professions. IJCGT 2014;2014:1–11.
13. Wang R, DeMaria S Jr., Goldberg A, Katz D. A systematic review of serious games in training health care professionals. Simul Healthc 2016;11(1):41–51.
14. Boyle EA, Hainey T, Connolly TM, et al. An update to the systematic literature review of empirical evidence of the impacts and outcomes of computer games and serious games. Comput Educ 2016;94:178–192.
15. Gentry S, Gauthier A, L'Estrade Ehrstrom B, et al. Serious gaming and gamification education in health professions: a systematic review by the Digital Health Education Collaboration. J Med Internet Res 2019;21(3):e12994.
16. Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0. West Sussex, United Kingdom: The Cochrane Collaboration; 2011.
17. Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 2009;6(7):e1000097.
18. Maheu-Cadotte MA, Cossette S, Dube V, et al. Effectiveness of serious games and impact of design elements on engagement and educational outcomes in healthcare professionals and students: a systematic review and meta-analysis protocol. BMJ Open 2018;8(3):e019871.
19. Hamari J, Shernoff DJ, Rowe E, Coller B, Asbell-Clarke J, Edwards T. Challenging games help students learn: an empirical study on engagement, flow and immersion in game-based learning. Comput Hum Behav 2016;54:170–179.
20. Salen K, Zimmerman E. Rules of Play: Game Design Fundamentals. Cambridge, MA: MIT Press; 2004.
21. Bergeron B. Developing Serious Games. 1st ed. Newton Centre, MA: Charles River Media; 2006.
22. Stokes BG. Videogames have changed: time to consider serious games? Dev Educ J 2005;11(3):12.
23. Kirkpatrick DL. Luminary perspective: evaluating training programs. In: Biech E, ed. ASTD Handbook for Workplace Learning Professionals. Alexandria, VA: ASTD Press; 2008:485–491.
24. Vlachopoulos D, Makri A. The effect of games and simulations on higher education: a systematic literature review. Int J Educ Technol High Educ 2017;14(1):22.
25. Cochrane Effective Practice and Organisation of Care Review Group (EPOC). Data Collection Checklist. Ottawa, Canada: Institute of Population Health, University of Ottawa; 2002.
26. Cochrane Effective Practice and Organisation of Care (EPOC). Suggested risk of bias criteria for EPOC reviews. EPOC Resources for review authors Web site. Available at: http://epoc.cochrane.org/resources/epoc-resources-review-authors. Published 2017. Accessed May 23, 2019.
27. Savovic J, Turner RM, Mawdsley D, et al. Association between risk-of-bias assessments and results of randomized trials in Cochrane reviews: the ROBES Meta-Epidemiologic Study. Am J Epidemiol 2018;187(5):1113–1122.
28. Higgins JP, Green S. Cochrane Handbook for Systematic Reviews of Interventions. Vol 4. West Sussex, United Kingdom: John Wiley & Sons; 2011.
29. Johnson L, Adams Becker S, Estrada V, Freeman A. NMC Horizon Report: 2014 Higher Education Edition. Austin, TX: The New Media Consortium; 2014.
30. Turner RM, Bird SM, Higgins JPT. The impact of study size on meta-analyses: examination of underpowered studies in Cochrane reviews. PloS One 2013;8(3):e59202–e59202.
31. Schwarzer G, Carpenter JR, Rücker G. Small-study effects in meta-analysis. In: Meta-Analysis With R. Cham: Springer International Publishing; 2015:107–141.
32. Dijkers M. Introducing GRADE: a systematic approach to rating evidence in systematic reviews and to guideline development. KT Update 2013;1(5):1–9.
33. Courtier J, Webb EM, Phelps AS, Naeger DM. Assessing the learning potential of an interactive digital game versus an interactive-style didactic lecture: the continued importance of didactic teaching in medical student education. Pediatr Radiol 2016;46(13):1787–1796.
34. Diehl LA, Souza RM, Gordan PA, Esteves RZ, Coelho IC. InsuOnline, an electronic game for medical education on insulin therapy: a randomized controlled trial with primary care physicians. J Med Internet Res 2017;19(3):e72.
35. Hannig A, Lemos M, Spreckelsen C, Ohnesorge-Radtke U, Rafai N. Skills-O-Mat: computer supported interactive motion- and game-based training in mixing alginate in dental education. J Educ Comput Res 2013;48(3):315–343.
36. Knight JF, Carley S, Tregunna B, et al. Serious gaming technology in major incident triage training: a pragmatic controlled trial. Resuscitation 2010;81(9):1175–1179.
37. Boeker M, Andel P, Vach W, Frankenschmidt A. Game-based e-learning is more effective than a conventional instructional method: a randomized controlled trial with third-year medical students. PLoS One 2013;8(12):e82328.
38. Polivka BJ, Anderson S, Lavender SA, et al. Efficacy and usability of a virtual simulation training system for health and safety hazards encountered by healthcare workers. Games Health J 2019;8(2):121–128.
39. Rondon S, Sassi FC, Furquim de Andrade CR. Computer game-based and traditional learning method: a comparison regarding students' knowledge retention. BMC Med Educ 2013;13:30.
40. Adjedj J, Ducrocq G, Bouleti C, et al. Medical student evaluation with a serious game compared to multiple choice questions assessment. JMIR Serious Games 2017;5(2):e11.
41. Berger J, Bawab N, De Mooij J, et al. An open randomized controlled study comparing an online text-based scenario and a serious game by Belgian and Swiss pharmacy students. Curr Pharm Teach Learn 2018;10(3):267–276.
42. Buijs-Spanjers KR, Hegge HHM, Jansen CJ, Hoogendoorn E, de Rooij SE. A web-based serious game on delirium as an educational intervention for medical students: randomized controlled trial. JMIR Serious Games 2018;6(4):e17.
43. Dankbaar ME, Richters O, Kalkman CJ, et al. Comparative effectiveness of a serious game and an e-module to support patient safety knowledge and awareness. BMC Med Educ 2017;17(1):30.
44. de Sena DP, Fabrício DD, da Silva VD, Bodanese LC, Franco AR. Comparative evaluation of video-based on-line course versus serious game for training medical students in cardiopulmonary resuscitation: a randomised trial. PLoS One 2019;14(4):e0214722.
45. Drummond D, Delval P, Abdenouri S, et al. Serious game versus online course for pretraining medical students before a simulation-based mastery learning course on cardiopulmonary resuscitation: a randomised controlled study. Eur J Anaesthesiol 2017;34(12):836–844.
46. Gauthier A, Corrin M, Jenkinson J. Exploring the influence of game design on learning and voluntary use in an online vascular anatomy study aid. Comput Educ 2015;87:24–34.
47. Kerfoot BP, Turchin A, Breydo E, Gagnon D, Conlin PR. An online spaced-education game among clinicians improves their patients' time to blood pressure control: a randomized controlled trial. Circ Cardiovasc Qual Outcomes 2014;7(3):468–474.
48. Mohan D, Farris C, Fischhoff B, et al. Efficacy of educational video game versus traditional educational apps at improving physician decision making in trauma triage: randomized controlled trial. BMJ 2017;359:j5416.
49. Scales CD Jr, Moin T, Fink A, et al. A randomized, controlled trial of team-based competition to increase learner participation in quality-improvement education. Int J Qual Health Care 2016;28(2):227–232.
50. Sward KA, Richardson S, Kendrick J, Maloney C. Use of a Web-based game to teach pediatric content to medical students. Ambul Pediatr 2008;8(6):354–359.
51. Buijs-Spanjers KR, Hegge HHM, Cnossen F, Hoogendoorn E, Jaarsma DADC, de Rooij SE. Dark play of serious games: effectiveness and features (G4HE2018). Games Health J 2019;8(4):301–306.
52. Haubruck P, Nickel F, Ober J, et al. Evaluation of app-based serious gaming as a training method in teaching chest tube insertion to medical students: randomized controlled trial. J Med Internet Res 2018;20(5):e195.
53. Kerfoot BP, Baker H. An online spaced-education game for global continuing medical education: a randomized trial. Ann Surg 2012;256(1):33–38.
54. Chee EJM, Prabhakaran L, Neo LP, et al. Play and learn with patients-designing and evaluating a serious game to enhance nurses' inhaler teaching techniques: a randomized controlled trial. Games Health J 2019;8(3):187–194.
55. Chien JH, Suh IH, Park S-H, Mukherjee M, Oleynikov D, Siu K-C. Enhancing fundamental robot-assisted surgical proficiency by using a portable virtual simulator. Surg Innov 2013;20(2):198–203.
56. Katz D, Zerillo J, Kim S, et al. Serious gaming for orthotopic liver transplant anesthesiology: a randomized control trial. Liver Transpl 2017;23(4):430–439.
57. Brull S, Finlayson S, Kostelec T, MacDonald R, Krenzischeck D. Using gamification to improve productivity and increase knowledge retention during orientation. J Nurs Adm 2017;47(9):448–453.
58. Dankbaar ME, Alsma J, Jansen EE, van Merrienboer JJ, van Saase JL, Schuit SC. An experimental study on the effects of a simulation game on students' clinical cognitive skills and motivation. Adv Health Sci Educ Theory Pract 2016;21(3):505–521.
59. Mohan D, Fischhoff B, Angus DC, et al. Serious games may improve physician heuristics in trauma triage. Proc Natl Acad Sci U S A 2018;115(37):9204–9209.
60. Boada I, Rodriguez-Benitez A, Garcia-Gonzalez JM, Olivet J, Carreras V, Sbert M. Using a serious game to complement CPR instruction in a nurse faculty. Comput Methods Programs Biomed 2015;122(2):282–291.
61. Cook NF, McAloon T, O'Neill P, Beggs R. Impact of a web based interactive simulation game (PULSE) on nursing students' experience and performance in life support training–a pilot study. Nurse Educ Today 2012;32(6):714–720.
62. Del Blanco Á, Torrente J, Fernández-Manjón B, Ruiz P, Giner M. Using a videogame to facilitate nursing and medical students' first visit to the operating theatre. A randomized controlled trial. Nurse Educ Today 2017;55:45–53.
63. Foss B, Løkken A, Leland A, Stordalen J, Mordt P, Oftedal BF. Digital game-based learning: a supplement for medication calculation drills in nurse education. E-Learning and Digital Media 2014;11(4):342–349.
64. Graafland M, Bemelman WA, Schijven MP. Game-based training improves the surgeon's situational awareness in the operation room: a randomized controlled trial. Surg Endosc 2017;31:4093–4101.
65. Harrington CM, Chaitanya V, Dicker P, Traynor O, Kavanagh DO. Playing to your skills: a randomised controlled trial evaluating a dedicated video game for minimally invasive surgery. Surg Endosc 2018;32(9):3813–3821.
66. Lagro J, van de Pol MHJ, Laan A, Huijbregts-Verheyden FJ, Fluit LCR, Olde Rikkert MGM. A randomized controlled trial on teaching geriatric medical decision making and cost consciousness with the serious game GeriatriX. J Am Med Dir Assoc 2014;15(12):957.e1–e6.
67. Li J, Xu Y, Xu Y, et al. 3D CPR game can improve CPR skill retention. Stud Health Technol Inform 2015;216:974.
68. Tan AJQ, Lee CCS, Lin PY, et al. Designing and evaluating the effectiveness of a serious game for safe administration of blood transfusion: a randomized controlled trial. Nurse Educ Today 2017;55:38–44.
69. Van Nuland SE, Roach VA, Wilson TD, Belliveau DJ. Head to head: the role of academic competition in undergraduate anatomical education. Anat Sci Educ 2015;8(5):404–412.
70. Zaharias P, Poylymenakou A. Developing a usability evaluation method for e-learning applications: beyond functional usability. Int J Human Comput Interact 2009;25(1):75–98.
71. Wu WH, Hsiao HC, Wu PL, Lin CH, Huang SH. Investigating the learning-theory foundations of game-based learning: a meta-analysis. J Comput Assist Learn 2012;28(3):265–279.
72. Cook DA. The research we still are not doing: an agenda for the study of computer-based learning. Acad Med 2005;80(6):541–548.
73. Sim I, Chan A-W, Gülmezoglu AM, Evans T, Pang T. Clinical trial registration: transparency is the watchword. Lancet 2006;367(9523):1631–1633.
74. Schulz KF, Altman DG, Moher D; CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMC Med 2010;8:18.
75. Moher D, Shamseer L, Clarke M, et al.; PRISMA-P Group. Preferred Reporting Items for Systematic review and Meta-Analysis Protocols (PRISMA-P) 2015 statement. Syst Rev 2015;4(1):1.
76. Panzoli D, Lelardeux CP, Galaup M, Lagarrigue P, Minville V, Lubrano V. Interaction and communication in an immersive learning game: the challenges of modelling real-time collaboration in a virtual operating room. In: Ma M, Oikonomou A, Jain LC, eds. Serious Games and Edutainment Applications. Gewerbestrasse, Switzerland: Springer; 2017:147–186.
Keywords:

Game-based learning; educational gaming; medical education; systematic review; meta-analysis


Copyright © 2020 Society for Simulation in Healthcare