Cook, David A. MD, MHPE; Levinson, Anthony J. MD, MSc; Garside, Sarah MD, PhD; Dupras, Denise M. MD, PhD; Erwin, Patricia J. MLS; Montori, Victor M. MD, MSc
Internet-based medical education has proliferated rapidly since the advent of the World Wide Web in 1991. Potential advantages of Internet-based learning (IBL) over other instructional methods include flexibility in time and location of learning, economies of scale, facilitation of novel instructional methods, and the potential to personalize instruction to individual needs.1–3 Hundreds of published articles have described and evaluated the use of IBL in health professions education.4
Educators need evidence-based guidance on how to develop effective IBL.3 Over 200 studies have reported the results of comparing IBL either with no intervention or with traditional (noncomputer) instructional methods in health professions education.4 In a previous report (2008),4 we sought to identify salient principles regarding when and how to use IBL. While that meta-analysis supported the effectiveness of IBL, the evidence did not clearly identify principles to guide future implementations. Previous reviews have encountered similar limitations.5–8
An alternative approach to identifying evidence-based principles involves direct comparisons of one Internet-based intervention against another.9–13 Evidence from such studies, if properly reviewed and synthesized, could inform decisions on when and how to effectively use IBL. We are not aware of previous systematic reviews addressing these questions.
In the present study, we sought to identify and quantitatively summarize all studies involving health professions learners that compared IBL with another computer-based instructional format. We focused our review on health professions learners because—even if fundamental principles of learning apply broadly—the topics, learning objectives, and learners in health professions education vary from other fields of study.
We planned, conducted, and reported this review in adherence to standards of quality for reporting meta-analyses (QUOROM and MOOSE).14,15
We sought to answer the following question: “What characteristics of IBL interventions, as compared with other computer-based interventions, are associated with improved outcomes in health professions learners?”
We included studies published in any language that investigated use of the Internet, in comparison with another computer-based intervention, to teach health professions learners at any stage in training or practice, using the Kirkpatrick outcomes16 of (1) satisfaction, (2) knowledge or attitudes, (3) skills (in a test setting), and (4) behaviors (in practice) or effects on patients.
Definitions of key variables (e.g., cognitive interactivity) have been detailed previously.4 We defined health professions learners as students, postgraduate trainees (i.e., residents or fellows), or practitioners in a profession directly related to human or animal health, including physicians, nurses, pharmacists, dentists, veterinarians, and physical therapists. We defined IBL as computer-assisted instruction3 using the Internet or a local intranet as the means of delivery. This included Internet-based tutorials, virtual patients, discussion boards, e-mail, and Internet-mediated videoconferencing.
We excluded studies if all of the computer interventions investigated resided only on the client computer or CD-ROM or if the use of the Internet was limited to administrative or secretarial purposes. We also excluded studies that did not report outcomes of interest or were published only in abstract form.
We described our search strategy (Supplemental Digital Content, Box 1, http://links.lww.com/ACADMED/A14) previously.4 Briefly, one of us (P.J.E.), a senior reference librarian with expertise in systematic reviews, designed a strategy to search MEDLINE, CINAHL, EMBASE, Web of Science, Scopus, ERIC, TimeLit, and the University of Toronto Research and Development Resource Base. Search terms included words defining delivery concepts (such as Internet, Web, e-learning, and computer-assisted instruction) and participant characteristics (such as “education, professional,” “students, health occupations,” and “internship and residency”). Because the World Wide Web was first described in 1991, our search included all articles published after January 1, 1990. The last search date was November 17, 2008. We identified additional studies both by scanning authors' files and previous reviews and by hand searching the reference lists of all included articles.
Four of us (D.A.C., A.J.L., D.M.D., and S.G.), working independently and in duplicate, screened all titles and abstracts to determine whether we should include an article. In the event of disagreement or insufficient information in the abstract, we reviewed the full text, again independently and in duplicate. We resolved conflicts by consensus. Chance-adjusted interrater agreement for study inclusion, determined using the intraclass correlation coefficient17 (ICC), was 0.73.
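To illustrate what an ICC measures, the sketch below computes a one-way random-effects ICC from a small table of duplicate ratings. The text does not state which ICC form was used, so the one-way form, the function name `icc_oneway`, and the toy ratings are assumptions made for illustration only.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a table of ratings.

    ratings: list of rows, one row per rated item (e.g., an abstract),
    each row holding the scores given by k reviewers.
    Assumption: the one-way ICC(1,1) form; the review may have used another.
    """
    n = len(ratings)          # number of rated items
    k = len(ratings[0])       # number of reviewers per item
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-item and within-item mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical include (1) / exclude (0) decisions by two reviewers.
decisions = [[1, 1], [0, 0], [1, 0], [0, 0], [1, 1], [0, 0]]
print(round(icc_oneway(decisions), 2))
```

Values near 1 indicate that most of the variation in ratings lies between items rather than between reviewers.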
We developed a data abstraction form through iterative testing and revision. We abstracted data independently and in duplicate for all variables requiring reviewers' judgment. We determined interrater agreement using ICC, and we resolved conflicts by consensus.
We abstracted the following information:
* number and training level of learners,
* study design: presence of pretest (ICC: 0.84), number of groups (ICC: 0.97), and method of group assignment (ICC: 1.0),
* course length (ICC: 0.70),
* level of cognitive interactivity (low, moderate, high; ICC: 0.71),
* quantity of practice exercises (absent, few, many; ICC: 0.84),
* outcome assessment method (subjective/objective; ICC: 0.79), and
* quantitative outcome results (mean and standard deviation [SD], or other information for calculating effect size [ES]).
When articles contained insufficient outcomes data, we requested this information from authors.
Desiring to use a common quality metric for both randomized and observational studies, we abstracted information on methodological quality using an adaptation of the Newcastle–Ottawa scale for grading cohort studies.4,18 We rated the following:
* representativeness of the intervention group (ICC: 0.47),
* selection of the control group (ICC: 0.93),
* comparability of cohorts: either statistical adjustment for baseline characteristics in nonrandomized studies (ICC: 0.34) or randomization (ICC: 1.0) and allocation concealment for randomized studies (ICC: 0.38),
* blinding of outcome assessment (ICC: 0.54), and
* completeness of follow-up (ICC: 0.71).
Three of us (D.A.C., A.J.L., and S.G.) iteratively reviewed all articles to inductively identify themes among the research questions or research hypotheses and to achieve consensus on definitions (Table 1). Then, working first independently and then by consensus, we (again D.A.C., A.J.L., and S.G.) grouped each study by research theme.
We abstracted information separately for outcomes of satisfaction, knowledge, skills, and behaviors/patient effects. We converted means and SDs or odds ratios to standardized mean differences (Hedges' g effect sizes).19–22 When sufficient data were unavailable, we used tests of significance (e.g., P values) to back-calculate ESs using standard formulae.23 For crossover studies, we used (1) means or exact statistical test results adjusted for repeated measures or, if these were not available, (2) means pooled across each intervention.22,24 For two-group pretest–posttest studies, we used (1) posttest means or exact statistical test results adjusted for pretest or, if these were not available, (2) differences in change scores standardized using pretest variance. If neither P values nor any measures of variance were reported, we used the average SD from all other included studies.
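The conversions described above can be sketched in Python as follows. These are common textbook formulas (pooled-SD standardized mean difference with Hedges' small-sample correction, the logit conversion for odds ratios, and a normal approximation for back-calculating from a two-sided P value). They are assumed here for illustration; the exact formulae applied in the review are those in the cited references and may differ. All function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Return (g, variance of g) for two independent groups."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (mean1 - mean2) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # approximate small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d


def odds_ratio_to_d(odds_ratio):
    """Logit method: d = ln(OR) * sqrt(3) / pi."""
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


def d_from_p(p_two_sided, n1, n2):
    """Back-calculate |d| from a two-sided P value.

    Assumption: a normal approximation to the t distribution, which is
    reasonable only for moderately large groups.
    """
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return z * math.sqrt(1 / n1 + 1 / n2)


# Hypothetical posttest scores: 75 (SD 10) vs 70 (SD 10), 30 learners per arm.
g, var_g = hedges_g(75, 10, 30, 70, 10, 30)
print(round(g, 3), round(var_g, 4))
print(round(d_from_p(0.05, 50, 50), 3))
```

The direction (sign) of a back-calculated effect has to be taken from the article's narrative, because a P value alone carries only the magnitude.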
Because we anticipated large inconsistency (heterogeneity), we used random-effects models to pool weighted ESs across studies within research themes. We used the I2 statistic25 to quantify inconsistency across studies. I2 estimates the percentage of variability across studies that is not due to chance; values greater than 50% indicate large inconsistency. We conducted meta-analyses to pool study results for all themes addressed by two or more studies, except where reviewers agreed that the implementations of that theme were too dissimilar. In addition to the inductively identified themes, we coded the level of cognitive interactivity and quantity of practice exercises, and we performed meta-analyses pooling the results of studies in which these codes varied between study arms. Some studies appeared in more than one meta-analysis. For each meta-analysis, we evaluated separately outcomes of satisfaction and learning (i.e., knowledge or—for the two studies not reporting knowledge—skills or behaviors). Too few studies reported skills and behaviors to permit meta-analysis using these outcomes alone. To explore the robustness of findings to synthesis assumptions, we conducted subgroup analyses based on method of group assignment and study quality.
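A minimal sketch of random-effects pooling with an I2 calculation follows. It assumes the DerSimonian-Laird estimator of between-study variance, which is a common choice but is not named in the text; the function name and the example effect sizes are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    effects:   per-study effect sizes (e.g., Hedges' g)
    variances: per-study sampling variances
    """
    k = len(effects)
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I2: percentage of variability across studies not due to chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical effect sizes and variances for three studies.
result = pool_random_effects([0.10, 0.45, 0.90], [0.02, 0.03, 0.05])
print(result)
```

When studies disagree strongly, tau2 grows, the weights become more equal across studies, and the confidence interval widens accordingly.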
We used StatsDirect 2.6.6 (Cheshire, United Kingdom) to pool ESs and SAS 9.1 (Cary, North Carolina) to calculate ICC. We defined statistical significance by a two-sided alpha of .05.
We identified 2,527 citations using our search strategy and an additional 178 articles from scanning author files and reviewing reference lists. From these we identified 369 potentially eligible articles (Figure 1), of which 51 reported comparisons between an Internet-based intervention and other computer-assisted instruction.26–76 Eight of these articles29,31,33,34,38,39,50,64 also reported comparisons with no intervention or with a noncomputer intervention, and these appeared in our previous systematic review.4 Two of the studies we included were published online ahead of print.60,73
We contacted authors of 17 articles for additional outcomes information and received information from 8. One otherwise eligible study32 contained insufficient data to calculate an ES for any outcome, so we excluded it from our final analysis. Four studies had more than two groups and/or used a factorial design to study more than one research theme.38,51,59,74 We included these studies no more than once per meta-analysis.
Table 2 summarizes key study features. The Internet-based courses addressed a wide range of medical topics such as chest pain, geriatric psychiatry, electrocardiogram interpretation, math skills, periodontology, communication skills, and psychotherapy. A total of 8,416 learners participated, including 3,290 medical students, 1,504 postgraduate physician trainees, 865 physicians in practice, 303 nurses in training, 485 practicing nurses, 153 dental students, 151 pharmacists in training, 73 practicing pharmacists, and 1,592 other learners (other allied health or mixed groups). Of the 51 studies we included, most (38 [75%]) reported knowledge outcomes and 29 (57%) reported satisfaction, whereas only 4 (8%) reported behaviors or patient effects and 3 (6%) reported skills (see Table 2 and Supplemental Digital Content, Table 1, http://links.lww.com/ACADMED/A14).
Table 3 summarizes the methodological quality of the included studies. Of the 51 studies, 30 (59%) were randomized. Three of 38 (8%) knowledge assessments, one of three (33%) skills assessments, and two of four (50%) assessments of behavior or patient effects used self-report measures. Two studies compared course completion rates as a measure of satisfaction51,61; all other studies used self-reported satisfaction. Fifteen (52%) of the 29 studies assessing satisfaction, 12 (32%) of the 38 studies assessing knowledge, 1 (33%) of the 3 studies assessing skills, and 1 (25%) of the 4 studies assessing behaviors and patient effects lost more than 25% of participants from the time of enrollment, or they failed to report follow-up. Quality scores (the number of quality criteria present; 6 points maximum) ranged from 0 to 6, with a mean of 3.3 (SD = 1.6).
We identified 22 distinct research themes (Table 1). Figure 2 summarizes meta-analysis results for themes addressed by two or more studies, as well as for studies in which the coded level of interactivity or practice exercises varied between groups. The Supplemental Digital Content (http://links.lww.com/ACADMED/A14, Table 2) has additional analysis details including subgroup analyses and study-specific ESs. Further details of analyses are available from the authors on request. To illustrate how studies varied while still focusing on the same theme, we present in the Supplemental Digital Content (http://links.lww.com/ACADMED/A14, Box 2) an in-depth examination of the theories, conceptual frameworks, and instructional methods for several studies reflecting one theme.
We did not identify research themes for all coded elements. For example, although we found differences in the coded quantity of practice exercises, we found no studies designed to compare such differences. Similarly, we found fewer studies that hypothesized differences between levels of interactivity (n = 15) than studies in which our coding of this feature differed between arms (n = 21).
Studies investigating interactivity
Fifteen studies compared different levels of interactivity,35,39,40,43,49,50,52,56,57,59,62,68,70,73,74 defined as cognitive (mental) engagement with the course other than online discussion (which we analyzed separately, see below). Course designers enhanced engagement using a range of tasks, as outlined below (see also Table 2, Themes).
Eight of these studies explored the use of questions to enhance interactivity. Two of three randomized trials that compared the use of self-assessment questions versus no questions reported statistically significantly higher knowledge test scores for modules with questions,49,62 while the third reported no significant difference.39 Another randomized trial reported similar outcomes whether or not learners were required to actively respond to the question.40 A fifth randomized trial compared IBL modules with case-based questions and tailored feedback against an online practice guideline without questions and found significant improvement in knowledge test scores for interactive IBL.35 Adding extra questions before73 or after70 an Internet-based module that already contained several interactive questions did not significantly alter outcomes in two randomized trials. Finally, a nonrandomized study added case-based questions and learning tasks to an Internet-based course consisting of text, video clips, and interactive models and found a nonsignificant association between lower interactivity (no questions) and higher knowledge test scores, but similar satisfaction ratings.50
Two randomized trials evaluated the effect of actively summarizing information. Creating a summary of a patient scenario improved knowledge test scores compared with not creating a summary.56 However, creating a summary of a tutorial's didactic information did not improve knowledge test scores in comparison with reviewing an instructor-prepared summary, and preference was neutral.59
A small randomized trial found a modest (ES 0.57) but nonsignificant benefit of Internet-based modules with self-evaluation, animations, and video in comparison with static PDFs of the same information.57 Studying Internet-based, worked-example cases with intentional errors led to significantly improved learning outcomes (knowledge, skills, or behaviors and patient effects) compared with cases without errors in another randomized trial.74
Finally, three studies evaluated Internet-based tutorials with varying levels of interactivity, but the differences were poorly defined. One randomized trial compared a multicomponent module with tailored feedback on clinical practice performance against “flat-text Internet-based CME [continuing medical education] modules,”43 while a crossover study compared “interactive Web-based modules” with “noninteractive narrated slide presentations.”52 Both found significant improvements in learning outcomes for the interactive group. A third study with an ambiguous design found a significant association between knowledge scores and use of an interactive IBL game versus noninteractive “traditional” computer-assisted instruction.68
For the 15 studies reporting learning outcomes (knowledge, skills, or behaviors and patient effects), the pooled ES favoring interactivity was 0.27 (95% confidence interval [CI], 0.08 to 0.46; P = .006), I2 = 90%. Although statistically significantly different from 0, this is considered a small effect.77 Seven studies investigating interactivity reported satisfaction outcomes, with a pooled ES of 0.39 (95% CI, −0.12 to 0.90; P = .13), I2 = 95%. Subgroup analyses showed similar findings for high- and low-quality studies (Supplemental Digital Content, Table 2, http://links.lww.com/ACADMED/A14).
A second meta-analysis pooled data from 21 studies in which the coded level of interactivity varied between study arms.28–30,33,35–37,39,40,47,49,50,52,54–58,62,68,74 In this coding, learner tasks such as practice exercises, information syntheses, essay assignments, and group collaborative projects supported higher interactivity levels. In this analysis, the pooled ES for learning was 0.53 (95% CI, 0.33 to 0.73; P < .001), I2 = 89%. This is considered a medium-sized effect.77 Twelve of these studies evaluated satisfaction,29,34,37,39,40,47–50,53,58,62 with a pooled ES of 0.31 (95% CI, −0.13 to 0.74; P = .17), I2 = 92%.
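As a rough consistency check on how the reported intervals and P values relate, the following sketch recovers a standard error, z statistic, and two-sided P value from a pooled ES and its 95% CI. It assumes a normal approximation and a symmetric interval; the function name is hypothetical. Because the published values are rounded, the recovered P values only approximate the reported ones.

```python
from statistics import NormalDist


def p_from_ci(estimate, lower, upper, level=0.95):
    """Recover (SE, z, two-sided P) from an estimate and its CI.

    Assumption: the interval is estimate +/- z_crit * SE on a normal scale.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# Pooled learning ES for the 15 studies designed to compare interactivity.
se, z, p = p_from_ci(0.27, 0.08, 0.46)
print(round(se, 3), round(z, 2), round(p, 4))

# Pooled learning ES for the 21 studies in which coded interactivity varied.
print(p_from_ci(0.53, 0.33, 0.73)[2] < 0.001)
```

The first call gives a P value of roughly .005, close to the reported P = .006, and the second is consistent with the reported P < .001.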
Studies investigating practice exercises
Although not identified as a separate research theme, the study protocol specified coding the quantity of practice exercises, and this rating varied between study arms for 10 studies.27,29,35,39,43,49,50,55,57,62 Meta-analysis revealed a pooled ES of 0.40 (95% CI, 0.08 to 0.71; P = .01), I2 = 92%. Subgroup analyses demonstrated similar findings for high- and low-quality studies (see Supplemental Digital Content, Table 2, http://links.lww.com/ACADMED/A14). For the five studies reporting satisfaction outcomes, the pooled ES was 0.59 (95% CI, −0.10 to 1.27; P = .094), I2 = 95%. However, when including only randomized trials, this ES was statistically significantly different from 0 (P = .005; see Supplemental Digital Content, Table 2, http://links.lww.com/ACADMED/A14).
Studies investigating online discussion
Seven studies evaluated the impact of online discussion.27,29,32,37,38,48,53 Five of these used written asynchronous text-based discussion (e.g., discussion boards or e-mail). Three randomized trials compared Internet-based tutorials with and without online discussion.27,32,37 None of these three studies demonstrated a statistically significant effect on learning outcomes, although students in one study noted significantly higher satisfaction with the online discussion format.37 A fourth study added a “virtual clinic” (in which students discussed patient cases online) to an existing IBL activity. A comparison with historical controls observed no association with course ratings but a significant association with higher test scores.29 Finally, one study altered a course to promote greater online collaboration and found greater satisfaction among trainees compared with historical controls using a previous course in which online discussion was less prominent.48
Two observational studies evaluated the addition of live audio discussion to Internet-based courses without discussion. One used historical controls,53 whereas the other allowed participants to self-select the presentation format.38 Both studies found an association between live audio discussion and higher satisfaction.
For the three studies of online discussion that reported learning outcomes, the pooled ES was 0.26 (95% CI, −0.62 to 1.13; P = .57), I2 = 93%. In subgroup analyses, the pooled ES varied by study design (P for interaction < .001), with a lower pooled ES for the two randomized trials (−0.14) than for the lone nonrandomized study (1.01; see Supplemental Digital Content, Table 2, http://links.lww.com/ACADMED/A14, for details). For the five studies reporting satisfaction outcomes, the pooled ES was 0.32 (95% CI, 0.14 to 0.51; P < .001) with I2 = 4% and similar effects for quality subgroups.
Studies investigating feedback
Two randomized studies28,74 examined the effects of feedback. One study compared feedback with no feedback,28 whereas the other compared detailed feedback with providing only the correct answer.74 The results of both studies favored the more intensive feedback option, with a pooled learning ES of 0.68 (95% CI, 0.01–1.35; P = .047).
A number of other studies used feedback as part of the instructional design, most often in conjunction with self-assessment questions35,39,40,49,59,70 but also regarding clinical practice performance43 and explicit feedback from peers.48 Unfortunately, simultaneous changes in other instructional design features (confounding) precluded isolating the effect of feedback in these studies.
Studies investigating repetition and spacing
Several studies addressed the theme of spacing (spreading a fixed amount of instruction over time) and repetition (repeating the same instructional activities multiple times). For spacing, one randomized trial compared spreading 40 short IBL modules across 10 weeks using e-mail, versus having all 40 modules available in the first week.51 The small benefits (ES 0.12 for both knowledge test scores and course completion rates) were not statistically significant. Another randomized study compared a video clip divided into eight small segments against the intact clip and found statistically significant improvements in knowledge test scores (ES 0.88) for the segmented (spaced) format, but there was no difference in satisfaction or skill ratings.47 We did not pool these results because the interventions varied substantially.
Considering repetition, two randomized trials compared delivering study questions and answers via e-mail over several weeks and repeating each question once72 or twice,63 against instruction that was neither spaced nor repeated (either a Web-based module on the same topic,72 or, in the other trial,63 the same questions/answers delivered together on day 1). The pooled learning ES was 0.19 (95% CI, 0.09–0.30; P < .001).
Studies investigating strategies to promote learner participation
Four observational studies evaluated ways to enhance learner participation in Internet-based courses, including providing printed course guides and enhanced technical support to course participants,44 using a Web server instead of a more complicated learning management system,45 modifying the screen appearance to promote participation in online forums,48 and making the course a more integral part of the curriculum.61 All of these interventions were associated with improved outcomes; however, because of the heterogeneity in approaches, we did not pool these results.
Studies investigating audio
Two studies explored the use of audio in Internet-based tutorials. One randomized study compared Internet-accessible PowerPoint presentations using written text or an audio voice-over.41 Although knowledge test scores did not differ significantly between groups, the audio group reported significantly higher satisfaction. However, the audio format also took significantly longer to complete. The other study used a single-group crossover design in which half the Internet-accessible PowerPoint presentations had supplemental audio information that reinforced the text and encouraged further study. This intervention was associated with statistically significantly greater knowledge test scores and satisfaction.66 Pooled analyses of these two studies revealed ESs favoring audio of 1.26 (95% CI, −0.36 to 2.88; P = .13) for learning and 0.76 (95% CI, 0.50 to 1.02; P < .001) for satisfaction.
Another two studies explored the use of audio in Internet-based communication. These crossover studies34,64 found statistically significantly greater preference for Internet-mediated videoconferences over synchronous, online, text-based chat sessions, with a pooled ES of 1.15 (95% CI, 0.15–2.15; P = .02).
Studies investigating instructor-synthesized information
Three randomized trials compared instructor-synthesized information against existing information available on the Internet. One study found a significant benefit on knowledge test scores from an instructor-synthesized series of dermatologic images and brief text.36 Another study found mixed results, with no statistically significant differences for knowledge, skill, or behavior scores, but a large, positive effect on satisfaction for the instructor-synthesized material.46 The third study, comparing IBL modules against an online practice guideline, has already been discussed under Interactivity.35 The pooled learning ES was 1.09 (95% CI, −0.20 to 2.39; P = .10), I2 = 96%.
Studies investigating games and simulation
Three observational studies30,50,68 compared learning inductively from games or simulations versus learning from sequentially presented didactic material, with results favoring didactic,50 favoring games,68 or showing no difference.30 The pooled ES for learning outcomes was 0.07 (95% CI, −0.55 to 0.68; P = .83), I2 = 95%.
Studies investigating adaptive navigation
Two randomized trials evaluated Internet-based interventions that adapted according to learner responses. One compared a “narrative” (adaptive) virtual patient with a “problem-focused” virtual patient31 and found evidence suggesting improved communication skills for the adaptive format. The other71 tested learners' knowledge before presenting information and allowed learners to skip module sections if they answered correctly. Although knowledge test scores did not significantly differ, the adaptive format required significantly less time. Because of the heterogeneity in research themes of these studies, we did not pool these results.
Studies investigating blended learning
Two observational studies compared Internet-only versus blended Internet/face-to-face courses. In one,33 the combination of Internet-mediated and face-to-face discussion was associated with somewhat lower learning outcomes than an Internet-only approach, but multiple variables including cultural differences and language barriers between the learner groups could have contributed to this finding. In another study, students in an Internet-based course could self-select Internet-based or face-to-face discussion groups.69 Those choosing Internet discussion had higher course grades than those selecting face-to-face.
Other research themes
Eight studies26,38,42,44,45,55,61,76 compared different computer-based learning configurations, and a number of themes were addressed by one study each. These are summarized in the Supplemental Digital Content, Box 3, http://links.lww.com/ACADMED/A14.
This systematic review identified a modest number of studies investigating how to improve IBL by comparing one Internet-based intervention with another computer-based intervention. These studies collectively explored a relatively large number of research themes. However, in most cases only a small number of studies had investigated a given research theme. Moreover, the operational definitions of the interventions (and the differences between interventions) varied widely from study to study even within a given theme.
Pooled ESs for satisfaction and/or learning outcomes (knowledge, skills, or behaviors and patient effects) were positive but small77 for associations with nearly all of the themes identified. However, the pooled estimates for satisfaction differed significantly from 0 only for associations with interactivity, online discussion, and use of audio for both tutorials and online discussion, whereas estimates for learning differed significantly only for associations with interactivity, practice exercises, feedback, and repetition. Inconsistency (heterogeneity) between studies was large (≥89%) for all but online discussion and satisfaction. These inconsistencies allow us to draw only weak inferences.
Limitations and strengths
Our review has several limitations. First, only a few studies investigated any given research theme, precluding quantitative synthesis of results in many instances. Second, even within a given theme, the conceptual definitions (e.g., what constitutes “interactivity”?), study and comparison interventions, outcomes, and research methods varied. We emphasized similarities when grouping studies, but we acknowledge that important differences may explain much of the observed inconsistency among studies.78 The small number of studies precluded subgroup analyses to explore this heterogeneity. We did not use funnel plots to assess for publication bias because these are misleading in the presence of large heterogeneity.79 Third, although some studies reflected high methodological quality, the average quality was relatively poor. However, analyses restricted to more rigorous studies yielded similar findings in most instances. Fourth, many articles failed to report key details of the interventions or outcome measures. We obtained additional outcomes data from some but not all authors. Finally, space limitations do not permit a detailed review of the theories and frameworks that support each of the themes identified. However, in our Supplemental Digital Content we illustrate this foundation for one theme (Box 2, http://links.lww.com/ACADMED/A14).
Our review also has several strengths. The question of how to improve IBL is timely and of great importance to medical educators. To present a comprehensive summary of evidence, we kept our scope broad regarding learners, interventions, outcomes, and study design. The systematic literature search encompassed multiple databases supplemented by hand searches and had few exclusion criteria. We conducted all aspects of the review process in duplicate, with acceptable reproducibility.
Comparison with previous reviews
In comparison with our recent reviews identifying 130 studies comparing IBL with no intervention and 76 studies comparing IBL with non-Internet methods,4 the 51 studies identified in this review represent a relatively small body of evidence. This is consistent with a review summarizing only the type of study, which reported that only 1% of reviewed studies on computer-assisted instruction compared alternate, computer-based interventions.80
We are not aware of previous systematic reviews of studies comparing one computer-assisted instructional intervention with another in health professions education, although some previous reviews included such studies along with no-intervention and media-comparative studies.81,82 Outside of medical education, authors have provided nonsystematic summaries83,84 and broadly focused reviews,6,85,86 but again we are not aware of comprehensive and methodologically rigorous syntheses focusing on how to improve computer-assisted instruction. However, several reviews5,6,87 and other authors9–13 have issued a call for more research of this type, suggesting that the present review fills an unmet need.
The synthesized evidence suggests that interactivity, practice exercises, repetition, and feedback improve learning outcomes and that interactivity, online discussion, and audio improve satisfaction in IBL for health professionals. Although educators should consider incorporating these features when designing IBL, the strength of these recommendations is limited: We found relatively few studies; existing studies address a diversity of themes; even within themes, the interventions and outcomes vary; study findings are inconsistent; and methodological quality is relatively low. Clear guidance for practice will require additional research. Insights from outside the health professions may also be useful.84
To strengthen the evidence base, researchers must first come to agreement on what is being studied. Shared conceptual and theoretical frameworks, consistent definitions for interventions and comparison interventions, and the use of common outcome measures may help. Working from shared frameworks, interventions, and outcomes will permit replication across learner groups and different educational objectives. The present summary and synthesis of evidence, along with the research themes identified, form a foundation for such work, and several of the included studies provide exemplary models to follow.
As has been documented previously,88 few studies addressed outcomes of skills, behaviors in practice, or effects on patient care. Such outcomes would be desirable in future research.89,90 However, investigators should ensure that outcomes align with interventions,91 and they might consider demonstrating effects on applied knowledge and skills before evaluating effects on higher-order outcomes.92
Finally, although the evidence summarized here begins to inform the question, “How should we use IBL?” it largely fails to address the question, “When should we use IBL?” Authors have argued that these decisions are largely pragmatic, based on relative advantages of the Internet over other instructional delivery systems.2 Evidence to guide these decisions will derive from studies, including qualitative analyses, designed to clarify relationships between potential advantages and specific topics, course objectives, and learner characteristics.13,93
Although existing evidence does not permit strong recommendations for educational practice, this review has highlighted promising areas for future research. Evidence will derive from multiple sources, including randomized and observational quantitative studies and rigorous qualitative research. Clear conceptual frameworks, focused research questions, well-defined interventions, study methods appropriate to the question, and adherence to reporting standards when disseminating results will all help advance the science of IBL.
The authors thank Melanie Lane, Mohamed Elamin, MBBS, and M. Hassan Murad, MD, from the Knowledge and Encounter Research Unit, Mayo Clinic, Rochester, Minnesota, for their assistance with data extraction and meta-analysis planning and execution, and Kathryn Trana, from the Division of General Internal Medicine, Mayo Clinic, for assistance in article acquisition and processing.
This work was supported by intramural funds and a Mayo Foundation Education Innovation award.
Dr. Levinson is supported in part through the John R. Evans Chair in Health Sciences Education Research.
Because no human participants were involved, this study did not require ethical approval.
Select portions of these results were presented in symposia at the 2009 meetings of MedBiquitous (April 29, 2009; Baltimore, Maryland), the Association of Medical Education in Europe (August 31, 2009; Malaga, Spain), and the Association of American Medical Colleges (November 9, 2009; Boston, Massachusetts).
Dr. Cook had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
1 Ruiz JG, Mintzer MJ, Leipzig RM. The impact of E-learning in medical education. Acad Med. 2006;81:207–212.
2 Cook DA. Web-based learning: Pros, cons, and controversies. Clin Med. 2007;7:37–42.
4 Cook DA, Levinson AJ, Garside S, Dupras DM, Erwin PJ, Montori VM. Internet-based learning in the health professions: A meta-analysis. JAMA. 2008;300:1181–1196.
5 Chumley-Jones HS, Dobbie A, Alford CL. Web-based learning: Sound educational method or hype? A review of the evaluation literature. Acad Med. 2002;77(10 suppl):S86–S93.
6 Tallent-Runnels MK, Thomas JA, Lan WY, Cooper S. Teaching courses online: A review of the research. Rev Educ Res. 2006;76:93–135.
7 Cohen PA, Dacanay LD. Computer-based instruction and health professions education: A meta-analysis of outcomes. Eval Health Prof. 1992;15:259–281.
8 Cook DA. Where are we with Web-based learning in medical education? Med Teach. 2006;28:594–598.
9 Clark RE. Reconsidering research on learning from media. Rev Educ Res. 1983;53:445–459.
10 Keane DR, Norman GR, Vickers J. The inadequacy of recent research on computer-assisted instruction. Acad Med. 1991;66:444–448.
11 Friedman CP. The research we should be doing. Acad Med. 1994;69:455–457.
12 Tegtmeyer K, Ibsen L, Goldstein B. Computer-assisted learning in critical care: From ENIAC to HAL. Crit Care Med. 2001;29(8 suppl):N177–N182.
13 Cook DA. The research we still are not doing: An agenda for the study of computer-based learning. Acad Med. 2005;80:541–548.
14 Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF. Improving the quality of reports of meta-analyses of randomised controlled trials: The QUOROM statement. Quality of Reporting of Meta-analyses. Lancet. 1999;354:1896–1900.
15 Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: A proposal for reporting. Meta-analysis of Observational Studies in Epidemiology (MOOSE) Group. JAMA. 2000;283:2008–2012.
16 Kirkpatrick D. Revisiting Kirkpatrick's four-level model. Training Dev. 1996;50:54–59.
17 Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol Bull. 1979;86:420–428.
19 Morris SB, DeShon RP. Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychol Methods. 2002;7:105–125.
20 Dunlap WP, Cortina JM, Vaslow JB, Burke MJ. Meta-analysis of experiments with matched groups or repeated measures designs. Psychol Methods. 1996;1:170–177.
21 Hunter JE, Schmidt FL. Methods of Meta-analysis: Correcting Error and Bias in Research Findings. Thousand Oaks, Calif: Sage; 2004.
23 Rosenthal R. Parametric measures of effect size. In: Cooper HM, Hedges LV, eds. The Handbook of Research Synthesis. New York, NY: Russell Sage Foundation; 1994:231–244.
24 Curtin F, Altman DG, Elbourne D. Meta-analysis combining parallel and cross-over clinical trials. I: Continuous outcomes. Stat Med. 2002;21:2131–2144.
25 Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–560.
26 Kaplan IP, Patton LR, Hamilton RA. Adaptation of different computerized methods of distance learning to an external PharmD degree program. Am J Pharm Educ. 1996;60:422–425.
27 Chan DH, Leclair K, Kaczorowski J. Problem-based small-group learning via the Internet among community family physicians: A randomized controlled trial. MD Comput. 1999;16:54–58.
28 Papa FJ, Aldrich D, Schumacker RE. The effects of immediate online feedback upon diagnostic performance. Acad Med. 1999;74(10 suppl):S16–S18.
29 Schaad DC, Walker EA, Wolf FM, Brock DM, Thielke SM, Oberg L. Evaluating the serial migration of an existing required course to the World Wide Web. Acad Med. 1999;74(10 suppl):S84–S86.
30 Scherly D, Roux L, Dillenbourg P. Evaluation of hypertext in an activity learning environment. J Comput Assist Learning. 2000;16:125–136.
31 Bearman M, Cesnik B, Liddell M. Random comparison of ‘virtual patient’ models in the context of teaching clinical communication skills. Med Educ. 2001;35:824–832.
32 Fox N, O'Rourke A, Roberts C, Walker J. Change management in primary care: Design and evaluation of an Internet-delivered course. Med Educ. 2001;35:803–805.
33 Duffy T, Gilbert I, Kennedy D, Wa KP. Comparing distance education and conventional education: Observations from a comparative study of post-registration nurses. Assoc Learning Technol J. 2002;10:70–82.
34 Jedlicka JS, Brown SW, Bunch AE, Jaffe LE. A comparison of distance education instructional methods in occupational therapy. J Allied Health. 2002;31:247–251.
35 Carpenter KM, Watson JM, Raffety B, Chabal C. Teaching brief interventions for smoking cessation via an interactive computer-based tutorial. J Health Psychol. 2003;8:149–160.
36 Chao LW, Enokihara MY, Silveira PS, Gomes SR, Böhm GM. Telemedicine model for training non-medical persons in the early recognition of melanoma. J Telemed Telecare. 2003;9(suppl 1):S4–S7.
37 Frith KH, Kee CC. The effect of communication on nursing student outcomes in a Web-based course. J Nurs Educ. 2003;42:350–358.
38 Lemaire ED, Greene G. A comparison between three electronic media and in-person learning for continuing education in physical rehabilitation. J Telemed Telecare. 2003;9:17–22.
39 Maag M. The effectiveness of an interactive multimedia learning tool on nursing students' math knowledge and self-efficacy. Comput Inform Nurs. 2004;22:26–33.
40 Mattheos N, Nattestad A, Christersson C, Jansson H, Attström R. The effects of an interactive software application on the self-assessment ability of dental students. Eur J Dent Educ. 2004;8:97–104.
41 Spickard A 3rd, Smithers J, Cordray D, Gigante J, Wofford JL. A randomised trial of an online lecture with and without audio. Med Educ. 2004;38:787–790.
42 Virvou M, Alepis E. Mobile versus desktop facilities for an e-learning system: Users' perspective. In: Conference Proceedings—The Institute of Electrical and Electronics Engineers (IEEE) International Conference on Systems, Man and Cybernetics. The Hague, The Netherlands: IEEE; 2004;1:48–52.
43 Allison JJ, Kiefe CI, Wall T, et al. Multicomponent Internet continuing medical education to promote chlamydia screening. Am J Prev Med. 2005;28:285–290.
44 Becker EA, Godwin EM. Methods to improve teaching interdisciplinary teamwork through computer conferencing. J Allied Health. 2005;34:169–176.
45 Brunetaud JM, Leroy N, Pelayo S, et al. Comparative evaluation of two applications for delivering a multimedia medical course in the French-speaking Virtual Medical University (UMVF). Int J Med Inform. 2005;74:209–212.
46 Mukohara K, Schwartz MD. Electronic delivery of research summaries for academic generalist doctors: A randomised trial of an educational intervention. Med Educ. 2005;39:402–409.
47 Schittek Janda M, Tani Botticelli A, Mattheos N, et al. Computer-mediated instructional video: A randomised controlled trial comparing a sequential and a segmented instructional video in surgical hand wash. Eur J Dent Educ. 2005;9:53–58.
48 Blackmore C, Tantam D, Van Deurzen E. The role of the eTutor—Evaluating tutor input in a virtual learning community for psychotherapists and psychologists across Europe. Int J Psychother. 2006;10:35–46.
49 Cook DA, Thompson WG, Thomas KG, Thomas MR, Pankratz VS. Impact of self-assessment questions and learning styles in Web-based learning: A randomized, controlled, crossover trial. Acad Med. 2006;81:231–238.
50 Friedl R, Höppler H, Ecard K, et al. Comparative evaluation of multimedia driven, interactive, and case-based teaching in heart surgery. Ann Thorac Surg. 2006;82:1790–1795.
51 Kemper KJ, Gardiner P, Gobble J, Mitra A, Woods C. Randomized controlled trial comparing four strategies for delivering e-curriculum to health care professionals. BMC Med Educ. 2006;6:2.
52 Kerfoot BP, Conlin PR, McMahon GT. Comparison of delivery modes for online medical education. Med Educ. 2006;40:1137–1138.
53 Little BB, Passmore D, Schullo S. Using synchronous software in Web-based nursing courses. Comput Inform Nurs. 2006;24:317–325.
54 Nicholson DT, Chalk C, Funnell WRJ, Daniel SJ. Can virtual reality improve anatomy education? A randomised controlled study of a computer-generated three-dimensional anatomical ear model. Med Educ. 2006;40:1081–1087.
55 Romanov K, Nevgi A. Learning outcomes in medical informatics: Comparison of a WebCT course with ordinary Web site learning material. Int J Med Inform. 2006;75:156–162.
56 Thompson GA, Morrison RG, Holyoak KJ, Clark TK. Evaluation of an online analogical patient simulation program. In: Proceedings of the Nineteenth IEEE International Symposium on Computer-Based Medical Systems. Los Alamitos, Calif: IEEE Computer Society Press; 2006:623–628.
57 Al-Rawi WT, Jacobs R, Hassan BA, Sanderink G, Scarfe WC. Evaluation of Web-based instruction for anatomical interpretation in maxillofacial cone beam computed tomography. Dentomaxillofac Radiol. 2007;36:459–464.
58 Bednar ED, Hannum WM, Firestone A, Silveira AM, Cox TD, Proffit WR. Application of distance learning to interactive seminar instruction in orthodontic residency programs. Am J Orthod Dentofacial Orthop. 2007;132:586–594.
59 Cook DA, Gelula MH, Dupras DM, Schwartz A. Instructional methods and cognitive and learning styles in Web-based learning: Report of two randomised trials. Med Educ. 2007;41:897–905.
60 Cook DA, Thompson WG, Thomas KG, Thomas MR. Lack of interaction between sensing–intuitive learning styles and problem-first versus information-first instruction: A randomized crossover trial. Adv Health Sci Educ Theory Pract. 2009;14:79–90.
61 Hege I, Ropp V, Adler M, et al. Experiences with different integration strategies of case-based e-learning. Med Teach. 2007;29:791–797.
62 Kerfoot BP, DeWolf WC, Masser BA, Church PA, Federman DD. Spaced education improves the retention of clinical knowledge by medical students: A randomised controlled trial. Med Educ. 2007;41:23–31.
63 Kerfoot BP, Baker HE, Koch MO, Connelly D, Joseph DB, Ritchey ML. Randomized, controlled trial of spaced education to urology residents in the United States and Canada. J Urol. 2007;177:1481–1487.
64 Miller KT, Hannum WM, Morley T, Proffit WR. Use of recorded interactive seminars in orthodontic distance education. Am J Orthod Dentofacial Orthop. 2007;132:408–414.
65 Moridani M. Asynchronous video streaming vs. synchronous videoconferencing for teaching a pharmacogenetic pharmacotherapy course. Am J Pharm Educ. 2007;71:16.
66 Ridgway PF, Sheikh A, Sweeney KJ, et al. Surgical e-learning: Validation of multimedia Web-based lectures. Med Educ. 2007;41:168–172.
67 Romanov K, Nevgi A. Do medical students watch video clips in eLearning and do these facilitate learning? Med Teach. 2007;29:484–488.
68 Bergeron BP. Learning & retention in adaptive serious games. Stud Health Technol Inform. 2008;132:26–30.
69 Campbell M, Gibson W, Hall A, Richards D, Callery P. Online vs. face-to-face discussion in a Web-based research methods course for postgraduate nursing students: A quasi-experimental study. Int J Nurs Stud. 2008;45:750–759.
70 Cook DA, Beckman TJ, Thomas KG, Thompson WG. Introducing resident doctors to complexity in ambulatory medicine. Med Educ. 2008;42:838–848.
71 Cook DA, Beckman TJ, Thomas KG, Thompson WG. Adapting Web-based instruction to residents' knowledge improves learning efficiency: A randomized controlled trial. J Gen Intern Med. 2008;23:985–990.
72 Kerfoot BP. Interactive spaced education versus Web based modules for teaching urology to medical students: A randomized controlled trial. J Urol. 2008;179:2351–2356.
73 Kerfoot BP, Brotschi E. Online spaced education to teach urology to medical students: A multi-institutional randomized trial. Am J Surg. 2009;197:89–95.
74 Kopp V, Stark R, Fischer MR. Fostering diagnostic knowledge through computer-supported, case-based worked examples: Effects of erroneous examples and feedback. Med Educ. 2008;42:823–829.
75 Tunuguntla R, Rodriguez O, Ruiz JG, et al. Computer-based animations and static graphics as medical student aids in learning home safety assessment: A randomized controlled trial. Med Teach. 2008;30:815–817.
76 Walker BL, Harrington SS. Computer-based instruction and the Web for delivering injury prevention training. Educ Gerontol. 2008;34:691–708.
77 Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum; 1988.
78 Colliver JA, Kucera K, Verhulst SJ. Meta-analysis of quasi-experimental research: Are systematic narrative reviews indicated? Med Educ. 2008;42:858–865.
79 Lau J, Ioannidis JP, Terrin N, Schmid CH, Olkin I. The case of the misleading funnel plot. BMJ. 2006;333:597–600.
80 Adler MD, Johnson KB. Quantifying the literature of computer-aided instruction in medical education. Acad Med. 2000;75:1025–1028.
81 Greenhalgh T. Computer assisted learning in undergraduate medical education. BMJ. 2001;322:40–44.
82 Wutoh R, Boren SA, Balas EA. eLearning: A review of Internet-based continuing medical education. J Contin Educ Health Prof. 2004;24:20–30.
83 Dillon A, Gabbard RB. Hypermedia as an educational technology: A review of the quantitative research literature on learner comprehension, control, and style. Rev Educ Res. 1998;68:322–349.
84 Clark RC, Mayer RE. E-Learning and the Science of Instruction: Proven Guidelines for Consumers and Designers of Multimedia Learning. 2nd ed. San Francisco, Calif: Pfeiffer; 2007.
85 Kulik C-LC, Kulik JA, Shwalb BJ. The effectiveness of computer-based adult education: A meta-analysis. J Educ Comput Res. 1986;2:235–252.
86 Fletcher-Flinn CM, Gravatt B. The efficacy of computer-assisted instruction (CAI): A meta-analysis. J Educ Comput Res. 1995;12:219–242.
87 Lewis MJ, Davies R, Jenkins D, Tait MI. A review of evaluative studies of computer-based learning in nursing education. Nurse Educ Today. 2001;21:26–37.
88 Curran VR, Fleet L. A review of evaluation outcomes of Web-based continuing medical education. Med Educ. 2005;39:561–567.
89 Prystowsky JB, Bordage G. An outcomes research perspective on medical education: The predominance of trainee assessment and satisfaction. Med Educ. 2001;35:331–336.
90 Chen FM, Bauchner H, Burstin H. A call for outcomes research in medical education. Acad Med. 2004;79:955–960.
91 Cook DA, Bowen JL, Gerrity MS, et al. Proposed standards for medical education submissions to the Journal of General Internal Medicine. J Gen Intern Med. 2008;23:908–913.
92 Shea JA. Mind the gap: Some reasons why medical education research is different from health services research. Med Educ. 2001;35:319–320.
93 Cook DA. The failure of e-learning research to inform educational practice, and what we can do about it. Med Teach. 2009;31:158–162.