ADVANTAGES, DISADVANTAGES, AND BIAS
As a research design, the systematic review offers many inherent advantages. As discussed above, this type of analysis can reduce the bias and random error found in many narrative reviews and can help synthesize data from smaller studies to reach statistical significance. Furthermore, because the methods for conducting a systematic review are both transparent and reproducible, any given study can be easily updated as new data are published.13,14
Systematic reviews are highly ranked among the levels of evidence for published studies (Table 2)15,16 and are also cited with greater frequency than other types of studies. In fact, when evaluating studies of treatment effects, meta-analyses were found to be cited even more than randomized controlled trials.17
The disadvantages of systematic reviews are related to sources of bias—primarily publication bias, citation bias, and language bias. Although biases can be minimized with careful study design, there are certain types of bias that are inherently problematic when performing a systematic review. The most important bias to consider is publication bias.18 This refers to the selective publication of studies with positive or statistically significant results.19 For instance, it has been shown that studies with positive results are three to eight times more likely to be published.20 Furthermore, studies with positive results are more likely to be published in journals with higher citation indexes, and they are more likely to generate multiple publications than negative studies. The obvious concern is that well-designed studies—even randomized controlled trials—that do not have statistically significant results may not be published, and a review of the published literature will therefore overestimate the positive effect of a particular intervention.
Many solutions to publication bias have been proposed. First, a comprehensive search for unpublished studies should be included whenever a systematic review is conducted. This includes searching conference proceedings, corresponding with experts in the field, and reviewing the appropriate clinical trial registries. Both the National Institutes of Health21 and the Cochrane Collaboration22 maintain clinical trial registries. Many journals now require that clinical trials be prospectively registered before the enrollment of patients.23 It is only with prospective registration of all clinical trials that we can truly capture all existing data and completely eliminate publication bias.
Citation bias is closely related to publication bias. This phenomenon occurs when the chance of a study being cited by others is associated with its results. This has worked both ways: some studies have found that positive studies are cited more by others,24 whereas other studies have showed the opposite.25
Another important source of bias is language bias. This form of bias occurs when the language(s) of publication depend on the study's results.18 For example, it has been shown that authors were more likely to publish randomized controlled trials in an English-language journal, as opposed to a German-language journal, if the results were statistically significant.26 If a systematic review were then conducted that excluded non-English journals, the conclusion of the review would be unfairly biased toward a statistically significant result. This is a significant source of bias, as non-English publications are excluded in the majority of meta-analyses published in English-language journals.27 To eliminate language bias, the review must make significant efforts to obtain non-English language references and have them appropriately translated to include the data in the review.
Systematic reviews may be designed to include only randomized controlled trials,9 or they can be constructed to include observational studies, such as prospective cohort studies, case-control studies, cross-sectional studies, or case series.4 In fact, systematic reviews of reviews have been published.28,29 The ultimate level of evidence achieved by a systematic review depends on the quality of the studies that are included in the review (Table 2). Systematic reviews based on observational studies are subject to additional bias due to the nonrandomization of study subjects. Because of this, patient selection bias and treatment allocation bias present in the original observational studies can affect the outcome of the systematic review of those studies.
One last word of caution applies to meta-analysis. As indicated above, meta-analysis is essentially a statistical manipulation of data pooled from other studies. Data from individual studies should only be pooled for meta-analysis if the studies are both independent and similar. Increased heterogeneity between the individual studies will lead to inaccuracy of the meta-analysis. Because of the additional complexity involved in adding meta-analysis to a systematic review, it is strongly recommended that a statistician or epidemiologist be included in the planning and execution of this type of study.
CONDUCTING A SYSTEMATIC REVIEW
The proper steps for conducting a systematic review are outlined in Table 3. Although each of these will be discussed in this article, those wishing to learn more are referred to the original text by Egger and Smith, from which this list was extracted.30 It is helpful to think of a systematic review as an observational study of the available evidence. As with any research study, the protocol should be written in advance to avoid introducing bias along the way. For example, if studies are uncovered with unexpected results, one must avoid post hoc modification of the inclusion or exclusion criteria to exclude that particular study. Typically, the a priori determined review protocol should include steps 1 through 7 from Table 3.
Step 1: Formulate Review Question
A systematic review usually sets out to answer a question comparing one treatment with another (including success rates and complications) or to look at how exposure leads to disease. To define the research question one is interested in, one should ask four key questions about the proposed study:
- What is the population of interest?
- What interventions are being considered?
- What are the outcomes of interest?
- What study designs are appropriate to answer this question?
As these questions are answered, the scope of the review is defined, and the inclusion and exclusion criteria become self-evident. Let's consider the example of limb salvage for severe open tibial fractures, a topic recently addressed in a systematic review from our institution.12 For this study, the review question was: “Considering patients with severe open tibial fractures, which option (amputation versus salvage) provides the lowest complication rates, quickest return to work, and best quality of life?”
Step 2: Define Inclusion and Exclusion Criteria
As the research question is developed, the focus of the question determines the inclusion and exclusion criteria. If the question is too narrow, too few studies may be included, yielding a review with very low precision. If the question, on the other hand, is too broad, it will capture many more studies than is ideal, and one might lose the ability to detect differences in subgroup analysis.
Table 4 lists specific inclusion and exclusion criteria from the example study on limb salvage. Note that some of the criteria are very specific, helping to limit the scope of the review, so the studies included will directly help answer the review question. The decision was made to include prospective and retrospective observational studies in this review but not case series, technique, or review articles.
Step 3: Locate Studies
For any systematic review, the literature search employed must be both comprehensive and reproducible. In theory, the search should uncover all existing studies concerning the topic of interest. If studies are omitted, bias can be introduced. Some reasons for omissions include inadequate search criteria, inaccurate indexing of articles, and inaccessibility of some journals. The exact search process needs to be meticulously documented so that the search is both transparent and reproducible. This allows the search to be easily verified or updated in the future.
Traditionally, MEDLINE and EMBASE have been the primary resources for searching the medical literature. MEDLINE indexes primarily U.S. (and primarily English-language) journals; EMBASE has better European (and non–English-language) journal coverage. Overlap between these two databases is around 20 to 30 percent. Although these “traditional” databases may still be useful for identifying recently published studies, there is a more comprehensive resource for identifying published clinical trials: The Cochrane Central Register of Controlled Trials (CENTRAL).31
Formerly known as the Cochrane Controlled Trials Register (CCTR), CENTRAL now contains hundreds of thousands of records, making it the best single source of published trials available. This massive resource has been compiled by the Cochrane Collaboration (www.cochrane.org) and is continuously updated as new studies are published and as older studies are identified and indexed by their reviewers.32
In addition to the electronic databases, a thorough literature search should expand to other sources via hand searching (Table 5).33 This includes reviewing reference lists from textbooks, narrative reviews, or expert opinion papers on the topic of interest. In addition, conference proceedings and clinical trial registries can be examined to help discover unpublished data or data pending publication. This combination of endeavors should produce a comprehensive list of articles to be reviewed.
Step 4: Select Studies
At this step, the inclusion and exclusion criteria are applied to the list of located articles. This is typically done in several stages. First, a review of the article titles is performed to select all studies that may potentially meet the inclusion criteria. Along the way, duplicate articles will be identified and eliminated.
Next, the abstracts of the studies identified in the title search are obtained and reviewed. Once again, studies that meet the inclusion criteria are identified, and others are eliminated. Studies whose abstracts meet the inclusion criteria are retrieved, and the full text is analyzed. At this final stage, articles are eliminated based on exclusion criteria.
In theory, all of these stages should be performed independently by at least two qualified expert reviewers. Interrater agreement can be measured statistically by Cohen's kappa coefficient. Disagreements between reviewers can be settled by consensus agreement to help maximize reliability and reproducibility. This approach minimizes biases and omissions.2
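Cohen's kappa corrects the raw percentage agreement for the agreement expected by chance alone. The sketch below is purely illustrative: the two reviewers' include/exclude decisions are invented and do not come from any actual review.

```python
# Illustrative sketch only: Cohen's kappa for two reviewers' include/exclude
# decisions. The decision lists below are invented, not data from any review.

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed agreement: proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal frequencies.
    p_expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)

reviewer_1 = ["include", "include", "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "include", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)  # about 0.67 for these lists
```

By the commonly used Landis and Koch benchmarks, values of 0.41 to 0.60 indicate moderate agreement and values above 0.60 substantial agreement.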
Authors should track and report the number of articles at each step, and these data should be portrayed in a figure such as the one in our example (Fig. 2). At the final stage (full text review), it is helpful to report the number of articles eliminated by each of the exclusion criteria, to make this process transparent to the reader.
Step 5: Assess Study Quality
It is important to assess the quality of the individual studies selected, as flaws in those studies could distort the results of the meta-analysis or systematic review. There is, however, considerable debate over how this should be done.34 Studies may be given a summary score based on overall quality, and these scores can be used to help weight the studies in terms of clinical significance. Many of these so-called “composite scales” have been developed, reflecting the controversy in this area.35 Most scoring systems that have been developed are for randomized controlled trials; no scoring systems have been developed for retrospective studies. Use of a single composite score to represent study quality can be problematic, however, as selection of one scale over another has been shown to change the results of meta-analysis.36
Recently, a group of international guideline developers has put forth a new scoring system, the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) system.37 This system offers several potential advantages over other systems (Table 6). In the GRADE system, quality of evidence (high, moderate, low, and very low) is reported separately from the grades of recommendations (strong or weak).
Due to the perceived flaws in many of the composite scoring systems, analysis of individual components of study quality has been advised by some experts in the field.38 This method avoids some of the pitfalls of composite scales but can be more labor intensive.
Regardless of whether a composite scale or component analysis is used, the assessment of study quality is typically used to weight individual studies during pooling of the data. This should be done with caution, as there are several potential problems with this approach. For instance, although their influence may be reduced by giving them less weight, the results of poor studies may still be included in the analysis and influence results. The ideal approach may be to use sensitivity analysis to determine the effect of the individual components of study quality on the overall result of the systematic review.38
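One simple form of such a sensitivity analysis is a leave-one-out check: re-pool the data with each study removed in turn and observe how much the summary estimate moves. The sketch below assumes simple fixed-effect (inverse-variance) pooling, and all effect sizes and standard errors are invented for illustration.

```python
# Sketch of a leave-one-out sensitivity analysis, assuming simple fixed-effect
# (inverse-variance) pooling. All effect sizes and standard errors are invented.

def pooled_effect(effects, std_errors):
    """Inverse-variance weighted mean of the study effects."""
    weights = [1 / se ** 2 for se in std_errors]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Four hypothetical studies (e.g., log odds ratios) and their standard errors.
effects = [-0.4, -0.2, -0.3, 0.1]
ses = [0.15, 0.20, 0.25, 0.30]

overall = pooled_effect(effects, ses)
# Re-pool with each study omitted in turn; a large shift marks an
# influential study whose quality deserves extra scrutiny.
leave_one_out = [
    pooled_effect(effects[:i] + effects[i + 1:], ses[:i] + ses[i + 1:])
    for i in range(len(effects))
]
```

If dropping a single study changes the pooled estimate substantially, the review should report that study's influence rather than present the summary result as robust.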
Step 6: Extract Data
This step requires time, concentration, and meticulous accuracy. The list of data to be extracted should be agreed upon during the design phase of the study. The primary focus of the data collected should be that which is required to prove or disprove the a priori hypothesis. Of course, additional data may be extracted, and additional associations may be explored after testing the a priori hypothesis. One must keep in mind, however, that positive associations can present themselves randomly; therefore, while searching for large numbers of associations, one might find something significant by chance alone.
The best way to be consistent in this process is to use a carefully designed form to enter the data as they are extracted from each study. Ideally, this process should be blinded as to the authors and the sources of each article to avoid bias. Data that are typically extracted include study characteristics, sample demographics, and outcomes data. The data extracted from each study in our limb salvage example are presented in Table 7.12
If practical, this process should be repeated with two or more researchers so that there is consensus on the extracted data. Once again, interrater reliability can be assessed with Cohen's kappa coefficient. An alternative to this would be to conduct random audits of the process. For example, a random sample of the studies could be reexamined by an independent investigator to confirm accurate data extraction.
Step 7: Analyze and Present Results
At this point, the data can be pooled if appropriate and presented as a summary outcome or effect, and meta-analysis can be performed if indicated. Conceptually, meta-analysis combines the results of similar studies of a particular intervention, taking into account measures of variability within and between the studies, to improve the validity of the conclusion. If the studies being reviewed have a high degree of heterogeneity, the author may not be able to combine the data; in that case, a meta-analysis should not be performed and a narrative summary should be given instead.
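As a rough illustration of the mechanics, the sketch below pools invented effect sizes (say, log odds ratios) by inverse-variance weighting and computes Cochran's Q and the I-squared statistic as a heterogeneity check. All numbers are hypothetical, and real meta-analysis software additionally handles random-effects models, confidence intervals, and much more.

```python
import math

# Rough sketch of fixed-effect (inverse-variance) pooling with Cochran's Q
# and I-squared as a heterogeneity check. All study data below are invented.

def pool_fixed_effect(effects, std_errors):
    """Pooled effect, its standard error, and the I-squared statistic."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    # Cochran's Q: weighted squared deviations of each study from the pool.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: percentage of total variability attributable to heterogeneity
    # rather than chance; high values argue against pooling at all.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, i_squared

# Three hypothetical studies reporting similar effects (low heterogeneity).
pooled, se, i2 = pool_fixed_effect([-0.4, -0.2, -0.3], [0.15, 0.20, 0.25])
```

Larger studies (smaller standard errors) receive proportionally more weight, which is precisely why quality weighting and sensitivity analysis, discussed above, matter.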
In our limb salvage example, the outcomes data that were extracted were not reported consistently using standardized measures, so meta-analysis was not possible. Data were, however, able to be extracted and summarized to effectively compare the two outcomes of interest: limb salvage versus amputation (Table 8).12
Presentation of the data should include a tabulation of the results from the individual studies. Ideally, a forest plot should be used to compare the results of the individual studies (Fig. 3). This type of graph shows not only the data extracted from individual studies, but also a representation of the statistical weight of each study, with regard to confidence intervals and standard error of the mean.
Formal guidelines exist regarding the reporting of systematic reviews. Each of these guidelines is in the form of a checklist: a list of items that should be included whenever reporting a systematic review. The MOOSE report gives a detailed checklist (Table 9) for reporting meta-analyses of observational studies.4 The PRISMA checklist3 (Table 10) pertains to meta-analyses of randomized controlled trials; this set of guidelines replaces the older Quality of Reporting of Meta-analyses (QUOROM) guidelines.9 An exhaustive explanation and elaboration of the PRISMA guidelines has been published for the interested reader.40
Step 8: Interpret Results
Finally, a systematic review should interpret the results that have been presented. This should include a discussion of the limitations of the study, including potential biases in the original articles, as well as biases that may have influenced the review itself (Table 4). Also, the strength of the evidence should be reviewed; the strength and applicability of the review findings will depend on the data upon which they are based. Finally, directions for future research will be evident, especially if the review has uncovered significant heterogeneity between the studies on the topic of interest.
USING SYSTEMATIC REVIEWS IN CLINICAL PRACTICE
Using high-level evidence from systematic reviews to treat individual patients at the bedside represents the pinnacle of evidence-based medicine. Several key questions should be considered before applying the results of systematic reviews to individual patients.41
- Does this evidence apply to this particular patient? One should consider the disease pathogenesis and patient-specific factors as well as any differences in environmental factors when answering this question. Your patient may not have to meet all the inclusion and exclusion criteria for a particular study's results to be applicable. Remember that differences between your own patients and those in trials tend to be quantitative (e.g., matters of degree in risk and responsiveness) rather than qualitative (no response or adverse response).42
- Is this intervention feasible for this particular patient? Regional differences in the availability and affordability of a given intervention will influence its use in individual patients. One should also consider whether the required experience or expertise is available in the region to effectively administer the intervention and/or monitor its results over time.
- What is the risk:benefit ratio for this particular patient? If the results of the systematic review seem both applicable and feasible, one must still consider if the benefits outweigh the risks for this individual patient. The overall estimate of clinical effect must be derived from the article, and that result extrapolated to the individual patient. A statistical analysis could take the form of a “number needed to treat” calculation, the scope of which is beyond this article.41
- What are the values and preferences of this particular patient? Finally, we should not base treatment decisions on laboratory results or radiographs, but on the patient as an individual. Patients have innate values and preferences that should be considered in making all treatment decisions.
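For the risk:benefit question above, the "number needed to treat" is simply the reciprocal of the absolute risk reduction. The event rates below are invented for illustration and do not come from any study cited in the text.

```python
# Hypothetical "number needed to treat" calculation; the event rates are
# invented and do not come from any study cited in the text.

control_event_rate = 0.20   # complication rate without the intervention
treated_event_rate = 0.12   # complication rate with the intervention

# Absolute risk reduction (ARR): the difference in event rates.
arr = control_event_rate - treated_event_rate          # 0.08

# NNT: how many patients must be treated for one additional patient
# to avoid the complication; conventionally rounded up.
nnt = 1 / arr                                          # about 12.5 -> 13
```

Here, roughly 13 patients would need to receive the intervention for one additional patient to benefit, a figure the physician can weigh directly against the intervention's risks for the individual patient.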
Systematic reviews can help us synthesize information, improve the significance of data from smaller studies, and identify areas where further research is required. They are an important tool to help us practice evidence-based medicine in this modern age. This article provides a step-by-step guide to conducting a proper systematic review, including new guidelines not covered in previous reports on this topic.2,43 Results, however, may be biased despite careful methodology, and application of findings to individual patients will always require thoughtful analysis by the physician providing care.
1. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: What it is and what it isn't. BMJ.
2. Margaliot Z, Chung KC. Systematic reviews: A primer for plastic surgery research. Plast Reconstr Surg.
3. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. J Clin Epidemiol.
4. Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: A proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA.
5. Mulrow CD. The medical review article: State of the science. Ann Intern Med.
6. Chalmers I, Altman D, eds. Systematic Reviews. London: BMJ Publishing Group; 1995.
7. Green S, Higgins JPT, Alderson P, Clarke M, Mulrow CD, Oxman AD. What is a systematic review? In: Higgins JPT, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions, Version 5.0.0. The Cochrane Collaboration; updated February 2008. Available at: http://www.cochrane-handbook.org. Accessed May 6, 2009.
8. Oxman AD, Guyatt GH. Guidelines for reading literature reviews. CMAJ.
9. Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF. Improving the quality of reports of meta-analyses of randomised controlled trials: The QUOROM statement. Lancet.
10. Glass GV. Primary, secondary and meta-analysis of research. Educ Res.
11. Egger M, Smith GD, O'Rourke K. Rationale, potentials, and promise of systematic reviews. In: Egger M, Smith GD, Altman DG, eds. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London: BMJ Publishing Group; 2001:3–19.
12. Saddawi-Konefka D, Kim HM, Chung KC. A systematic review of outcomes and complications of reconstruction and amputation for type IIIB and IIIC fractures of the tibia. Plast Reconstr Surg.
13. Shea B, Boers M, Grimshaw JM, et al. Does updating improve the methodological and reporting quality of systematic reviews? BMC Med Res Methodol.
14. Sutton AJ, Donegan S, Takwoingi Y, Garner P, Gamble C, Donald A. An encouraging assessment of methods to inform priorities for updating systematic reviews. J Clin Epidemiol.
16. Harbour R, Miller J. A new system for grading recommendations in evidence based guidelines. BMJ.
17. Patsopoulos NA, Analatos AA, Ioannidis JP. Relative citation impact of various study designs in the health sciences. JAMA.
18. Song F, Eastwood AJ, Gilbody S, Duley L, Sutton AJ. Publication and related biases. Health Technol Assess.
19. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet.
20. Dickersin K, Min YI. Publication bias: The problem that won't go away. Ann N Y Acad Sci.
21. ClinicalTrials.gov: A service of the U.S. National Institutes of Health. Available at: http://clinicaltrials.gov. Accessed October 6, 2009.
23. Zarin DA, Ide NC, Tse T, Harlan WR, West JC, Lindberg DA. Issues in the registration of clinical trials. JAMA.
24. Ravnskov U. Cholesterol lowering trials in coronary heart disease: Frequency of citation and outcome. BMJ.
25. Christensen-Szalanski JJJ, Beach LR. The citation bias: Fad and fashion in the judgment and decision literature. Am Psychol.
26. Egger M, Zellweger-Zahner T, Schneider M, Junker C, Lengeler C, Antes G. Language bias in randomised controlled trials published in English and German. Lancet.
27. Egger M, Smith GD. Bias in location and selection of studies. BMJ.
28. Roundtree AK, Kallen MA, Lopez-Olivo MA, et al. Poor reporting of search strategy and conflict of interest in over 250 narrative and systematic reviews of two biologic agents in arthritis: A systematic review. J Clin Epidemiol.
29. Moseley AM, Elkins MR, Herbert RD, Maher CG, Sherrington C. Cochrane reviews used more rigorous methods than non-Cochrane reviews: Survey of systematic reviews in physiotherapy. J Clin Epidemiol.
30. Egger M, Smith GD. Principles of and procedures for systematic reviews. In: Egger M, Smith GD, Altman DG, eds. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London: BMJ Publishing Group; 2001:23–42.
31. Royle P, Milne R. Literature searching for randomized controlled trials used in Cochrane reviews: Rapid versus exhaustive searches. Int J Technol Assess Health Care.
32. LeFebvre C, Clarke MJ. Identifying randomized trials. In: Egger M, Smith GD, Altman DG, eds. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London: BMJ Publishing Group; 2001:69–86.
33. Hopewell S, Clarke M, Lefebvre C, Scherer R. Handsearching versus electronic searching to identify reports of randomized trials. Cochrane Database Syst Rev.
34. Moher D, Jadad AR, Tugwell P. Assessing the quality of randomized controlled trials: Current issues and future directions. Int J Technol Assess Health Care.
35. Moher D, Jadad AR, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomized controlled trials: An annotated bibliography of scales and checklists. Control Clin Trials.
36. Jüni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA.
37. Guyatt GH, Oxman AD, Vist GE, et al. GRADE: An emerging consensus on rating quality of evidence and strength of recommendations. BMJ.
38. Jüni P, Altman DG, Egger M. Assessing the quality of randomized controlled trials. In: Egger M, Smith GD, Altman DG, eds. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London: BMJ Publishing Group; 2001:87–108.
39. Margaliot Z, Haase SC, Kotsis SV, Kim HM, Chung KC. A meta-analysis of outcomes of external fixation versus plate osteosynthesis for unstable distal radius fractures. J Hand Surg Am.
40. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: Explanation and elaboration. Ann Intern Med.
41. McAlister FA. Applying the results of systematic reviews at the bedside. In: Egger M, Smith GD, Altman DG, eds. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London: BMJ Publishing Group; 2001:373–385.
42. Glasziou P, Guyatt GH, Dans AL, Dans LF, Straus S, Sackett DL. Applying the results of trials and systematic reviews to individual patients. ACP J Club.
©2011 American Society of Plastic Surgeons
43. Haines T, McKnight L, Duku E, Perry L, Thoma A. The role of systematic reviews in clinical research and practice. Clin Plast Surg.