Systematic reviews are a method of synthesizing the existing evidence on a given topic. A systematic review extends beyond the subjective, narrative reporting characteristic of a traditional literature review by employing procedures to rigorously extract data from studies that have been included following assessment of their quality, and to synthesize, or combine, those data when appropriate.1 The systematic review of the literature on a particular condition, intervention or issue is seen as core to defining reliable evidence for practice. Systematic reviews aim to provide a comprehensive and unbiased synthesis of the evidence on a single topic by bringing together multiple individual studies in a single document.
When establishing the association or relationship between an exposure and a health outcome, association studies are required. These studies are sometimes also referred to as correlational studies, as they aim to summarize associations between variables but are unable to make direct inferences about cause and effect, because too many unknown variables could potentially influence the data. Despite this limitation, which must be acknowledged, studies addressing associations are conducted specifically to identify factors related to the outcome under investigation. The best available evidence must be evaluated to determine whether there is a valid association between an exposure and an outcome. The evidence then needs to be reviewed to determine whether confounding explains the association. Once a valid association is observed, the question arises of whether it is causal; it should be noted that not all associations are causal.
The systematic review of etiological studies is important in informing healthcare planning and resource allocation. For the purposes of this article, synthesis of evidence related to association data refers to the synthesis of studies that address etiology.
To address issues regarding etiology, epidemiological or observational studies are required. Observational studies can be used to infer correlations between two variables, for example, between a variable and a disease outcome. These designs address questions such as: What is the cause of the disease? What factors are associated with the disease? This kind of information is particularly valuable for governments when making decisions regarding health policy. Data from observational studies can therefore be useful in enabling the formation of hypotheses regarding risk or preventive factors in disease development and progression.
Questions of etiology and risk factors commonly arise in relation to public health, leading to an increase in the publication of systematic reviews. Currently, there is a lack of formal guidance to inform the conduct of systematic reviews of these types of data. In 2014, a working group was formed within the Joanna Briggs Institute (JBI) to evaluate systematic reviews of association data. The aim of this article is to describe the development of the JBI approach and guidance for conducting systematic reviews of association that address etiological issues.
The methodology working group was formed to investigate and develop methodology for the conduct of systematic reviews of association data. The working group comprised researchers from the JBI in Adelaide, Australia, and members of the Joanna Briggs Collaboration (JBC) from the USA, Canada, Australia, Romania and Taiwan, all experienced in a variety of systematic review methodologies. The working group met monthly to discuss, define and develop methods for reviews of these types of data. In November 2014, the methodology was presented in a workshop during the JBI Colloquium in Singapore, providing international colleagues a forum and an opportunity to critique and, in turn, provide some feedback. At every stage of developing the methodology, feedback was sought, considered and incorporated following discussion and consensus agreement between members of the working group. There was some further internal (within JBI) and external feedback following the development of guidance, which helped in focussing the guidance for systematic reviews addressing etiological issues.
Etiology systematic review methodology
The process of conducting a systematic review is a scientific exercise and, as the results will influence healthcare decisions, it must have the same rigour expected of all research. The quality of a review, and of its recommendations, depends on the extent to which scientific review methods are followed to minimize the risk of error and bias. The explicit and rigorous methods of the process distinguish systematic reviews from traditional reviews of the literature. Even when research evidence is limited or non-existent, systematic reviews summarize the best available evidence on a specific topic, informing clinical decision-makers and policymakers and identifying future research needs.
The systematic review of studies to answer questions of etiology still follows the same basic principles of systematic review of other types of data. An a priori protocol must inform the conduct of the systematic review, comprehensive searching must be performed and critical appraisal of retrieved studies must be carried out followed by data synthesis.
Forming a review question/objective
The overarching objectives of reviews of etiology are to determine whether and to what degree a relationship exists between two or more quantifiable variables. Accordingly, the review question should outline the exposure, disease, symptom or health condition of interest, the population or groups at risk and the context/location (which may include any contextual factors such as geographical, temporal or cultural elements relevant to the topic), or the time period (e.g. peaks at a particular season) and the length of time (e.g. the duration of a pregnancy or infancy) when relevant.
An example of an objective for a systematic review of association data is as follows: The objective of this review is to assess the association between consumption of alcohol and lung cancer. Here, the ‘assessment’ of the association will include the measures of association indicating the strength or magnitude and direction of the association. The two may be positively associated or the relationship may be negative, for example, as one increases, the other decreases.
Another example is as follows: Are children exposed to tobacco smoke (maternal smoking) during pregnancy at risk for obesity in childhood? In this example, the duration of exposure is also included in the question because the objective is to study the long-term effects of tobacco smoke exposure in utero on the developing foetus.
Review question examples are as follows:
- Does the evidence support a likely causal effect of consumption of alcohol on lung cancer?
- What is the effect of second-hand cigarette smoke exposure versus no second-hand cigarette smoke on the exacerbation of dyspnoea in middle-aged to older adults with chronic obstructive pulmonary disease?
Providing a standardized framework for the inclusion criteria helps to ensure that the plan for the review is methodologically sound and transparent to all members of the review team. It provides a map of where the review is going, starting from the title and the review question. By identifying the inclusion criteria, it tells the review team what to look for along the way. The inclusion criteria are also critical when formulating a comprehensive search strategy that will seek studies of relevance to the review.
The traditional PICO (population, intervention, comparator and outcomes) format for systematic reviews of effects does not align with questions relating to etiology. A systematic review of etiology should include the following aspects:
- Population (types of participants)
- Exposure of interest (independent variable)
- Outcome or response (dependent variable)
Population (types of participants)
The types of participants should be appropriate for the review objectives. The reasons for the inclusion or exclusion of participants should be explained in the background. The inclusion and exclusion criteria need to reflect sound clinical and scientific reasoning and the need for an adequate degree of homogeneity amongst the samples in the studies.
Exposure of interest (independent variable)
This refers to the many factors associated with disease/condition of interest in a population, group or cohort who have been exposed to them. Like the parameters which define population, the inclusion criteria related to variables will determine scope and some degree of homogeneity in the studies. For example, when assessing risk, the review question will frequently refer to the exposure rather than an intervention.
Outcome or response (dependent variable)
The outcome or response results from changes to an independent variable. The review protocol must specify the important outcomes of interest that are relevant to the health issue and important to key stakeholders such as knowledge users, consumers, policymakers and payers. Generally, reviewers should avoid surrogate outcomes as they may mislead. However, direct measurement is sometimes not possible; in that case, surrogate outcomes call for additional caution in interpretation. Reviewers should consider when and how the outcome may be measured and, in addition, determine whether the review should examine secondary or mediating outcomes.
Types of studies
Reviews of association (etiology) are predominantly derived from observational studies. These include retrospective, prospective, cross-sectional, longitudinal, case–control and cohort studies. Randomized controlled trials (RCTs) may also report on the risk associated with an intervention and can be included. However, the decision about which types of studies to include depends on the outcome measurement (correlation) or the type of analysis (multivariate analysis to address confounders). Epidemiological observational studies of etiology treat individual characteristics, personal behaviours, environmental conditions and treatments as ‘exposures’ that may modify the risk of disease. Prospective cohort studies usually provide stronger evidence than case–control studies when addressing etiological questions or issues.
Critical appraisal is a process conducted in systematic reviews to establish the internal validity and risk of bias of studies that meet the review inclusion criteria. The JBI has a number of tools already developed for assessing the quality of various quantitative study designs,2 and these tools were deemed appropriate to use when assessing questions of association and correlation with some modifications. Therefore, the existing JBI study design specific tools were re-developed and are in the process of being submitted to the JBI Scientific Committee (a panel of expert clinicians and researchers from various JBI international collaborating centres) for approval. Another critical appraisal tool, known as QUIPS (Quality In Prognosis Studies), may be used to assess the risk of bias in studies of prognostic factors, when conducting a systematic review of association that addresses prognostic issues.
The issue of including poor-quality studies versus excluding these studies
The authors of the review have to state a priori in the protocol the criteria used to determine the inclusion or exclusion of poor-quality studies. The authors have to make explicit and agree on criteria to determine whether a study is of good, moderate or poor quality and, on the basis of these criteria or a combination of criteria, the authors can decide whether to include only good-quality studies or all studies irrespective of the quality. However, the importance of these criteria (e.g. selection, measurement bias, confounding) will vary with study type and problems specific to the review question.
Data synthesis and meta-analysis
As with all systematic reviews, there are various approaches to presenting the results, including a narrative, graphical or tabular summary, or meta-analysis.1 When meta-analysis is not possible, a set of alternative methods for synthesizing research is available. On the basis of the research question and objectives, narrative, tabular and/or visual approaches can be used for data synthesis.
A meta-analysis is a statistical procedure which combines the findings from multiple primary studies into a single overall summary estimate. A meta-analysis can be conducted to improve statistical power to detect a treatment effect, to estimate a summary average effect, to identify sub-groups associated with a beneficial effect and to explore differences in the size or direction of the treatment effect associated with study-specific variables.3 Meta-analysis of association studies addressing etiological issues may rarely be possible because of the differences in the factors controlled for in multivariable analyses, and also because of poor reporting in the original studies with lack of adequate details.
Meta-analysis is only appropriate when studies are sufficiently homogeneous from a clinical and methodological point of view. If studies are heterogeneous from a clinical (i.e. population, outcome) or methodological (i.e. study design) point of view, then it is uncertain whether it is appropriate to combine them in a meta-analysis. It is suggested that the decision to conduct meta-analysis should not be based on statistical considerations regarding heterogeneity alone, but also on the review question, the characteristics of the studies and the interpretability of the results.4 When used in relation to meta-analysis, the term ‘heterogeneity’ refers to the amount of variation in characteristics of included studies.
While some variation between studies will always occur due to chance alone, heterogeneity is said to occur if there are significant differences between studies, and under these circumstances, meta-analysis is not valid and should not be undertaken. Visual inspection of the meta-analysis output, for example, the forest plot, is the first stage of assessing heterogeneity, followed by tests such as Cochran's Q or the I² statistic.2 If there is statistically significant heterogeneity, a narrative synthesis or graphical representation is recommended.
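As an illustration of the statistics named above, the following minimal Python sketch computes Cochran's Q and I² from study-level effect estimates (for example, log relative risks) and their standard errors. The function name `cochran_q_i2` and the input values are hypothetical and are not part of the JBI guidance.

```python
import math


def cochran_q_i2(effects, std_errors):
    """Return (Q, degrees of freedom, I^2 in percent) for a set of studies.

    effects: study effect estimates on an additive scale (e.g. log RR).
    std_errors: the corresponding standard errors.
    """
    # Inverse-variance weights
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of total variation beyond chance, truncated at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


if __name__ == "__main__":
    # Hypothetical relative risks and standard errors of their logarithms
    log_rr = [math.log(rr) for rr in (1.2, 1.5, 0.9, 2.0)]
    se = [0.10, 0.15, 0.20, 0.25]
    q, df, i2 = cochran_q_i2(log_rr, se)
    print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.1f}%")
```

Q is usually compared with a chi-squared distribution on the stated degrees of freedom; that step is omitted here to keep the sketch free of external libraries.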
Special considerations relating to questions of risk
When data are combined directly from adjusted relative risk (RR) estimates as described here, the most common statistical approaches to meta-analysis are: in a fixed-effects model, the inverse variance method for log RR and the Mantel–Haenszel method; and in a random-effects model, the DerSimonian and Laird Method.
These statistical methods are commonly associated with the random-effects model of meta-analysis. In the random-effects model, the underlying assumption is that variability in the data arises from variability within study samples (i.e. between the patients) and from the differences between the studies also. In a random-effects model, results apply beyond the included studies. This is the primary distinction between the random and fixed-effects models of meta-analysis, the latter of which, commonly used when applied to meta-analysis of RCTs, considers only within-study variation rather than the between-study variation. As the random-effects model effectively incorporates the method for estimating unexplained variation in the analysis, this model is most frequently applied to ‘compensate’ for the heterogeneity apparent in observational studies. When there is no heterogeneity present, the results of fixed and random-effects models will be similar.
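The difference between the two models can be sketched in a few lines of Python. The example below pools log relative risks with the inverse variance method (fixed effect) and with the DerSimonian and Laird estimator (random effects). The function names and the study values are hypothetical; the Mantel–Haenszel method is not shown because it requires the underlying 2 × 2 counts rather than adjusted estimates.

```python
import math


def pool_fixed(log_rr, se):
    """Inverse-variance fixed-effect pooling; returns (estimate, standard error)."""
    w = [1.0 / s ** 2 for s in se]
    est = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
    return est, math.sqrt(1.0 / sum(w))


def pool_dersimonian_laird(log_rr, se):
    """DerSimonian-Laird random-effects pooling; returns (estimate, SE, tau^2)."""
    w = [1.0 / s ** 2 for s in se]
    fixed, _ = pool_fixed(log_rr, se)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rr))
    df = len(log_rr) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Method-of-moments estimate of between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Between-study variance is added to each within-study variance
    w_star = [1.0 / (s ** 2 + tau2) for s in se]
    est = sum(wi * y for wi, y in zip(w_star, log_rr)) / sum(w_star)
    return est, math.sqrt(1.0 / sum(w_star)), tau2


if __name__ == "__main__":
    # Hypothetical adjusted relative risks and standard errors of log RR
    log_rr = [math.log(rr) for rr in (1.2, 1.5, 0.9, 2.0)]
    se = [0.10, 0.15, 0.20, 0.25]
    for label, result in (("fixed", pool_fixed(log_rr, se)),
                          ("random", pool_dersimonian_laird(log_rr, se))):
        est, s = result[0], result[1]
        lo, hi = math.exp(est - 1.96 * s), math.exp(est + 1.96 * s)
        print(f"{label}: RR = {math.exp(est):.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and the two pooled estimates coincide.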
Frequently, published primary studies investigating the risk of an exposure are designed to present the available data at different levels of the exposure, or in different categories, to reflect a ‘dose–response’ relationship between the exposure and the outcome variable. Difficulties will naturally arise if different studies have used different exposure categories and have presented these data in a variety of different ways.
Causality cannot be established merely by reporting an association between an exposure or a factor of interest and an outcome or symptom. In 1965, the epidemiologist Bradford Hill and others proposed certain aspects of evidence, later used as criteria, that should be considered when trying to draw conclusions about causality and that may help to prevent misleading causal inferences.5 A protocol may be developed that lays the foundations for a systematic review that will apply the Bradford Hill criteria for causality. These criteria, based on his landmark epidemiological studies that were important in establishing the strong association between smoking and lung cancer, include the following: strength of the association, consistency, specificity, temporality, biological gradient, plausibility/coherence, experiment and analogy.5
Special considerations for correlation data
Studies of correlation assess the linear relationship between two continuous variables using a coefficient of correlation, which is a unit-free dimensionless quantity ranging between −1 and +1. A 0 would indicate no linear association, whereas the positive and negative values indicate a direct or an inverse relationship, respectively. The Hedges–Olkin method is used for calculating the weighted summary correlation coefficient under the fixed-effects model, using a Fisher's Z transformation of the correlation coefficients.6 Following this, the heterogeneity statistic is incorporated to calculate the summary correlation coefficient under the random-effects model.
When correlation coefficients are used as the effect-size measure, Hedges and Olkin,6 and Rosenthal and Rubin7 both advocate converting these effect sizes into a standard normal metric (using Fisher's r-to-Z transformation), and then calculating a weighted average of these transformed scores. There is an equation for Fisher's r-to-Z transformation (and the conversion back to r).6 The first step is to use this equation to convert each correlation coefficient into its corresponding Z value. The transformed effect sizes are then used to calculate an average in which each effect size is weighted.8
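For reference, the transformation and the fixed-effects weighted average described above are commonly written as follows. The symbols r_i (correlation in study i), n_i (sample size of study i), z_i and w_i are notation assumed here for illustration; the source text does not give the equations explicitly.

```latex
% Fisher's r-to-Z transformation and its approximate variance
z_i = \tfrac{1}{2}\ln\!\left(\frac{1 + r_i}{1 - r_i}\right),
\qquad
\operatorname{Var}(z_i) \approx \frac{1}{n_i - 3}

% Weighted average of the transformed values (inverse-variance weights)
\bar{z} = \frac{\sum_i w_i z_i}{\sum_i w_i},
\qquad
w_i = n_i - 3

% Conversion of the summary value back to a correlation coefficient
\bar{r} = \frac{e^{2\bar{z}} - 1}{e^{2\bar{z}} + 1}
```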
In the random-effects model, weights are calculated using a variance component that incorporates between-study variance in addition to the within-study variance used in the fixed-effect model. This between-study variance is denoted by τ² and is simply added to the within-study variance. The DerSimonian and Laird method is the preferred method under the random-effects model to calculate the summary correlation coefficient effect size.
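The following Python sketch illustrates one way to pool correlation coefficients under both models. It assumes the usual approximation that the variance of a Fisher-transformed correlation is 1/(n − 3) and uses the DerSimonian and Laird estimator for the between-study variance (tau-squared). The function name `pool_correlations` and the example data are hypothetical.

```python
import math


def pool_correlations(r_values, sample_sizes):
    """Pool correlations via Fisher's Z under fixed- and random-effects models.

    Each sample size must exceed 3 so that the variance 1 / (n - 3) is defined.
    """
    z = [math.atanh(r) for r in r_values]          # Fisher's r-to-Z
    v = [1.0 / (n - 3) for n in sample_sizes]      # within-study variances
    w = [1.0 / vi for vi in v]                     # fixed-effect weights

    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)

    # DerSimonian-Laird estimate of between-study variance (tau squared)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))
    df = len(z) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each within-study variance
    w_re = [1.0 / (vi + tau2) for vi in v]
    z_random = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)

    # Convert the summary Z values back to correlation coefficients
    return {"fixed": math.tanh(z_fixed),
            "random": math.tanh(z_random),
            "tau2": tau2}


if __name__ == "__main__":
    # Hypothetical correlations and sample sizes from four studies
    summary = pool_correlations([0.30, 0.45, 0.10, 0.55], [40, 120, 60, 85])
    print(summary)
```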
Sub-group analysis (analysis of sub-groups or subsets)
Sub-group analysis is a means of investigating results and can be used to estimate the influence of various subsets, including age group, sex, types of population, outcome measurement and the sampling strategy used to gather data (e.g. letter, phone, face-to-face). Sub-group analysis can also be used to explore heterogeneity. However, sub-groups should be specified a priori and should be few in number.
Sub-group analysis could include the following:
- Subsets of studies
- Subsets of patient groups
- Subsets of correlates
The narrative synthesis of data
Narrative synthesis is an approach which relies primarily on the use of words and text to summarize and explain the findings of a synthesis process. Its form may vary from the simple recounting and description of study characteristics, context, quality and findings to a more interpretive and reflexive approach that includes higher levels of abstraction. Lucas et al.9 presented two different methods of narrative data synthesis: textual description of studies (individual or group of studies) and the thematic analysis methods.
- Textual descriptions of individual studies: Summaries of individual studies can be structured to provide details of the setting, participants, intervention, comparison and outcomes, along with any other factors of interest (e.g. the income level of the users, age of users).
- Textual descriptions of groups of studies: On the basis of relevant criteria (e.g. types of participants), included studies can be sub-grouped. Subsequently, commentaries summarizing key aspects of the studies in relation to the sub-group within which they were included are produced. In a final step, the scope, differences and similarities among studies are used to draw conclusions across the studies.
- Thematic synthesis: This method holds the most potential for hypothesis generation, but may not clearly convey heterogeneity or quality appraisal.
If a narrative synthesis is undertaken to describe the included studies and their conclusions, it is important to discern how the evidence was weighted and whether conclusions were biased. It is recommended that the characteristics of the studies and the data extracted are emphasized, and that tables, graphs and other diagrams are used to compare data.10 The narrative summary will present quantitative data extracted from individual studies as well as, when available, point estimates (a value that represents a best estimate of effects) and interval estimates (an estimated range of effects, presented as a 95% confidence interval). Because a potentially large amount of data can be conveyed in a narrative summary, consistency in the results section can be ensured if all reviewers agree beforehand on a structure for the reporting of results. If a structure is not followed, the report of results may appear incomplete or unreliable.10 However, if included studies do not provide the information needed to comply with the structure, this should be made clear in the summary. A textual combination of data is often used when the included studies are dissimilar in terms of patients, methods or data.
The tabular synthesis of data
Tabulating the data begins with grouping the studies in discrete categories (e.g. based on types of participants, interventions, outcomes, country of origin, duration and provider of the intervention, number of participants in each group, context, results and comments); tables can also be developed for analysing various components. When the analysis of the tables reveals the presence of dominant groups or clusters of characteristics, groups of studies can be formed by which the subsequent synthesis can be organized. This technique is particularly useful when there are a larger number of studies. On the basis of the type of data reported, a common results rubric may also be tabulated (e.g. absolute difference, relative risk, odds ratio); this approach can serve as a first step in comparing the effects observed across the included studies.
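To make the idea of a common results rubric concrete, the sketch below derives an absolute risk difference, a relative risk and an odds ratio, with 95% confidence intervals for the two ratio measures, from the four cells of a 2 × 2 table. The function name, the cell labels and the counts are hypothetical, and the log-scale standard errors are the standard large-sample approximations, which assume no empty cells.

```python
import math


def two_by_two_measures(a, b, c, d, z=1.96):
    """Effect measures from a 2 x 2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)

    # Large-sample standard errors on the log scale
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def ci(estimate, se):
        # Confidence interval computed on the log scale, then exponentiated
        return (math.exp(math.log(estimate) - z * se),
                math.exp(math.log(estimate) + z * se))

    return {"risk_difference": risk_exposed - risk_unexposed,
            "rr": rr, "rr_ci": ci(rr, se_log_rr),
            "or": odds_ratio, "or_ci": ci(odds_ratio, se_log_or)}


if __name__ == "__main__":
    # Hypothetical cohort: 20/100 exposed and 10/100 unexposed develop the outcome
    for name, value in two_by_two_measures(20, 80, 10, 90).items():
        print(name, value)
```

Expressing every included study in the same measure in this way makes the rows of a results table directly comparable.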
Suggested steps for tabulating information from studies included in a systematic review are listed below11:
- Place features related to populations, interventions and outcomes in columns.
- Consider what sub-groups of populations there are among included studies.
- Consider what sub-types of variables there are.
- Consider the outcomes and their importance.
- Consider if studies need to be sub-classified according to study designs and quality.
- Populate the cells in the table with information from studies along rows in sub-groups.
- Sort studies according to a feature that helps to understand their results (e.g. a characteristic of a population or intervention, rank order of quality, year of publication, etc.).
Graphical and visual displays are useful for the understanding of large datasets, different levels of detail and comparison of the data, and can be closely integrated with accompanying descriptions of the data.12 Their benefit is that they are good for identifying patterns, viewing the relationship of parts to whole and for cross-linking; their limitation is that care is needed when concepts are isolated from their context and from the strength of the evidence.
- Funnel plots: To examine the relationship between study sample size/variance and effect size, these can be constructed by plotting relative risk against standard error.
- Harvest plot: A method for synthesizing evidence about the differential effects of population-level interventions; this may be particularly useful for systematic reviews addressing a broader research question.13
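As a rough sketch of how the data behind a funnel plot might be prepared, the Python example below places each study at its log relative risk and standard error and computes pseudo 95% limits around an inverse-variance pooled estimate. It assumes that the standard errors refer to the log relative risk; the function name and the study values are hypothetical, and the drawing step is left to any plotting library.

```python
import math


def funnel_plot_data(rr_values, std_errors, z=1.96):
    """Return the pooled log RR, the study points and pseudo 95% limits.

    points: (log RR, standard error) for each study.
    limits: (standard error, lower bound, upper bound) around the pooled
            estimate, i.e. the outline of the funnel.
    """
    log_rr = [math.log(rr) for rr in rr_values]
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, log_rr)) / sum(weights)

    points = list(zip(log_rr, std_errors))
    limits = [(se, pooled - z * se, pooled + z * se)
              for se in sorted(set(std_errors))]
    return pooled, points, limits


if __name__ == "__main__":
    # Hypothetical relative risks and standard errors of log RR
    pooled, points, limits = funnel_plot_data([1.2, 1.5, 0.9, 2.0],
                                              [0.10, 0.15, 0.20, 0.25])
    print(f"pooled log RR = {pooled:.3f}")
    for se, lower, upper in limits:
        print(f"SE {se:.2f}: funnel from {lower:.3f} to {upper:.3f}")
```

In the absence of small-study effects, the points are expected to scatter symmetrically within the funnel, which widens as the standard error increases.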
To conclude, analyses over a range of doses will also often be based on different categories in different settings. It is suggested that when presenting results in narrative synthesis, it is best to begin by considering studies according to study design. This includes evaluation of the prospective studies as a group and then comparing the results to those reported from the retrospective or case–control studies, and from any RCTs. This enables combination of data within each study design type as a first step to data summarization.14 Sensitivity analyses may be reported in a systematic review by producing a summary table. A need for sensitivity analysis may arise as a result of various decisions made during the systematic review process, some of which include: characteristics of participants, characteristics of the outcomes and study designs.15
In order to evaluate the quality of completed synthesis, whether a meta-analysis or a narrative synthesis, the authors of the review have to consider various factors such as presentation of sufficient details of the individual studies and an appropriate summary of the primary studies.
Until recently, the focus of systematic reviews was more on the effectiveness of interventions or practices on social and health outcomes. However, increasingly, decisions made in healthcare require more information than can be provided by a simple question ‘does this work?’. As a result, methods and guidance for conducting reviews of various forms of evidence, including qualitative research, cost data, diagnostics, prognostics, prevalence and incidence, exist.2,11 An overall summary provided here by the JBI appears to be the first of its kind for reviews of association addressing etiological issues.
Systematic review and meta-analysis of studies related to etiology is an emerging methodology in the field of evidence synthesis. These reviews can provide useful information for healthcare professionals and policymakers on the burden of disease. The standardized JBI approach offers a rigorous and transparent method to conduct reviews of etiology.
The lead author would like to thank all the co-authors for their valuable input in development of this systematic review methodology.