To the Editor:
In this letter, we introduce CMAverse, an R package that provides a suite of functions for reproducible causal mediation analysis including directed acyclic graph (DAG) visualization, statistical modeling for estimation and inference of causal effects, and sensitivity analyses for unmeasured confounding, measurement error and selection bias.
In many research fields including biomedical sciences, epidemiology and psychology, the use of causal mediation analysis has been increasing rapidly over the past couple of decades. Mediation analysis can help explain causal relationships and inform policy. In addition to examining a direct causal relation between two variables, mediation models assess causal pathways in which one variable (the exposure) causes another variable (the outcome) through an intermediate variable (the mediator). It has been shown that the overall exposure effect can be decomposed into two components1: a natural direct effect and a natural indirect effect. Additionally, in the presence of exposure–mediator interaction, the overall exposure effect can be further decomposed into four effects1: a controlled direct effect, a reference interaction, a mediated interaction and a pure indirect effect.
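As a sketch of how the two decompositions relate, they can be written on the additive (difference) scale as below. The symbols are illustrative shorthand rather than notation taken from this letter: TE is the total effect, NDE and NIE are the natural direct and indirect effects, CDE is the controlled direct effect, INT_ref and INT_med are the reference and mediated interactions, and PIE is the pure indirect effect.

```latex
\begin{align*}
\mathrm{TE}  &= \mathrm{NDE} + \mathrm{NIE} \\
\mathrm{TE}  &= \mathrm{CDE} + \mathrm{INT}_{\mathrm{ref}} + \mathrm{INT}_{\mathrm{med}} + \mathrm{PIE} \\
\mathrm{NDE} &= \mathrm{CDE} + \mathrm{INT}_{\mathrm{ref}} \\
\mathrm{NIE} &= \mathrm{INT}_{\mathrm{med}} + \mathrm{PIE}
\end{align*}
```

Here NDE and NIE are taken to be the pure natural direct effect and the total natural indirect effect, so the four-way decomposition refines the two-way one by splitting each of its terms into two.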
These decompositions yield valid results under a set of no-unmeasured-confounding assumptions. In practice, these assumptions are untestable and may be violated. In addition, measurement error and selection bias may arise during data acquisition. Thus, sensitivity analyses for unmeasured confounding, measurement error and selection bias play a vital role in valid mediation analysis.1
To conceptualize causal mediation analysis, CMAverse represents the relationships among variables using a DAG. Statistical modeling approaches are then applied to estimate causal effects and draw inferences about them. CMAverse allows for the investigation of the two-way and four-way decompositions of the total exposure effect for a single mediator or for multiple mediators. In addition, it supports time-varying confounders. It implements the most popular causal mediation analysis approaches to date: the regression-based approach,2,3 the weighting-based approach,3 the inverse-odds-ratio-weighting approach,4 the natural effect model,5 the marginal structural model,6 and the g-formula approach.7 Finally, CMAverse performs sensitivity analyses for unmeasured confounding, measurement error and selection bias to assess the robustness of the modeling results.
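To give a feel for the regression-based approach, the sketch below evaluates the closed-form expressions for the simplest setting (a continuous mediator and a continuous outcome, both modeled linearly, with exposure-mediator interaction), following the formulas in reference 2. It is not CMAverse's interface: the function name, the parameter names and the coefficient values are hypothetical, and the coefficients are assumed to have been estimated already from the two regressions.

```python
from dataclasses import dataclass


@dataclass
class NaturalEffects:
    nde: float  # pure natural direct effect
    nie: float  # total natural indirect effect
    te: float   # total effect = nde + nie


def natural_effects_linear(theta1, theta2, theta3,
                           beta0, beta1, beta2c,
                           a, a_star):
    """Closed-form natural effects for two linear models (illustrative only).

    Outcome model:  E[Y | a, m, c] = theta0 + theta1*a + theta2*m + theta3*a*m + theta4'c
    Mediator model: E[M | a, c]    = beta0 + beta1*a + beta2'c

    beta2c is the scalar beta2'c evaluated at the chosen covariate values.
    a and a_star are the exposure levels being compared.
    """
    delta = a - a_star
    # Expected mediator value under the reference exposure level a_star.
    mediator_at_ref = beta0 + beta1 * a_star + beta2c
    nde = (theta1 + theta3 * mediator_at_ref) * delta
    nie = (theta2 * beta1 + theta3 * beta1 * a) * delta
    return NaturalEffects(nde=nde, nie=nie, te=nde + nie)


# Hypothetical coefficients, chosen only to exercise the formulas.
effects = natural_effects_linear(theta1=0.5, theta2=0.8, theta3=0.2,
                                 beta0=1.0, beta1=0.25, beta2c=0.0,
                                 a=1.0, a_star=0.0)
print(effects)
```

When `theta3` is zero, the indirect effect reduces to the familiar product of coefficients, `theta2 * beta1 * (a - a_star)`. Standard errors, which in practice would come from the delta method or the bootstrap, are omitted here.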
We illustrate how CMAverse can be used to conduct causal mediation analysis by replicating the study of VanderWeele et al.8 The results are plotted in the Figure. In this example, the exposure is the genetic variant rs8034191 on chromosome 15q25.1; the mediator is the square root of average cigarettes smoked per day; the outcome is lung cancer. The mediation analysis finds evidence of a substantial natural direct effect (1.435 on the risk ratio scale) and reference interaction (0.362 on the excess relative risk scale). The gene-smoking interaction explains 78.9% of the total effect of the genetic variant, although only 2.69% of the total effect appears to be explained by the indirect effect through smoking behavior. In the presence of severe measurement error, assuming a reliability of 50% in the smoking variable, the conclusions of the study would not change. Finally, to explain away the direct effect, an unmeasured confounder would need an association of at least 1.82, on the risk ratio scale, with either the mediator or the outcome.
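Proportions such as those quoted above can be read as follows. On the ratio scale, the four components are expressed as excess relative risks that sum to the total-effect risk ratio minus 1, and each proportion is a component (or a sum of components) divided by that total. The sketch below assumes this reading; the function and argument names are hypothetical, and the numbers are made up for illustration rather than taken from the study.

```python
def four_way_proportions(er_cde, er_intref, er_intmed, er_pie):
    """Proportions of the total excess relative risk (illustrative only).

    Inputs are the four components on the excess relative risk scale:
    controlled direct effect, reference interaction, mediated interaction
    and pure indirect effect. Their sum equals RR_total - 1.
    """
    total_excess = er_cde + er_intref + er_intmed + er_pie
    if total_excess == 0:
        raise ValueError("total excess relative risk is zero; proportions are undefined")
    return {
        "rr_total": 1.0 + total_excess,
        # Share attributable to interaction (reference + mediated).
        "prop_interaction": (er_intref + er_intmed) / total_excess,
        # Share mediated (mediated interaction + pure indirect effect).
        "prop_mediated": (er_intmed + er_pie) / total_excess,
        # Share due to the pure indirect effect alone.
        "prop_pie": er_pie / total_excess,
        # Share due to the controlled direct effect alone.
        "prop_cde": er_cde / total_excess,
    }


# Made-up components, NOT the study's estimates.
result = four_way_proportions(er_cde=0.30, er_intref=0.40,
                              er_intmed=0.20, er_pie=0.10)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because the interaction share and the mediated share both include the mediated interaction, they overlap, which is why an effect can be largely attributable to interaction while only a small part of it is mediated.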
To the best of our knowledge, CMAverse is the most comprehensive software package for causal mediation analysis to date. It provides a unified framework for causal mediation analysis and increases reproducibility of statistical results. We believe it will simplify the dissemination and application of rigorous methods for the investigation of causal mechanisms across the biomedical and social sciences.
1. VanderWeele TJ. Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press; 2015.
2. Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychol Methods. 2013;18:137–150.
3. VanderWeele TJ, Vansteelandt S. Mediation analysis with multiple mediators. Epidemiol Methods. 2014;2:95–115.
4. Tchetgen Tchetgen EJ. Inverse odds ratio-weighted estimation for causal mediation analysis. Stat Med. 2013;32:4567–4580.
5. Vansteelandt S, Bekaert M, Lange T. Imputation strategies for the estimation of natural direct and indirect effects. Epidemiol Methods. 2012;1:131–158.
6. VanderWeele TJ, Tchetgen Tchetgen EJ. Mediation analysis with time varying exposures and mediators. J R Stat Soc Series B Stat Methodol. 2017;79:917–938.
7. Robins JM. A new approach to causal inference in mortality studies with a sustained exposure period: application to control of the healthy worker survivor effect. Math Model. 1986;7:1393–1512.
8. VanderWeele TJ, Asomaning K, Tchetgen Tchetgen EJ, et al. Genetic variants on 15q25.1, smoking, and lung cancer: an assessment of mediation and interaction. Am J Epidemiol. 2012;175:1013–1020.