CMAverse: A Suite of Functions for Reproducible Causal Mediation Analyses

Shi, Baoyi; Choirat, Christine; Coull, Brent A.; VanderWeele, Tyler J.; Valeri, Linda

doi: 10.1097/EDE.0000000000001378

To the Editor:

In this letter, we introduce CMAverse, an R package that provides a suite of functions for reproducible causal mediation analysis including directed acyclic graph (DAG) visualization, statistical modeling for estimation and inference of causal effects, and sensitivity analyses for unmeasured confounding, measurement error and selection bias.

In many research fields including biomedical sciences, epidemiology and psychology, the use of causal mediation analysis has been increasing rapidly over the past couple of decades. Mediation analysis can help explain causal relationships and inform policy. In addition to examining a direct causal relation between two variables, mediation models assess causal pathways in which one variable (the exposure) causes another variable (the outcome) through an intermediate variable (the mediator). It has been shown that the overall exposure effect can be decomposed into two components1: a natural direct effect and a natural indirect effect. Additionally, in the presence of exposure–mediator interaction, the overall exposure effect can be further decomposed into four effects1: a controlled direct effect, a reference interaction, a mediated interaction and a pure indirect effect.
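In counterfactual notation, the two decompositions can be sketched as follows on the additive (difference) scale. The symbols are assumptions introduced here and do not appear in the letter: $Y_{am}$ is the outcome under exposure level $a$ and mediator level $m$, $M_a$ is the mediator under exposure level $a$, $a^*$ and $a$ are the reference and comparison exposure levels, and $m^*$ is a chosen reference level of the mediator.

```latex
\begin{align*}
TE &= E\left[Y_{a M_a} - Y_{a^* M_{a^*}}\right] \\
   &= \underbrace{E\left[Y_{a M_{a^*}} - Y_{a^* M_{a^*}}\right]}_{\text{pure natural direct effect } (PNDE)}
    + \underbrace{E\left[Y_{a M_a} - Y_{a M_{a^*}}\right]}_{\text{total natural indirect effect } (TNIE)} \\
   &= \underbrace{CDE(m^*) + INT_{ref}(m^*)}_{PNDE}
    + \underbrace{INT_{med} + PIE}_{TNIE}
\end{align*}
```

Here $CDE(m^*)$ is the controlled direct effect with the mediator fixed at $m^*$, $INT_{ref}(m^*)$ the reference interaction, $INT_{med}$ the mediated interaction and $PIE$ the pure indirect effect. Without exposure–mediator interaction, the two interaction terms are zero and the four-way decomposition reduces to the two-way one.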

These decompositions yield valid results under a set of no-unmeasured-confounding assumptions. In practice, these assumptions may be violated, and they cannot be tested from the data. Measurement error and selection bias may also be incurred during data acquisition. Sensitivity analyses for unmeasured confounding, measurement error and selection bias therefore play a vital role in valid mediation analysis.1

To conceptualize causal mediation analysis, CMAverse represents the relationships among variables using a DAG. Statistical modeling approaches are then applied to estimate causal effects and draw inferences about them. CMAverse allows for the investigation of the two-way and four-way decompositions of the total exposure effect for a single mediator or for multiple mediators. It also supports time-varying confounders. The package implements the most popular causal mediation analysis approaches to date: the regression-based approach,2,3 the weighting-based approach,3 the inverse-odds-ratio-weighting approach,4 the natural effect model,5 the marginal structural model,6 and the g-formula approach.7 Finally, CMAverse performs sensitivity analyses for unmeasured confounding, measurement error and selection bias to assess the robustness of the modeling results.
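To make the regression-based approach concrete, the sketch below evaluates the standard closed-form four-way decomposition for the simplest case: a continuous mediator and a continuous outcome, both modeled linearly, with an exposure–mediator product term. It is written in Python for illustration only and is not CMAverse code. The function name, argument names and coefficient values are hypothetical; the assumed models are E[M | a, c] = beta0 + beta1·a + beta2'c and E[Y | a, m, c] = theta0 + theta1·a + theta2·m + theta3·a·m + theta4'c.

```python
def four_way_decomposition(theta1, theta2, theta3, beta0, beta1, beta2_c,
                           a, a_star, m_star):
    """Closed-form effects for linear mediator and outcome models.

    Hypothetical helper, not part of CMAverse. `beta2_c` is the covariate
    contribution beta2'c to the mediator mean at the covariate values of
    interest; `m_star` is the reference mediator level for the CDE.
    """
    delta = a - a_star
    # Expected mediator value under the reference exposure level a*.
    m_ref = beta0 + beta1 * a_star + beta2_c

    cde = (theta1 + theta3 * m_star) * delta               # controlled direct effect
    int_ref = theta3 * (m_ref - m_star) * delta            # reference interaction
    int_med = theta3 * beta1 * delta ** 2                  # mediated interaction
    pie = (theta2 * beta1 + theta3 * beta1 * a_star) * delta  # pure indirect effect

    pnde = cde + int_ref   # pure natural direct effect
    tnie = int_med + pie   # total natural indirect effect
    te = pnde + tnie       # total effect
    return {"cde": cde, "int_ref": int_ref, "int_med": int_med, "pie": pie,
            "pnde": pnde, "tnie": tnie, "te": te}


# Made-up coefficients, chosen only to show how the components add up.
effects = four_way_decomposition(theta1=0.5, theta2=0.8, theta3=0.2,
                                 beta0=1.0, beta1=0.6, beta2_c=0.0,
                                 a=1, a_star=0, m_star=0)
for name, value in effects.items():
    print(f"{name}: {value:.2f}")
```

With these made-up coefficients the four components (0.50, 0.20, 0.12 and 0.48) sum to a total effect of 1.30. In practice the coefficients would come from fitted regression models, and standard errors would be obtained by the delta method or the bootstrap.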

We illustrate how CMAverse can be used to conduct causal mediation analysis by replicating the study of VanderWeele et al.8 The results are plotted in the Figure. In this example, the exposure is the genetic variant rs8034191 on chromosome 15q25.1; the mediator is the square root of average cigarettes smoked per day; the outcome is lung cancer. The mediation analysis finds evidence of a substantial natural direct effect (1.435 on risk ratio scale) and reference interaction (0.362 on excess relative ratio scale). The gene-smoking interaction explains 78.9% of the total effect of the genetic variant, although only 2.69% of the total effect appears to be explained by the indirect effect through smoking behavior. In the presence of severe measurement error, assuming reliability of 50% in the smoking variable, the conclusions of the study would not change. Finally, to explain away the direct effect, the association of the unmeasured confounder with either the mediator or the outcome on risk ratio scale should take a value of at least 1.82.
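The confounding threshold quoted above has the form of an E-value. As a hedged illustration, the Python sketch below computes the E-value of a risk ratio using the usual formula RR + sqrt(RR × (RR − 1)). That the letter's threshold was obtained with this formula is an assumption, and the letter does not state which estimate or confidence limit the 1.82 was computed from. The function name is hypothetical.

```python
import math


def e_value(rr):
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)).

    Risk ratios below 1 are inverted first so the formula applies
    symmetrically to protective effects.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


# E-value for the point estimate of the natural direct effect quoted above.
print(round(e_value(1.435), 2))  # 2.23
```

Applied to the point estimate of 1.435, the formula gives about 2.23 rather than 1.82. Under the same formula, an E-value of 1.82 corresponds to a risk ratio of roughly 1.25, so the reported threshold presumably refers to a different quantity, such as a confidence limit; the letter does not say.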

Figure. In the left panel are effect estimates with 95% confidence intervals (in blue) and sensitivity analysis results for 50% measurement error of the smoking measure (in red); in the right panel are sensitivity analysis results for unmeasured confounding. Rcde, controlled direct effect risk ratio; Rpnde, pure natural direct effect risk ratio; Rtnde, total natural direct effect risk ratio; Rpnie, pure natural indirect effect risk ratio; Rtnie, total natural indirect effect risk ratio; Rte, total effect risk ratio; ERcde, excess risk ratio due to controlled direct effect; ERintref, excess risk ratio due to reference interaction; ERintmed, excess risk ratio due to mediated interaction; ERpnie, excess risk ratio due to pure natural indirect effect; ERcde (prop), proportion of ERcde; ERintref (prop), proportion of ERintref; ERintmed (prop), proportion of ERintmed; ERpnie (prop), proportion of ERpnie; pm, proportion mediated; int, proportion attributable to interaction; pe, proportion eliminated.
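The quantities in this legend are linked by a few identities. The sketch below states them as they are usually defined for the four-way decomposition on the risk ratio scale; that the plotted values follow exactly these definitions is an assumption, since the letter does not spell them out. Subscripts reuse the legend's abbreviations.

```latex
\begin{align*}
R_{te} - 1 &= ER_{cde} + ER_{intref} + ER_{intmed} + ER_{pnie} \\[4pt]
ER_{x}\,(\text{prop}) &= \frac{ER_{x}}{R_{te} - 1},
  \qquad x \in \{cde,\ intref,\ intmed,\ pnie\} \\[4pt]
pm  &= \frac{ER_{intmed} + ER_{pnie}}{R_{te} - 1} \\[4pt]
int &= \frac{ER_{intref} + ER_{intmed}}{R_{te} - 1} \\[4pt]
pe  &= \frac{ER_{intref} + ER_{intmed} + ER_{pnie}}{R_{te} - 1} = 1 - ER_{cde}\,(\text{prop})
\end{align*}
```

Under these definitions the four proportions sum to one, and the proportion attributable to interaction and the proportion mediated overlap in the mediated interaction term.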

To the best of our knowledge, CMAverse is the most comprehensive software package for causal mediation analysis to date. It provides a unified framework for causal mediation analysis and increases reproducibility of statistical results. We believe it will simplify the dissemination and application of rigorous methods for the investigation of causal mechanisms across the biomedical and social sciences.


1. VanderWeele TJ. Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press; 2015.
2. Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychol Methods. 2013;18:137–150.
3. VanderWeele TJ, Vansteelandt S. Mediation analysis with multiple mediators. Epidemiol Methods. 2014;2:95–115.
4. Tchetgen Tchetgen EJ. Inverse odds ratio-weighted estimation for causal mediation analysis. Stat Med. 2013;32:4567–4580.
5. Vansteelandt S, Bekaert M, Lange T. Imputation strategies for the estimation of natural direct and indirect effects. Epidemiol Methods. 2012;1:131–158.
6. VanderWeele TJ, Tchetgen Tchetgen EJ. Mediation analysis with time varying exposures and mediators. J R Stat Soc Series B Stat Methodol. 2017;79:917–938.
7. Robins JM. A new approach to causal inference in mortality studies with a sustained exposure period-Application to control of the healthy worker survivor effect. Mathematical Modelling. 1986;7:1393–1512.
8. VanderWeele TJ, Asomaning K, Tchetgen Tchetgen EJ, et al. Genetic variants on 15q25.1, smoking, and lung cancer: an assessment of mediation and interaction. Am J Epidemiol. 2012;175:1013–1020.
Copyright © 2021 Wolters Kluwer Health, Inc. All rights reserved.