It is now widely recognized that valid mediation analyses require adjustment for mediator–outcome confounding.1,2 Failing to adjust for such confounders does not bias the total effect of the exposure, but the extent of mediation (i.e., the size of the indirect effect through the mediator relative to the direct effect through other pathways) may be over- or underestimated.1–3 However, it is not always the case that researchers collect data on these confounders, particularly when mediation is secondary to the primary exposure–outcome analysis for which a study has been designed and for which confounders have been measured. Here we discuss a straightforward, although approximate, approach to sensitivity analysis for direct and indirect effects that essentially corresponds to the mediational equivalent to the “E-value”4—a metric that was introduced as an approach to quantify robustness to confounding for total effects.
Recently, a sensitivity analysis technique was proposed to examine the extent to which unmeasured confounding of the mediator–outcome relationship could explain observed mediation effects.1,2 Specifically, a bound for the bias can be used to assess the possible influence of an unmeasured confounder U on the observed direct or indirect effects (more formally, the natural direct and indirect effects; see the eAppendix, https://links.lww.com/EDE/B559, for counterfactual definitions). The bound describes the maximum extent to which unmeasured confounding by U could have resulted in the overestimation of the indirect effect and the corresponding underestimation of the direct effect, or vice versa.
FIGURE. Directed acyclic graph representing a mediated effect subject to unmeasured mediator–outcome confounding. A represents the exposure of interest, M the mediator, and Y the outcome. C includes all measured confounders of both the exposure–outcome and mediator–outcome relationships, and U is an unmeasured confounder of the latter. In the example in the text, A is fertility treatment, M multiple gestations, and Y preterm birth. C includes all confounders the authors were able to measure and adjust for, and U the mediator–outcome confounders they were not able to measure.
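The bound depends on the strength of the relationship between the unmeasured confounder U and the outcome Y, specifically $RR_{UY|(A,M)}$, as well as on the strength of the relationship between A and U that is induced within levels of mediator M, specifically $RR_{AU|M}$. With these two parameters specified, the maximum ratio by which the estimate of the indirect effect (and correspondingly direct effect), on the risk ratio (RR) scale, could differ from the true value was shown to be given by

$$\frac{RR_{UY|(A,M)} \times RR_{AU|M}}{RR_{UY|(A,M)} + RR_{AU|M} - 1}.$$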
The interpretation of the parameter $RR_{UY|(A,M)}$ is relatively straightforward as the maximum direct effect, among the exposed, of U on Y not through M. However, the interpretation of the second parameter $RR_{AU|M}$ is more complex. That the $RR_{AU|M}$ parameter differs from 1 is due to so-called collider bias, which occurs when conditioning on the common effect (M) of two variables (A and U). Because A and U are not related except within levels of M, the magnitude of this relationship is difficult to intuit or speculate about.
Citing previous work by Greenland,3 the authors proposing this bound claimed that in most but not all situations, the relationship between U and M would be at least as strong as the conditional A–U relationship and could therefore potentially be used as its proxy in sensitivity analysis.1 Specifically, instead of the parameter $RR_{AU|M}$, defined as the maximum RR of A on some value of U within a level of M, the authors proposed using $RR_{UM|A}$, the maximum RR comparing two values of U on M within a level of A.
This is a more intuitive parameter to specify as it directly reflects the strength of the confounder–mediator relationship, which is of course necessary for confounding to be present.
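As a rough illustration of how such a bound might be applied, the Python sketch below computes the bounding factor RR1 × RR2 / (RR1 + RR2 − 1) from two risk-ratio parameters and divides an observed effect by it. The function names and the example values (2, 3, and 1.8) are hypothetical and chosen only for the arithmetic. The second argument may be either the exact conditional exposure–confounder parameter or the confounder–mediator proxy, in which case the result should be read as approximate.

```python
def bounding_factor(rr_uy: float, rr_second: float) -> float:
    """Bounding factor RR1 * RR2 / (RR1 + RR2 - 1).

    rr_uy:     confounder-outcome risk ratio (>= 1).
    rr_second: either the exact exposure-confounder parameter or,
               as an approximation, the confounder-mediator risk ratio (>= 1).
    """
    if rr_uy < 1 or rr_second < 1:
        raise ValueError("both parameters must be risk ratios >= 1")
    return rr_uy * rr_second / (rr_uy + rr_second - 1)


def corrected_rr(observed_rr: float, rr_uy: float, rr_second: float) -> float:
    """Divide an observed risk ratio by the bounding factor."""
    return observed_rr / bounding_factor(rr_uy, rr_second)


# Hypothetical values, for illustration only.
bound = bounding_factor(2.0, 3.0)        # 6 / 4 = 1.5
adjusted = corrected_rr(1.8, 2.0, 3.0)   # observed RR of 1.8 divided by 1.5
print(bound, adjusted)
```

When either parameter equals 1 (no confounding along that arm), the factor is 1 and the observed effect is left unchanged.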
The purpose of the present study was to investigate to what extent this alternative parameter could be used to perform approximate instead of exact sensitivity analysis for natural direct and indirect effects.
METHODS
To evaluate the extent to which the bound for unmeasured confounding would hold if the confounder–mediator parameter $RR_{UM|A}$ were used instead of the conditional exposure–confounder parameter $RR_{AU|M}$, we conducted an exhaustive numerical search via simulation. Our main analysis searched over all possible probability combinations of binary exposure, mediator, outcome, and unmeasured confounder, without making functional form assumptions. We did this to ensure no restrictions on the size or direction of any interactions or on the prevalence of any of the variables.
Conditional probabilities for U, M, and Y were randomly drawn from a uniform (0,1) distribution to ensure that all possible probabilities were encountered. However, because allowing the probabilities to vary over any range results in many unrealistic situations, we also considered several possibly more plausible restrictions when generating the data, including following a log-linear model for the mediator and restricting interactions so that all effects were in the same direction.
After randomly generating the necessary probabilities to define the joint distribution of exposure, mediator, outcome, and confounder, we calculated the true direct effect on both the risk ratio and risk difference (RD) scales, as well as the effects that would be observed if the confounder were unmeasured. From these values, we computed the bias as the ratio of observed to true direct effect on the RR scale and as the difference on the RD scale. We also computed the exact bound as defined above.
Next, we computed the bound using $RR_{UM|A}$ instead of $RR_{AU|M}$. We considered this alternative bound to be successful when it, like the true bound, was at least as large as the bias. We also “corrected” the observed effects using both the true and alternative bounds, dividing the observed effect by the bound (for RRs) or subtracting the bound from the observed effect (for RDs). When the alternative bound failed to bound the bias, we computed the magnitude of the bias that would remain if we used it to correct the mediated effects anyway.
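The search described above can be sketched in Python as follows. This is not the authors' R code, and it rests on several assumptions: U is independent of A within strata of C; the direct effect is taken as the ratio E[Y(1, M(0))] / E[Y(0, M(0))]; each bias parameter is maximized over all levels of the binary variables; and draws are clipped to (eps, 1 − eps) to avoid division by zero. All function names are hypothetical.

```python
import random

LEVELS = (0, 1)


def bounding_factor(rr1, rr2):
    return rr1 * rr2 / (rr1 + rr2 - 1)


def draw_setting(rng, eps=1e-6):
    """Draw P(U=1), P(M=1 | a, u) and P(Y=1 | a, m, u) uniformly."""
    def draw():
        return rng.uniform(eps, 1 - eps)
    p_u = draw()
    p_m = {(a, u): draw() for a in LEVELS for u in LEVELS}
    p_y = {(a, m, u): draw() for a in LEVELS for m in LEVELS for u in LEVELS}
    return p_u, p_m, p_y


def evaluate(p_u, p_m, p_y):
    """Return (bias, exact bound, alternative bound) on the RR scale."""
    pu = {0: 1 - p_u, 1: p_u}
    pm = {(m, a, u): p_m[a, u] if m == 1 else 1 - p_m[a, u]
          for m in LEVELS for a in LEVELS for u in LEVELS}
    # P(M=m | A=a), then P(U=u | A=a, M=m) by Bayes' rule (U independent of A).
    pm_a = {(m, a): sum(pm[m, a, u] * pu[u] for u in LEVELS)
            for m in LEVELS for a in LEVELS}
    pu_am = {(u, a, m): pm[m, a, u] * pu[u] / pm_a[m, a]
             for u in LEVELS for a in LEVELS for m in LEVELS}

    def true_mean(a):      # standardizes over U
        return sum(p_y[a, m, u] * pm[m, 0, u] * pu[u]
                   for m in LEVELS for u in LEVELS)

    def observed_mean(a):  # what is estimated when U is unmeasured
        return sum(pm_a[m, 0] * sum(p_y[a, m, u] * pu_am[u, a, m] for u in LEVELS)
                   for m in LEVELS)

    true_de = true_mean(1) / true_mean(0)
    observed_de = observed_mean(1) / observed_mean(0)
    bias = observed_de / true_de

    rr_uy = max(max(p_y[1, m, 0], p_y[1, m, 1]) / min(p_y[1, m, 0], p_y[1, m, 1])
                for m in LEVELS)
    rr_au = max(pu_am[u, 1, m] / pu_am[u, 0, m] for u in LEVELS for m in LEVELS)
    rr_um = max(max(pm[m, a, 1] / pm[m, a, 0], pm[m, a, 0] / pm[m, a, 1])
                for a in LEVELS for m in LEVELS)
    return bias, bounding_factor(rr_uy, rr_au), bounding_factor(rr_uy, rr_um)


def run_search(n_settings=2000, seed=0):
    """Share of random settings in which each bound is at least the bias."""
    rng = random.Random(seed)
    results = [evaluate(*draw_setting(rng)) for _ in range(n_settings)]
    tol = 1 - 1e-9  # relative tolerance for floating-point error
    exact_ok = sum(exact >= bias * tol for bias, exact, _ in results) / n_settings
    alt_ok = sum(alt >= bias * tol for bias, _, alt in results) / n_settings
    return exact_ok, alt_ok


if __name__ == "__main__":
    print(run_search())
```

Under these assumptions the exact bound should hold in every draw, while the share for the alternative bound depends on the sampling scheme and on how the proxy parameter is defined, so it need not match the figures reported in the text.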
Because not only the bias parameters but also the bias itself is conditional on measured confounders that have been adjusted for in the estimation of the observed effects, we did not explicitly include C as one of the random variables in the numerical search. Instead, we assumed, as we do throughout the text, that we are working within strata of C.
More details can be found in section 2 of the eAppendix, which also contains the R code; https://links.lww.com/EDE/B559.
RESULTS
As expected, the bound computed with the appropriate parameters was always greater than or equal to the bias. We found that in 99.3% of simulation settings, the value calculated using the $RR_{UM|A}$ parameter did indeed also bound the bias (eTable 1; https://links.lww.com/EDE/B559). However, we did not discover any complete characterization that distinguished the failed bounds from the successes, although they occurred more frequently when the bias was larger (eTable 2; https://links.lww.com/EDE/B559). On average, the bound constructed with the alternative parameter was weaker than the true bound (eFigure 1; https://links.lww.com/EDE/B559), but overall effects corrected by both bounds had relatively similar distributions (eFigure 2; https://links.lww.com/EDE/B559). When the alternative bound did fail, the residual bias after correcting with it was generally small. Over 80% of the time, the corrected effect was less than 1.2 times as great as the true effect (eFigure 3; https://links.lww.com/EDE/B559). On the risk difference scale, the remaining bias was less than 0.02 around 50% of the time that the bound failed (eTable 4; https://links.lww.com/EDE/B559).
We found similar results when we varied the data-generating distribution. Although the distributions we assumed for the variables are unlikely to directly reflect reality, the few situations in which the bound failed were unremarkable and did not appear more likely to occur in practice than those in which the bound held. In section 4 of the eAppendix; https://links.lww.com/EDE/B559, we consider continuous
and
as well as derive conditions under which other alternative bounds will hold.
DISCUSSION
These results support the claim that this sensitivity analysis technique can, at least roughly, be based on the proposed strength of the relationship between an unmeasured confounder and the mediator, which better matches intuition about the source and size of confounding. When unmeasured confounding is suspected, researchers or readers can compute bounds with parameter values that seem reasonable for a given situation in order to see how bias of that magnitude would affect the direct and indirect effects.
It is also possible to calculate the minimum size of the two parameters that would be necessary to completely explain away a direct or indirect effect. This is essentially a mediational analogue to the E-value to assess robustness to unmeasured confounding for total effects.4 For an observed natural direct or indirect effect risk ratio of magnitude $RR$, this is given by1,4

$$RR + \sqrt{RR \times (RR - 1)}.$$
If both of the parameters are at least as large as the mediational E-value, conditional on the measured confounders, it is possible that unmeasured confounding is entirely responsible for a direct or indirect effect, and that effect is truly null. The mediational E-value expression applies exactly to the sensitivity analysis parameters $RR_{UY|(A,M)}$ and $RR_{AU|M}$ and approximately, as above, to the parameters $RR_{UY|(A,M)}$ and $RR_{UM|A}$.
As an example, Oberg et al5 examined the indirect effect of fertility treatment on preterm birth mediated through multiple gestations (Figure). They estimated an indirect effect risk ratio of 1.55 (95% confidence interval [CI]: 1.52, 1.59). The mediational E-value for the estimate is 2.47, and for the limit of the CI closest to the null, it is 2.41. We are then able to make statements of the form, “To completely explain away the observed indirect effect, an unmeasured confounder associated with both multiple gestations and preterm birth with approximate risk ratios of 2.47-fold each, above and beyond the measured covariates, could suffice, but weaker confounding could not. To shift the confidence interval to the null, an unmeasured confounder associated with both multiple gestations and preterm birth with approximate risk ratios of 2.41-fold each, above and beyond the measured covariates, could suffice, but weaker confounding could not.”
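The arithmetic behind these two values can be checked with a short Python sketch of the expression RR + sqrt(RR × (RR − 1)). The function name is hypothetical, and taking the inverse of estimates below 1 is an assumption carried over from the usual E-value convention for total effects rather than something stated in this text.

```python
import math


def mediational_e_value(rr: float) -> float:
    """RR + sqrt(RR * (RR - 1)) for a direct or indirect effect risk ratio."""
    if rr <= 0:
        raise ValueError("a risk ratio must be positive")
    if rr < 1:
        # Assumption: invert estimates below 1, as is done for ordinary E-values.
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


point = mediational_e_value(1.55)     # indirect effect estimate from the example
ci_limit = mediational_e_value(1.52)  # confidence limit closest to the null
print(f"{point:.2f} {ci_limit:.2f}")  # prints "2.47 2.41"
```

Setting both parameters in the bound equal to the mediational E-value E gives a bounding factor of E² / (2E − 1), which equals the observed risk ratio. This is why confounding of that strength could just explain away the effect, while weaker confounding could not.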
We hope that by simplifying an already straightforward sensitivity analysis method with a more easily specified parameter, researchers will increasingly assess the robustness of their mediation analyses to unmeasured confounding.
REFERENCES
1. Ding P, VanderWeele TJ. Sharp sensitivity bounds for mediation under unmeasured mediator-outcome confounding. Biometrika. 2016;103:483–490.
2. VanderWeele TJ. Explanation in Causal Inference: Methods for Mediation and Interaction. New York: Oxford University Press; 2015.
3. Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. 2003;14:300–306.
4. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E-Value. Ann Intern Med. 2017;167:268–274.
5. Oberg AS, VanderWeele TJ, Almqvist C, Hernandez-Diaz S. Pregnancy complications following fertility treatment-disentangling the role of multiple gestation. Int J Epidemiol. 2018;47:1333–1342.