# A Unification of Mediation and Interaction: A 4-Way Decomposition

The overall effect of an exposure on an outcome, in the presence of a mediator with which the exposure may interact, can be decomposed into 4 components: (1) the effect of the exposure in the absence of the mediator, (2) the interactive effect when the mediator is left to what it would be in the absence of exposure, (3) a mediated interaction, and (4) a pure mediated effect. These 4 components, respectively, correspond to the portion of the effect that is due to neither mediation nor interaction, to just interaction (but not mediation), to both mediation and interaction, and to just mediation (but not interaction). This 4-way decomposition unites methods that attribute effects to interactions and methods that assess mediation. Certain combinations of these 4 components correspond to measures for mediation, whereas other combinations correspond to measures of interaction previously proposed in the literature. Prior decompositions in the literature are in essence special cases of this 4-way decomposition. The 4-way decomposition can be carried out using standard statistical models, and software is provided to estimate each of the 4 components. The 4-way decomposition provides maximum insight into how much of an effect is mediated, how much is due to interaction, how much is due to both mediation and interaction together, and how much is due to neither.

From the Departments of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, MA.

Submitted 19 September 2013; accepted 28 January 2014; posted 3 July 2014.

T.J.V.W. was supported by National Institutes of Health grant ES017876.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com). This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.

Correspondence: Tyler J. VanderWeele, Departments of Epidemiology and Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115. E-mail: tvanderw@hsph.harvard.edu.


Methodology for assessing mediation and interaction has developed rapidly over the past decade. Methods for effect decomposition to assess direct and indirect effects have shed light on mechanisms and pathways.^{1–19} Other methods and measures have been useful in assessing how much of the effect of one exposure is due to its interaction with another.^{20–24} This article provides theory and methods to unite these effect decomposition and attribution methods for mediation and interaction. The central result of this article is that the total effect of an exposure on an outcome, in the presence of a mediator with which the exposure may interact, can be decomposed into components due to just mediation, to just interaction, to both mediation and interaction, and to neither mediation nor interaction.

After presenting this 4-way decomposition, this article discusses assumptions for identifying these 4 components from data and relates this 4-way decomposition approach to various statistical models. The article discusses the relationships between existing measures of mediation and interaction and each of the 4 components and shows how existing measures of mediation and interaction consist of different combinations of these 4 components. Different effect decomposition and attribution approaches for mediation and interaction can be united within this 4-fold framework. When some of the components are combined, the framework presented here essentially collapses into approaches used previously. The greatest insight, however, is arguably gained when the 4-fold approach is used as illustrated with an example from genetic epidemiology.

## NOTATION

Let *A* denote the exposure of interest, *Y* the outcome, and *M* a potential mediator, and let *C* denote a set of baseline covariates. Suppose we want to compare 2 levels of the exposure, *a* and *a**; for a binary exposure, we would have *a* = 1 and *a** = 0. For simplicity, consider the setting of a binary exposure and binary mediator; more general results that are applicable to arbitrary exposures and mediators are given in the Appendix. Let *Y*_{a} and *M*_{a} denote, respectively, the potentially counterfactual values of the outcome and mediator that would have been observed had the exposure *A* been set to level *a*. The total effect (TE) of the exposure *A* on the outcome *Y* is defined as *Y*_{1} − *Y*_{0}; the total effect of the exposure *A* on the mediator *M* is defined as *M*_{1} − *M*_{0}. We will not, in general, ever know what these effects are at the individual level, but we might hope to be able to estimate them on average for a population. The first part of this article, however, is concerned with concepts; what can be identified from data, and under what assumptions, is addressed later.

Counterfactuals of another form will also be needed. Let *Y*_{am} denote the value of the outcome that would have been observed had *A* been set to level *a* and *M* to *m*. The controlled direct effect, comparing exposure level *A* = 1 to *A* = 0 and fixing the mediator to level *m*, is defined by *Y*_{1m} − *Y*_{0m} and captures the effect of exposure *A* on outcome *Y*, intervening to fix *M* to *m*; it may be different for different levels of *m*.^{1},^{2} It may also be different for different persons. Finally, we will also later consider counterfactuals of the form *Y*_{aM_{a*}}, which is the outcome *Y* that would have occurred if *A* were fixed to *a*, and if *M* were fixed to the level it would have taken if *A* had been *a**. Some technical assumptions referred to as consistency and composition are also needed to relate the observed data to counterfactual quantities. The consistency assumption in this context is that when *A* = *a*, the counterfactual outcomes *Y*_{a} and *M*_{a} are equal to the observed outcomes *Y* and *M*, respectively, and that when *A* = *a* and *M* = *m*, the counterfactual outcome *Y*_{am} is equal to *Y*. The composition assumption is that *Y*_{a} = *Y*_{aM_{a}}. Further discussion on these assumptions is given elsewhere.^{4},^{18},^{25}
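The counterfactual notation can be made concrete with a small lookup table. The following is a minimal sketch for a single hypothetical person with a binary exposure and mediator; the numeric values and the function names `nested_outcome` and `outcome_under_exposure` are illustrative assumptions, not part of the article.

```python
# Hypothetical counterfactuals for ONE person (illustrative values only).
# Y[(a, m)] is the outcome with exposure set to a and mediator set to m;
# M[a] is the mediator value with exposure set to a.
Y = {(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1}
M = {0: 0, 1: 1}


def nested_outcome(Y, M, a, a_star):
    """Return Y_{a M_{a*}}: exposure set to a, mediator set to its value under a*."""
    return Y[(a, M[a_star])]


def outcome_under_exposure(Y, M, a):
    """Return Y_a using the composition assumption, Y_a = Y_{a M_a}."""
    return nested_outcome(Y, M, a, a)


# Total effect Y_1 - Y_0 and the controlled direct effect at each mediator level.
total_effect = outcome_under_exposure(Y, M, 1) - outcome_under_exposure(Y, M, 0)
controlled_direct_effects = {m: Y[(1, m)] - Y[(0, m)] for m in (0, 1)}
```

For this hypothetical person the exposure changes the outcome only by changing the mediator, so the total effect is non-zero while both controlled direct effects are zero.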

## A 4-FOLD DECOMPOSITION

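In the Appendix, it is shown that a total effect (TE) of *A* on *Y* can be decomposed into the following 4 components:

$$
\begin{aligned}
Y_{1} - Y_{0} ={}& (Y_{10} - Y_{00}) \\
&+ (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{0}) \\
&+ (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{1} - M_{0}) \\
&+ (Y_{01} - Y_{00})(M_{1} - M_{0}).
\end{aligned}
\tag{1}
$$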

The first component, (*Y*_{10} − *Y*_{00}), is the direct effect of the exposure *A* if the mediator were removed, ie, fixed to *M* = 0. This effect is sometimes referred to as a “controlled direct effect” (CDE).^{1},^{2} The second component, (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00}) (*M*_{0}), will be referred to as a “reference interaction” (INT_{ref}). The term (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00}) is an additive interaction. It can be rewritten as (*Y*_{11} − *Y*_{00}) − {(*Y*_{10} − *Y*_{00}) + (*Y*_{01} − *Y*_{00})} and will be non-zero for a person if the effect on the outcome of setting both the exposure and the mediator to present differs from the sum of the effect of having only the exposure present and the effect of having only the mediator present; such additive interaction is generally considered of greatest public health importance.^{20–22} The second component in the decomposition in (1) is the product of this additive interaction and *M*_{0}. Thus, this second component, (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00}) (*M*_{0}), is an additive interaction that operates only if the mediator is present in the absence of exposure, ie, when *M*_{0} = 1. The third component, (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00}) (*M*_{1} − *M*_{0}), will be referred to as a “mediated interaction” (INT_{med}). It is the same additive interaction contrast multiplied by (*M*_{1} − *M*_{0}). In other words, it is an additive interaction that operates only if the exposure has an effect on the mediator so that *M*_{1} − *M*_{0} ≠ 0. The final component, (*Y*_{01} − *Y*_{00}) (*M*_{1} − *M*_{0}), is the effect of the mediator in the absence of the exposure, *Y*_{01} − *Y*_{00}, multiplied by the effect of the exposure on the mediator itself, *M*_{1} − *M*_{0}. It will be non-zero only if the mediator affects the outcome when the exposure is absent, and the exposure itself affects the mediator. 
This final component could be referred to as a “mediated main effect” or (as explained below) a “pure indirect effect” (PIE).^{1},^{2}
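For a binary exposure and binary mediator, the identity between the total effect and the sum of the 4 components described above can be checked by brute force. The sketch below is only an illustration under that binary assumption and relies on composition (*Y*_{a} equals *Y*_{am} evaluated at *m* = *M*_{a}); the function names are hypothetical, and the general proof is the one in the Appendix.

```python
from itertools import product


def four_way(y00, y01, y10, y11, m0, m1):
    """Individual-level components; y_am is the outcome under exposure a, mediator m."""
    interaction = y11 - y10 - y01 + y00  # additive interaction contrast
    return {
        "CDE": y10 - y00,
        "INT_ref": interaction * m0,
        "INT_med": interaction * (m1 - m0),
        "PIE": (y01 - y00) * (m1 - m0),
    }


def total_effect(y00, y01, y10, y11, m0, m1):
    """Y_1 - Y_0, where Y_a is Y_{a m} evaluated at m = M_a (composition)."""
    y = {(0, 0): y00, (0, 1): y01, (1, 0): y10, (1, 1): y11}
    return y[(1, m1)] - y[(0, m0)]


# Enumerate every combination of binary counterfactual outcomes and mediators.
all_match = all(
    sum(four_way(*values).values()) == total_effect(*values)
    for values in product((0, 1), repeat=6)
)
```

Here `all_match` is true when the 4 components sum to the total effect for each of the 64 possible binary response patterns.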

The intuition behind this decomposition is that if the exposure affects the outcome for a particular individual, then at least 1 of 4 things must be the case. One possibility is that the exposure might affect the outcome through pathways that do not require the mediator (ie, the exposure affects the outcome even when the mediator is absent); in other words, the first component is non-zero. A second possibility is that the exposure effect might operate only in the presence of the mediator (ie, there is an interaction), with the exposure itself not necessary for the mediator to be present (ie, the mediator itself would be present in the absence of the exposure, although the mediator is itself necessary for the exposure to have an effect on the outcome); in other words, the second component is non-zero. A third possibility is that the exposure effect might operate only in the presence of the mediator (ie, there is an interaction), with the exposure itself needed for the mediator to be present (ie, the exposure causes the mediator, and the presence of the mediator is itself necessary for the exposure to have an effect on the outcome); in other words, the third component is non-zero. The fourth possibility is that the mediator can cause the outcome in the absence of the exposure, but the exposure is necessary for the mediator itself to be present; in other words, the fourth component is non-zero. The decomposition above, proved in the Appendix, provides a mathematical formalization of this intuition. Having introduced the 4 components, we can write our decomposition as:

$$
\mathrm{TE} = \mathrm{CDE} + \mathrm{INT}_{\mathrm{ref}} + \mathrm{INT}_{\mathrm{med}} + \mathrm{PIE}.
$$

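As with the total effect of the exposure on the outcome, *Y*_{1} − *Y*_{0}, it is not in general possible to know the value of each of the 4 components for a particular individual, but (as discussed below) there are assumptions under which measures of these 4 components on average can be estimated for a particular population. As shown below, there are certain assumptions about confounding under which the average value of each of the 4 components is given by the following empirical expressions:

$$
\begin{aligned}
E[\mathrm{CDE}] &= p_{10} - p_{00}, \\
E[\mathrm{INT}_{\mathrm{ref}}] &= (p_{11} - p_{10} - p_{01} + p_{00})\,P(M = 1 \mid A = 0), \\
E[\mathrm{INT}_{\mathrm{med}}] &= (p_{11} - p_{10} - p_{01} + p_{00})\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\}, \\
E[\mathrm{PIE}] &= (p_{01} - p_{00})\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\},
\end{aligned}
$$

where *p*_{am} = *E*(*Y*|*A* = *a, M* = *m*). Letting *p*_{a} = *E*(*Y*|*A* = *a*) produces the following empirical decomposition:

$$
\begin{aligned}
p_{a=1} - p_{a=0} ={}& (p_{10} - p_{00}) \\
&+ (p_{11} - p_{10} - p_{01} + p_{00})\,P(M = 1 \mid A = 0) \\
&+ (p_{11} - p_{10} - p_{01} + p_{00})\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\} \\
&+ (p_{01} - p_{00})\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\}.
\end{aligned}
\tag{2}
$$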

With such average measures, it is possible to assess how much of the total effect is due to neither mediation nor interaction (the first component); how much is due to interaction but not mediation (the second component); how much is due to both mediation and interaction (the third component); and how much of the effect is due to mediation but not interaction (the fourth component). The 4 components of the total effect are summarized in Table 1. Letting *E*[TE] denote the average total effect for the population (equal to *p*_{a=1} − *p*_{a=0} = *E*(*Y*|*A* = 1) − *E*(*Y*|*A* = 0) in the absence of confounding), the proportion of the total effect that is due to each of these 4 components can be expressed using the ratios *E*[CDE]/*E*[TE], *E*[INT_{ref}]/*E*[TE], *E*[INT_{med}]/*E*[TE], and *E*[PIE]/*E*[TE]. We could also assess the overall proportion due to mediation by summing the proportions due to the mediated interaction and to the pure indirect effect, ie, (*E*[INT_{med}] + *E*[PIE])/*E*[TE]. Similarly, the overall proportion due to interaction can be assessed by summing the proportions due to the reference interaction and to the mediated interaction, ie, (*E*[INT_{ref}] + *E*[INT_{med}])/*E*[TE]. Reporting such proportion measures, however, generally makes sense only if all the components are in the same direction (eg, all positive or all negative). The statistical properties of such proportion measures can also be highly variable (and hence problematic) if the total effect is close to zero, as might be the case if some of the components were positive and others negative. Similar comments pertain to other proportion measures described below.
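The empirical components and the proportion measures can be computed directly from the conditional means *p*_{am} and the mediator probabilities *P*(*M* = 1|*A* = *a*). The sketch below assumes a binary exposure and mediator with no covariates; the input values and function names are hypothetical and chosen only so that all 4 components are positive.

```python
def empirical_four_way(p, pm):
    """p[(a, m)] = E(Y | A=a, M=m); pm[a] = P(M=1 | A=a). Binary A and M, no covariates."""
    interaction = p[(1, 1)] - p[(1, 0)] - p[(0, 1)] + p[(0, 0)]
    shift = pm[1] - pm[0]  # effect of the exposure on the mediator probability
    return {
        "CDE": p[(1, 0)] - p[(0, 0)],
        "INT_ref": interaction * pm[0],
        "INT_med": interaction * shift,
        "PIE": (p[(0, 1)] - p[(0, 0)]) * shift,
    }


def marginal_mean(p, pm, a):
    """E(Y | A=a), averaging p[(a, m)] over the mediator distribution given A=a."""
    return p[(a, 1)] * pm[a] + p[(a, 0)] * (1 - pm[a])


# Hypothetical inputs, for illustration only.
p = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.15, (1, 1): 0.40}
pm = {0: 0.30, 1: 0.50}

components = empirical_four_way(p, pm)
total = marginal_mean(p, pm, 1) - marginal_mean(p, pm, 0)

# Proportions of the total effect due to each component, and the two summary measures.
proportions = {name: value / total for name, value in components.items()}
proportion_mediated = proportions["INT_med"] + proportions["PIE"]
proportion_interaction = proportions["INT_ref"] + proportions["INT_med"]
```

Because the mediated interaction enters both summary measures, the proportion mediated and the proportion due to interaction need not sum to 1.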

The following section considers the no-confounding assumptions that allow estimation of these 4 components on average, and the sections after that consider the statistical methods to carry out such estimation. Later sections consider the relationships between this 4-fold decomposition and other concepts from the literatures on mediation and interaction that involve effect decomposition and attribution.

## IDENTIFICATION OF THE EFFECTS

The discussion thus far has been primarily conceptual. As noted, the individual-level effects in the 4-way decomposition cannot be identified from the data, but under certain no-confounding assumptions the 4 components can be identified from the data on average for a population. As discussed further in the Appendix, for a causal diagram interpreted as nonparametric structural equation models of Pearl,^{18} the following 4 assumptions suffice to identify each of the 4 components from the data: (1) the effect of the exposure *A* on the outcome *Y* is unconfounded conditional on *C*; (2) the effect of the mediator *M* on the outcome *Y* is unconfounded conditional on (*C, A*); (3) the effect of the exposure *A* on the mediator *M* is unconfounded conditional on *C*; and (4) none of the mediator-outcome confounders are themselves affected by the exposure. These are the same 4 assumptions often used in the literature on mediation.^{2},^{4},^{5} Letting *X* ⊥⊥ *Y* | *Z* denote that *X* is independent of *Y* conditional on *Z*, these 4 assumptions stated formally in terms of counterfactual independence are: (1) *Y*_{am} ⊥⊥ *A*|*C*, (2) *Y*_{am} ⊥⊥ *M*|{*A, C*}, (3) *M*_{a} ⊥⊥ *A*|*C*, and (4) *Y*_{am} ⊥⊥ *M*_{a*}|*C*. Note that assumption (4) requires that none of the mediator-outcome confounders are themselves affected by the exposure. This assumption would hold in Figure 1 but would be violated in Figure 2. If these 4 assumptions held without covariates, then we would have the empirical formulae given above:

$$
\begin{aligned}
E[\mathrm{CDE}] &= p_{10} - p_{00}, \\
E[\mathrm{INT}_{\mathrm{ref}}] &= (p_{11} - p_{10} - p_{01} + p_{00})\,P(M = 1 \mid A = 0), \\
E[\mathrm{INT}_{\mathrm{med}}] &= (p_{11} - p_{10} - p_{01} + p_{00})\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\}, \\
E[\mathrm{PIE}] &= (p_{01} - p_{00})\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\}.
\end{aligned}
$$

More general formulae involving covariates and with arbitrary (rather than binary) exposures and mediators are given in the Appendix.

The counterfactual statement of assumption (4), *Y*_{am} ⊥⊥ *M*_{a*}|*C*, is somewhat controversial as it involves what are sometimes called “cross-world” independencies. It would hold in Figure 1 interpreted as a nonparametric structural equation model^{18} but may not hold under other interpretations of causal diagrams.^{19} As noted above, the empirical equivalent of the 4-way decomposition is (2). As shown in the Appendix, this decomposition holds without any assumptions about confounding. However, interpreting each of the components causally does require assumptions about confounding. Assumptions (1)–(4) above allow the empirical components to be interpreted as population-average causal effects corresponding to the 4 components in the 4-way individual-level counterfactual decomposition: CDE, INT_{ref}, INT_{med}, and PIE. The Appendix discusses how a slightly weaker interpretation is also valid under assumptions (1)–(3) alone, without requiring the more controversial assumption (4).

Also of interest is the fact that the controlled direct effect, CDE, requires only assumptions (1) and (2) to be identified.^{1},^{2} This does not require the more controversial cross-world independence assumptions. The average controlled direct effect is sometimes subtracted from the average total effect to get a portion-eliminated measure *E*[PE] := *E*[TE] − *E*[CDE]. Whenever the total effect and the controlled direct effect can be identified, it is possible to calculate this portion-eliminated measure. Interestingly, as described further below, the 4-way decomposition gives a more mechanistic interpretation of this portion-eliminated measure: the portion eliminated is the sum of the reference interaction, the mediated interaction, and the pure indirect effect (PE = INT_{ref} + INT_{med} + PIE), ie, it is the portion due to either mediation or interaction or both. These 3 components cannot be empirically separated without using stronger assumptions such as (1)–(4) above. However, whenever the total effect and the controlled direct effect can be identified (which can be done under much weaker assumptions), it is possible also to obtain the sum of the 3 other components since they are simply the difference between the total effect and the controlled direct effect.

## RELATION TO STATISTICAL MODELS

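Suppose that assumptions (1)–(4) hold, that *Y* and *M* are continuous, and that the following regression models for *Y* and *M* are correctly specified:

$$
\begin{aligned}
E[Y \mid a, m, c] &= \theta_{0} + \theta_{1} a + \theta_{2} m + \theta_{3} a m + \theta_{4}' c, \\
E[M \mid a, c] &= \beta_{0} + \beta_{1} a + \beta_{2}' c.
\end{aligned}
$$

It is shown in the eAppendix (http://links.lww.com/EDE/A797) that for exposure levels *a* and *a**, and setting the mediator to 0 in the controlled direct effect (see the eAppendix, http://links.lww.com/EDE/A797, for other settings of mediator for the CDE), the 4 components are given by:

$$
\begin{aligned}
E[\mathrm{CDE} \mid c] &= \theta_{1}(a - a^{*}), \\
E[\mathrm{INT}_{\mathrm{ref}} \mid c] &= \theta_{3}(\beta_{0} + \beta_{1} a^{*} + \beta_{2}' c)(a - a^{*}), \\
E[\mathrm{INT}_{\mathrm{med}} \mid c] &= \theta_{3}\beta_{1}(a - a^{*})(a - a^{*}), \\
E[\mathrm{PIE} \mid c] &= (\theta_{2}\beta_{1} + \theta_{3}\beta_{1} a^{*})(a - a^{*}).
\end{aligned}
$$

If the exposure were binary, the controlled direct, reference interaction, mediated interaction, and pure indirect effects would, respectively, simply be: θ_{1}, θ_{3}(β_{0} + β′_{2}*c*), θ_{3}β_{1}, and θ_{2}β_{1}. Standard errors for estimators of these quantities could be derived using the delta method along the lines of VanderWeele and Vansteelandt^{4} or by using bootstrapping. SAS code to implement this approach to obtain estimates and confidence intervals is provided in the eAppendix (http://links.lww.com/EDE/A797). The eAppendix (http://links.lww.com/EDE/A797) likewise provides a straightforward modeling approach (with SAS code) when the mediator is binary rather than continuous.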
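The regression-based expressions can be evaluated from fitted coefficients with a few lines of code. The following Python sketch is not the SAS code of the eAppendix; it assumes a linear outcome model with an exposure-mediator product term and a linear mediator model, as in this section, takes the controlled direct effect at *M* = 0, and uses hypothetical coefficient values and function names. It returns point estimates only, with no standard errors.

```python
def regression_four_way(theta, beta, a, a_star, beta2_c=0.0):
    """Four components from linear-model coefficients (continuous Y and M).

    theta = (theta1, theta2, theta3): exposure, mediator, and product-term
    coefficients of the outcome model; beta = (beta0, beta1): intercept and
    exposure coefficient of the mediator model; beta2_c: the covariate
    contribution beta2' c evaluated at the chosen covariate values.
    """
    theta1, theta2, theta3 = theta
    beta0, beta1 = beta
    d = a - a_star
    return {
        "CDE": theta1 * d,
        "INT_ref": theta3 * (beta0 + beta1 * a_star + beta2_c) * d,
        "INT_med": theta3 * beta1 * d * d,
        "PIE": (theta2 * beta1 + theta3 * beta1 * a_star) * d,
    }


def total_effect(theta, beta, a, a_star, beta2_c=0.0):
    """E[Y_a - Y_a* | c], plugging the expected mediator into the outcome model."""
    theta1, theta2, theta3 = theta
    beta0, beta1 = beta

    def mean_outcome(x):
        m = beta0 + beta1 * x + beta2_c  # expected mediator under exposure x
        return theta1 * x + theta2 * m + theta3 * x * m  # intercept terms cancel

    return mean_outcome(a) - mean_outcome(a_star)


# Hypothetical coefficients, binary exposure (a = 1 versus a* = 0).
theta, beta = (0.5, 0.3, 0.2), (1.0, 0.4)
binary_components = regression_four_way(theta, beta, a=1, a_star=0)
```

With a binary exposure the 4 values reduce to θ_{1}, θ_{3}(β_{0} + β′_{2}*c*), θ_{3}β_{1}, and θ_{2}β_{1}, and their sum agrees with `total_effect` for any pair of exposure levels.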

## BINARY OUTCOMES AND THE RATIO SCALE

The definitions of these 4 components have, thus far, been considered on a difference scale. Often in epidemiology, risk ratios or odds ratios are used for convenience or ease of interpretation or to account for study design. By dividing the decomposition in (2) by *p*_{a=0}, we can rewrite this decomposition on the ratio scale as

$$
\begin{aligned}
RR_{a=1} - 1 ={}& k\,(RR_{10} - 1) \\
&+ k\,(RR_{11} - RR_{10} - RR_{01} + 1)\,P(M = 1 \mid A = 0) \\
&+ k\,(RR_{11} - RR_{10} - RR_{01} + 1)\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\} \\
&+ k\,(RR_{01} - 1)\,\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\},
\end{aligned}
$$

where RR_{a=1} = *p*_{a=1}/*p*_{a=0} is the relative risk for exposure *A* comparing *A* = 1 to the reference category *A* = 0, and RR_{am} = *p*_{am}/*p*_{00} is the relative risk for comparing categories *A* = *a, M* = *m* to the reference category *A* = 0, *M* = 0, and where *k* is a scaling factor that is given by *k* = *p*_{00}/*p*_{a=0}. Note also that the term (RR_{11} − RR_{10} − RR_{01} + 1) is Rothman’s excess relative risk due to interaction (RERI) and is a measure of additive interaction using ratios.^{20}

This ratio-scale decomposition involves decomposing the excess relative risk for the exposure *A*, RR_{a=1} − 1, into 4 components on the excess relative risk scale involving, as before, (1) the controlled direct effect of *A* when *M* = 0, (2) a reference interaction, (3) a mediated interaction, and (4) a mediated main effect. Although the right-hand side of the decomposition involves a scaling factor, if what we are interested in is the proportion of the effect attributable to each of the components, then if we take out any particular component and divide it by the sum of all the components, the scaling factor drops out. The proportion of the effect attributable to each of the 4 components is thus given by the expressions in Table 2.

The 4-fold proportion-attributable measures given in Table 2 allow us to estimate the proportion of the total effect attributable only to mediation (PA_{PIE}), due only to interaction (PA_{INTref}), due to both mediation and interaction (PA_{INTmed}), or due to neither mediation nor interaction (PA_{CDE}). The eAppendix (http://links.lww.com/EDE/A797) provides further technical details concerning the 4-way decomposition on the ratio scale and for obtaining estimates and confidence intervals using logistic regression for the outcome, along with linear regression for a continuous mediator or a second logistic regression for a binary mediator. SAS code to implement this approach is also given in the eAppendix (http://links.lww.com/EDE/A797).
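The claim that the scaling factor drops out of the proportion-attributable measures can be illustrated numerically. The sketch below assumes a binary exposure and mediator with no covariates; the relative risks, mediator probabilities, and function names are hypothetical, and each proportion is computed as a component divided by the sum of all 4 components rather than by copying the expressions of Table 2.

```python
def ratio_scale_four_way(rr10, rr01, rr11, pm0, pm1, k=1.0):
    """Excess-relative-risk components.

    rr_am: relative risk for A=a, M=m versus the reference A=0, M=0;
    pm0, pm1: P(M=1 | A=0) and P(M=1 | A=1); k: scaling factor p_00 / p_{a=0}.
    """
    reri = rr11 - rr10 - rr01 + 1.0  # excess relative risk due to interaction
    shift = pm1 - pm0
    return {
        "CDE": k * (rr10 - 1.0),
        "INT_ref": k * reri * pm0,
        "INT_med": k * reri * shift,
        "PIE": k * (rr01 - 1.0) * shift,
    }


def proportions_attributable(components):
    """Each component divided by the sum of all 4 components."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Hypothetical risks: p_00 = 0.10, p_01 = 0.20, p_10 = 0.15, p_11 = 0.40.
rr10, rr01, rr11 = 1.5, 2.0, 4.0
pm0, pm1 = 0.30, 0.50
k = 0.10 / 0.13  # p_00 / p_{a=0}, with p_{a=0} = 0.20 * 0.30 + 0.10 * 0.70

scaled = ratio_scale_four_way(rr10, rr01, rr11, pm0, pm1, k=k)
unscaled = ratio_scale_four_way(rr10, rr01, rr11, pm0, pm1)
pa_scaled = proportions_attributable(scaled)
pa_unscaled = proportions_attributable(unscaled)
```

The scaled components sum to the excess relative risk RR_{a=1} − 1, while `pa_scaled` and `pa_unscaled` coincide, so the proportions can be obtained from the relative risks and mediator probabilities alone.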

## ILLUSTRATION

The concepts and methods described above for the 4-way decomposition are illustrated here with an example from genetic epidemiology. Specifically, we consider the extent to which the effect of chromosome 15q25.1 rs8034191 C alleles on lung cancer risk is due to mediation by, or interaction with, cigarettes smoked per day. rs8034191 C alleles are known to be associated with both smoking^{26},^{27} and lung cancer,^{28–30} but there had been debate as to whether the effects on lung cancer were direct or mediated by smoking. VanderWeele et al^{31} used methods from the causal mediation analysis literature to assess whether the effect was direct or indirect and found that most of the effect was not mediated by cigarettes per day (the indirect effect was very small and the direct effect was large). In large meta-analyses, Truong et al^{32} found no association between the genetic variants and lung cancer among never-smokers, suggesting strong interaction between the variants and smoking behavior; VanderWeele et al^{31} likewise reported statistical evidence of interaction. Here, we will use the 4-way decomposition to assess how much of the effect is due to each of the components. Note that cigarettes per day is used as the mediator here and in prior analyses; cigarettes per day may not capture all aspects of smoking (eg, depth of inhalation).

Data are from 1836 cases and 1452 controls from a lung cancer case-control study at Massachusetts General Hospital.^{31},^{33} As the exposure we compare 2 versus 0 *C* alleles and use cigarettes per day as the mediator (the square root of this measure is used to make the measure more normally distributed). Covariates adjusted for in the analysis include sex, age, education, and smoking duration. Analyses are restricted to Caucasians. Because the outcome (lung cancer) is rare, odds ratios approximate risk ratios. A logistic regression model is fit for lung cancer on the variants, smoking, their interaction, and the covariates; and a linear regression model for smoking is fit on the variants and covariates. Confidence intervals are obtained using the delta method. Details of this modeling approach in the context of the 4-way decomposition are given in the eAppendix (http://links.lww.com/EDE/A797); SAS code is also provided. Results are summarized in Table 3. The overall risk ratio comparing 2 versus 0 *C* alleles was 1.77 (95% confidence interval [CI] = 1.33 to 2.21) for an excess relative risk of 1.77 − 1 = 0.77 (0.33 to 1.21). This excess relative risk decomposes into the 4 components. The component due to the pure indirect effect is 0.014 (−0.01 to 0.04); the component due to the mediated interaction is 0.034 (−0.02 to 0.09); the component due to the reference interaction is 0.42 (0.11 to 0.73); and the component due to the controlled direct effect (if smoking were fixed to 0) is 0.30 (−0.19 to 0.79). The 4 components sum to the excess relative risk: 0.014 + 0.034 + 0.42 + 0.30 ≈ 0.77. Of these 4 components, the reference interaction is most substantial, highlighting the important role of interaction in this context. 
The overall proportion mediated (the sum of the pure indirect effect and the mediated interaction, divided by the excess relative risk) is quite small at 6% (−3% to 15%), as had been indicated in the analyses of VanderWeele et al.^{31} The overall proportion attributable to interaction (the reference interaction plus the mediated interaction, divided by the excess relative risk) is relatively substantial, 59% (9% to 109%). Mediation may play a role here (and probably does, as the variants do affect smoking and smoking affects lung cancer), but interaction, between the variants and smoking, is clearly much more important.
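The arithmetic linking the 4 reported components to the summary proportions can be retraced as follows. This sketch uses only the point estimates quoted above; the confidence intervals come from the delta method applied to the fitted models and are not reproduced here. Variable names are illustrative.

```python
# Point estimates on the excess relative risk scale, as reported in the text.
components = {"PIE": 0.014, "INT_med": 0.034, "INT_ref": 0.42, "CDE": 0.30}
excess_relative_risk = 1.77 - 1

# The 4 components should approximately reproduce the excess relative risk.
component_sum = sum(components.values())

# Overall proportion mediated: pure indirect effect plus mediated interaction.
proportion_mediated = (
    components["PIE"] + components["INT_med"]
) / excess_relative_risk

# Overall proportion attributable to interaction: reference plus mediated interaction.
proportion_interaction = (
    components["INT_ref"] + components["INT_med"]
) / excess_relative_risk
```

Rounded to whole percentages, these reproduce the 6% and 59% figures given in the text.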

## RELATION TO MEDIATION DECOMPOSITIONS

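This section discusses the relations between the 4 components above and concepts from the mediation analysis literature, and the next section discusses the relations with the interaction-analysis literature. As above, the 4-fold decomposition is:

$$
\begin{aligned}
Y_{1} - Y_{0} ={}& (Y_{10} - Y_{00}) \\
&+ (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{0}) \\
&+ (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{1} - M_{0}) \\
&+ (Y_{01} - Y_{00})(M_{1} - M_{0}).
\end{aligned}
$$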

The first component, (*Y*_{10} − *Y*_{00}), is referred to in the mediation analysis literature as a controlled direct effect (CDE) of the exposure when fixing the mediator to level *M* = 0. Other controlled direct effects can also be considered, which set *M* to a level other than 0; the Discussion section and Appendix consider 4-way decompositions involving these alternative controlled direct effects. The fourth component in the 4-way decomposition, (*Y*_{01} − *Y*_{00})(*M*_{1} − *M*_{0}), which was referred to above as a “mediated main effect,” is equivalent to what in the mediation analysis literature is sometimes referred to as a pure indirect effect (PIE). It is shown in the Appendix that:

$$
(Y_{01} - Y_{00})(M_{1} - M_{0}) = Y_{0M_{1}} - Y_{0M_{0}}.
$$

The counterfactual contrast, *Y*_{0M_{1}} − *Y*_{0M_{0}}, in the mediation analysis literature is referred to as a “pure indirect effect”^{1} or as a type of “natural indirect effect.”^{2} This contrast *Y*_{0M_{1}} − *Y*_{0M_{0}} compares what would happen to the outcome if the mediator were changed from the level *M*_{0} (the level it would be in the absence of the exposure) to *M*_{1} (the level it would be in the presence of exposure) while in both counterfactual scenarios fixing the exposure itself to be absent. This pure indirect effect will be non-zero if and only if the exposure changes the mediator (so that *M*_{0} and *M*_{1} are different) and the mediator itself has an effect on the outcome even in the absence of the exposure. However, this is the same quantity as in our decomposition above, namely (*Y*_{01} − *Y*_{00})(*M*_{1} − *M*_{0}). Note that writing the pure indirect effect as (*Y*_{01} − *Y*_{00})(*M*_{1} − *M*_{0}) gives a representation of the pure indirect effect that does not require nested counterfactuals of the form *Y*_{0M_{1}}. This may be of interest as sometimes objections are made to the pure indirect effect on the grounds that nested counterfactuals of the form *Y*_{0M_{1}} are difficult to interpret. The third component, (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{1} − *M*_{0}), was recently considered in the mediation analysis literature and called a mediated interaction (INT_{med}).^{34} As discussed in the Appendix and elsewhere,^{34} this mediated interaction can also be written as *Y*_{1M_{1}} − *Y*_{0M_{1}} − *Y*_{1M_{0}} + *Y*_{0M_{0}}. The component not yet considered is the second component, (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{0}), referred to above as a reference interaction (INT_{ref}), for which there is no analog in the current literature. 
However, as shown in the Appendix, the sum of the first and second components does have an analog in the mediation analysis literature: it is equal to what is sometimes called the “pure direct effect” (PDE), defined as *Y*_{1M_{0}} − *Y*_{0M_{0}}. This compares what would happen to the outcome in the presence versus the absence of the exposure if, in both cases, the mediator were set to whatever it would be for that individual in the absence of exposure. Expressed algebraically, we have that

$$
Y_{1M_{0}} - Y_{0M_{0}} = (Y_{10} - Y_{00}) + (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{0}).
$$

The pure direct effect is the sum of a controlled direct effect (our first component) and the reference interaction (our second component). If in the 4-way decomposition above the first 2 components are replaced with the pure direct effect, and the fourth component is written as the pure indirect effect, then we obtain:

$$
Y_{1} - Y_{0} = (Y_{1M_{0}} - Y_{0M_{0}}) + (Y_{0M_{1}} - Y_{0M_{0}}) + (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{1} - M_{0}).
\tag{3}
$$

In other words, the total effect can be decomposed into a pure direct effect, a pure indirect effect, and a mediated interaction. This decomposition was the 3-way decomposition provided by VanderWeele^{34} in 2013. Before this, a 2-way decomposition was the norm in the mediation analysis literature. As discussed in the Appendix and in VanderWeele,^{34} the sum of the mediated interaction and the pure indirect effect is equal to what in the mediation analysis literature is sometimes called a “total indirect effect” (TIE), defined as *Y*_{1M_{1}} − *Y*_{1M_{0}}. Whereas the pure indirect effect, *Y*_{0M_{1}} − *Y*_{0M_{0}}, compares changing the mediator from *M*_{0} to *M*_{1} while fixing the exposure itself to be absent, the total indirect effect, *Y*_{1M_{1}} − *Y*_{1M_{0}}, compares changing the mediator from *M*_{0} to *M*_{1} while fixing the exposure to be present. With the total indirect effect so defined, we have TIE = PIE + INT_{med}, ie, *Y*_{1M_{1}} − *Y*_{1M_{0}} = *Y*_{0M_{1}} − *Y*_{0M_{0}} + (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{1} − *M*_{0}). It is possible then to combine the mediated interaction and the pure indirect effect in the decomposition in (3) into a total indirect effect to obtain the more standard 2-way decomposition in the mediation analysis literature:

$$
Y_{1} - Y_{0} = (Y_{1M_{1}} - Y_{1M_{0}}) + (Y_{1M_{0}} - Y_{0M_{0}}).
\tag{4}
$$

This is the decomposition used most often in the causal-inference literature when assessing direct and indirect effects; this 2-way decomposition was first proposed in 1992 by Robins and Greenland^{1}; and it is the decomposition on which most of the software packages for causal mediation analysis have focused.^{8},^{16} This 2-way decomposition also provides the counterfactual formalization for the decompositions typically used in the social-science literature on mediation.^{35} However, as seen above, the pure direct effect is itself a combination of 2 components: a controlled direct effect and the reference interaction (PDE = CDE + INT_{ref}); and the total indirect effect is a combination of 2 components, the pure indirect effect and the mediated interaction (TIE = PIE + INT_{med}). When these effects are estimated on average, a proportion-mediated measure, *E*[TIE]/*E*[TE], is sometimes used; this can also be rewritten as (*E*[PIE] + *E*[INT_{med}])/*E*[TE].

Yet another decomposition is worth noting in the mediation analysis literature. Sometimes the mediated interaction in the decomposition in (3) is combined with the pure direct effect, rather than with the pure indirect effect, for an alternative 2-way decomposition. As discussed in the Appendix and in VanderWeele,^{34} the sum of the mediated interaction and the pure direct effect is equal to what is sometimes called a “total direct effect” (TDE),^{1} defined as *Y*_{1M_{1}} − *Y*_{0M_{1}}. The total and the pure direct effects are sometimes also called “natural direct effects,”^{2} and the total and the pure indirect effects are sometimes called “natural indirect effects.”^{2} A summary of the various composite effects is given in Table 4.

Of interest here is that the total direct effect contains 3 components: the controlled direct effect, the reference interaction, and the mediated interaction. Moving from the first to the third of these components, the components involve the mediator in increasingly more substantial ways. The controlled direct effect, (*Y*_{10} − *Y*_{00}), operates completely independent of the mediator; for this to be non-zero, the direct effect must be present even when the mediator is absent. The reference interaction, (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{0}), requires the mediator to operate, but the effect does not come about by the exposure changing the mediator—it simply requires that the mediator is present even when the exposure is absent; the effect is “unmediated,” in the sense that it does not operate by the exposure changing the mediator, but it requires the presence of the mediator nonetheless. The third component, the mediated interaction, (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{1} − *M*_{0}), is a type of mediated effect; it requires that the exposure change the mediator, but it is also a direct effect insofar as an interaction must also be present (the effect of the exposure is different for different levels of the mediator); the third component thus not only involves the mediator, but it is a mediated effect and a direct effect as well. For this reason, this component is sometimes combined with the pure indirect effect to obtain the total indirect effect and sometimes combined with the pure direct effect to obtain the total direct effect.
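The relations between the composite effects and the 4 basic components can be verified by enumeration in the binary case. The sketch below is an illustration only, assuming a binary exposure and mediator and the nested-counterfactual definitions used in this section; the function names are hypothetical.

```python
from itertools import product


def composite_effects(y, m):
    """Composite effects from nested counterfactuals; y[(a, m)], m[a] are binary."""

    def nested(a, a_star):
        # Y_{a M_{a*}}: exposure a, mediator at the value it takes under a*.
        return y[(a, m[a_star])]

    return {
        "TE": nested(1, 1) - nested(0, 0),
        "PDE": nested(1, 0) - nested(0, 0),
        "TDE": nested(1, 1) - nested(0, 1),
        "PIE": nested(0, 1) - nested(0, 0),
        "TIE": nested(1, 1) - nested(1, 0),
    }


def basic_components(y, m):
    """The 4 basic components written without nested counterfactuals."""
    interaction = y[(1, 1)] - y[(1, 0)] - y[(0, 1)] + y[(0, 0)]
    return {
        "CDE": y[(1, 0)] - y[(0, 0)],
        "INT_ref": interaction * m[0],
        "INT_med": interaction * (m[1] - m[0]),
        "PIE": (y[(0, 1)] - y[(0, 0)]) * (m[1] - m[0]),
    }


def identities_hold(y, m):
    c, b = composite_effects(y, m), basic_components(y, m)
    return (
        c["PDE"] == b["CDE"] + b["INT_ref"]
        and c["TIE"] == b["PIE"] + b["INT_med"]
        and c["TDE"] == b["CDE"] + b["INT_ref"] + b["INT_med"]
        and c["PIE"] == b["PIE"]
        and c["TE"] == c["PDE"] + c["TIE"] == c["TDE"] + c["PIE"]
    )


# Check all 64 binary response patterns.
all_hold = all(
    identities_hold(
        {(0, 0): y00, (0, 1): y01, (1, 0): y10, (1, 1): y11}, {0: m0, 1: m1}
    )
    for y00, y01, y10, y11, m0, m1 in product((0, 1), repeat=6)
)
```

The mediated interaction appears in both the total direct effect and the total indirect effect, which is why it can be grouped with either in a 2-way decomposition.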

Combining the pure direct effect and the mediated interaction to get the total direct effect, TDE := Y_{1M1} − Y_{0M1} = PDE + INT_{med}, gives an alternative 2-way decomposition of the total effect into the sum of the total direct effect and the pure indirect effect:

$$Y_{1} - Y_{0} = (Y_{1M_{1}} - Y_{0M_{1}}) + (Y_{0M_{1}} - Y_{0M_{0}}) = \mathrm{TDE} + \mathrm{PIE}. \tag{5}$$

This decomposition was likewise proposed by Robins and Greenland^{1} in 1992. Relatively easy-to-use software is available to estimate the components of the 2-way decompositions in Equations (4) and (5) on average for a population, under the assumptions described earlier in the article. Note that in the decomposition in (5), the total direct effect consists of 3 of the 4 basic components (the controlled direct effect, the reference interaction, and the mediated interaction), whereas the pure indirect effect constitutes a single component. The mediated interaction is, however, arguably part of the effect that is mediated; thus, when questions of mediation are of interest, it is (4), rather than (5), that is arguably to be preferred when assessing the extent of mediation.^{9},^{10},^{34} However, whether the pure indirect effect or the total indirect effect is of interest may depend on the context.^{6}
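
The relationships among these composite effects can be checked numerically for a single individual. The following is a minimal sketch rather than the software referred to above: the class and function names are hypothetical, the potential-outcome values are invented for illustration, and the total effect is computed as Y_{1M1} − Y_{0M0}, which assumes the usual composition property.

```python
from dataclasses import dataclass


@dataclass
class Individual:
    """Potential outcomes of one individual; all values are hypothetical."""
    y: dict   # y[(a, m)] = outcome if exposure is set to a and mediator to m
    m0: int   # mediator value under exposure 0
    m1: int   # mediator value under exposure 1


def four_components(u):
    """The 4 basic components for a binary exposure and binary mediator."""
    interaction = u.y[(1, 1)] - u.y[(1, 0)] - u.y[(0, 1)] + u.y[(0, 0)]
    return {
        "CDE": u.y[(1, 0)] - u.y[(0, 0)],
        "INTref": interaction * u.m0,
        "INTmed": interaction * (u.m1 - u.m0),
        "PIE": (u.y[(0, 1)] - u.y[(0, 0)]) * (u.m1 - u.m0),
    }


def composite_effects(u):
    """Composite effects taken from their nested-counterfactual definitions."""
    return {
        "TE": u.y[(1, u.m1)] - u.y[(0, u.m0)],   # Y_{1M1} - Y_{0M0}
        "PDE": u.y[(1, u.m0)] - u.y[(0, u.m0)],  # Y_{1M0} - Y_{0M0}
        "TDE": u.y[(1, u.m1)] - u.y[(0, u.m1)],  # Y_{1M1} - Y_{0M1}
        "PIE": u.y[(0, u.m1)] - u.y[(0, u.m0)],  # Y_{0M1} - Y_{0M0}
        "TIE": u.y[(1, u.m1)] - u.y[(1, u.m0)],  # Y_{1M1} - Y_{1M0}
    }


person = Individual(y={(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 3}, m0=0, m1=1)
c = four_components(person)
e = composite_effects(person)

# The total direct effect is the sum of 3 of the 4 basic components.
assert e["TDE"] == c["CDE"] + c["INTref"] + c["INTmed"]
# Both 2-way decompositions recover the same total effect.
assert e["TE"] == e["PDE"] + e["TIE"] == e["TDE"] + e["PIE"]
```

For this hypothetical individual, the mediated interaction is counted in the total indirect effect under the first 2-way decomposition and in the total direct effect under the second, which is the only difference between them.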

A final measure used in the mediation analysis literature is sometimes referred to as the “portion eliminated” (PE).^{1},^{12} As noted above, this is generally defined as the difference between the total effect and the controlled direct effect: PE := (*Y*_{1} − *Y*_{0}) − CDE. This is the portion of the effect of the exposure that would be eliminated if the mediator were fixed to 0. The portion eliminated may be of interest insofar as it allows one to assess how much of the effect of the exposure can be eliminated or prevented by intervening on the mediator; for this reason, it is sometimes regarded as of policy interest.^{1},^{6},^{12} The 4-way decomposition, above, shows that this portion-eliminated measure is equal to the sum of the other 3 components: the reference interaction, the mediated interaction, and the pure indirect effect, ie, PE = INT_{ref} + INT_{med} + PIE; the total effect can then be written as TE = CDE + PE. The 4-way decomposition provides a causal interpretation for the difference between the total effect and the controlled direct effect: it is the portion of the effect attributable to mediation, interaction, or both. When the portion eliminated is estimated at the population level, sometimes a proportion-eliminated measure is also calculated as *E*[TE − CDE]/*E*[TE], which could also be rewritten as *E*[INT_{ref} + INT_{med} + PIE]/*E*[TE]. This is different from the proportion-mediated measure considered earlier, which was *E*[INT_{med} + PIE]/*E*[TE]. The proportion eliminated includes in the numerator the reference interaction (since this part of the effect is eliminated if the mediator is removed); the proportion mediated does not include the reference interaction in the numerator (since this is not part of the mediated effect).^{12}
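
The difference between the two proportions can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the component values are hypothetical, and it takes the proportion eliminated to be (TE − CDE)/TE and the proportion mediated to be (INT_{med} + PIE)/TE, in line with the description of the two numerators above.

```python
def proportions(cde, int_ref, int_med, pie):
    """Proportion eliminated and proportion mediated from average components.

    Arguments are population averages of the 4 components; the total effect
    must be non-zero for the ratios to be defined.
    """
    te = cde + int_ref + int_med + pie
    return {
        "TE": te,
        "proportion_eliminated": (te - cde) / te,       # includes INTref
        "proportion_mediated": (int_med + pie) / te,    # excludes INTref
    }


# Hypothetical average components on a risk-difference scale.
result = proportions(cde=0.10, int_ref=0.04, int_med=0.08, pie=0.02)
gap = result["proportion_eliminated"] - result["proportion_mediated"]

# The two measures differ by exactly the share due to the reference interaction.
assert abs(gap - 0.04 / result["TE"]) < 1e-12
```

With these invented values, the two proportions differ only by the reference-interaction share of the total effect, so they coincide whenever the average reference interaction is 0.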

There are thus a number of possible decompositions. However, for addressing questions of mediation, there is no need to choose between the 2-way decompositions or even the 3-way decomposition; the 4-way decomposition provides 4 components capturing all the subtleties: the portion of the total effect that is attributable just to mediation, just to interaction, to both mediation and interaction, or to neither mediation nor interaction. The various decompositions within the context of mediation are summarized in Table 5, but the 4-way decomposition here essentially provides a framework that encompasses them all.

## RELATION TO INTERACTION DECOMPOSITIONS

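VanderWeele and Tchetgen Tchetgen^{24} recently considered how to identify the portion of the total effect of one exposure on an outcome that is attributable to an interaction with a second exposure. This is related to the 4-way decomposition as follows. The 4-way decomposition above was expressed as:

$$Y_{1} - Y_{0} = (Y_{10} - Y_{00}) + (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{0}) + (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{1} - M_{0}) + (Y_{01} - Y_{00})(M_{1} - M_{0}),$$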

which can also be written as: TE = CDE + INT_{ref} + INT_{med} + PIE. Suppose now that instead of considering how much of the total effect is mediated versus direct, as in the previous section, we were interested in the portion due to interaction. In the 4-way decomposition, 2 of the 4 components (the second and the third) involve an interaction. The portion attributable to interaction could then be defined as their sum: PAI := INT_{ref} + INT_{med} = (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{0}) + (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{1} − *M*_{0}) = (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*_{1}), resulting in the following 3-way decomposition:

$$Y_{1} - Y_{0} = (Y_{10} - Y_{00}) + (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{1}) + (Y_{01} - Y_{00})(M_{1} - M_{0}). \tag{6}$$

The total effect can be decomposed into the effect of *A* with *M* absent (CDE), a portion attributable to interaction (PAI), and a pure indirect effect (PIE). Consider now the empirical analog of this decomposition using the expressions in (2). Let *p*_{am} = *E*[*Y*|*A* = *a*, *M* = *m*], *p*_{a} = *E*[*Y*|*A* = *a*], and *p*_{m} = *E*[*Y*|*M* = *m*]. It follows from (2) that:

$$(p_{a=1} - p_{a=0}) = (p_{10} - p_{00}) + (p_{11} - p_{10} - p_{01} + p_{00})\,P(M = 1 \mid A = 1) + (p_{01} - p_{00})\{P(M = 1 \mid A = 1) - P(M = 1 \mid A = 0)\}. \tag{7}$$

This again decomposes the average total effect of *A* into what is essentially the average controlled direct effect, the average portion attributable to interaction, and the average pure indirect effect. The middle component is the component due to interaction, and the proportion of the effect due to interaction could then be assessed by: (*p*_{11} − *p*_{10} − *p*_{01} + *p*_{00})*P*(*M* = 1|*A* = 1)/(*p*_{a = 1} − *p*_{a = 0}).
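
As an illustration of the empirical 3-way decomposition, the sketch below computes the three terms and the proportion due to interaction from conditional means and mediator probabilities. The function name and all input values are hypothetical; the three terms are the average controlled direct effect, the interaction contrast multiplied by P(M = 1|A = 1), and the pure indirect term (p_{01} − p_{00}){P(M = 1|A = 1) − P(M = 1|A = 0)}, the last being written out here as an assumption about the form of the third term.

```python
def three_way_empirical(p, pm1_a1, pm1_a0):
    """Empirical 3-way decomposition for binary A and M.

    p[(a, m)] = E[Y | A=a, M=m]; pm1_a1 = P(M=1 | A=1); pm1_a0 = P(M=1 | A=0).
    """
    interaction = p[(1, 1)] - p[(1, 0)] - p[(0, 1)] + p[(0, 0)]
    cde = p[(1, 0)] - p[(0, 0)]
    pai = interaction * pm1_a1
    pie = (p[(0, 1)] - p[(0, 0)]) * (pm1_a1 - pm1_a0)
    total = cde + pai + pie
    return {"CDE": cde, "PAI": pai, "PIE": pie, "TE": total,
            "proportion_interaction": pai / total}


# Hypothetical conditional risks and mediator probabilities.
p = {(0, 0): 0.05, (0, 1): 0.10, (1, 0): 0.15, (1, 1): 0.40}
out = three_way_empirical(p, pm1_a1=0.6, pm1_a0=0.2)

# The three terms add up to p_{a=1} - p_{a=0}, computed here directly from
# the law of total expectation.
p_a1 = p[(1, 0)] * (1 - 0.6) + p[(1, 1)] * 0.6
p_a0 = p[(0, 0)] * (1 - 0.2) + p[(0, 1)] * 0.2
assert abs(out["TE"] - (p_a1 - p_a0)) < 1e-12
```

When the two mediator probabilities passed to the function are equal, the pure indirect term is 0 and only the controlled direct effect and the interaction term remain.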

The decomposition given above in (7) is that which VanderWeele and Tchetgen Tchetgen^{24} used when attributing effects to interactions. Several points are worth noting. First, the decompositions in (6) and (7) for the portion attributable to interaction follow from the 4-way decomposition. The decomposition in (6) is the decomposition at the individual counterfactual level analogous to the empirical decomposition in (7) given by VanderWeele and Tchetgen Tchetgen.^{24} Second, VanderWeele and Tchetgen Tchetgen considered 2 cases, one in which *A* and *M* were statistically independent in distribution and another in which they were not. The decomposition in (7) was that which was proposed when *A* affected *M*. When *A* and *M* are independent, *P*(*M* = 1|*A* = 1) = *P*(*M* = 1|*A* = 0) = *P*(*M* = 1), so the pure indirect term vanishes and the decomposition in (7) reduces to (*p*_{a = 1} − *p*_{a = 0}) = (*p*_{10} − *p*_{00}) + (*p*_{11} − *p*_{10} − *p*_{01} + *p*_{00})*P*(*M* = 1), with a similar decomposition for the total effect of *M* on *Y*: (*p*_{m = 1} − *p*_{m = 0}) = (*p*_{01} − *p*_{00}) + (*p*_{11} − *p*_{10} − *p*_{01} + *p*_{00})*P*(*A* = 1). Similarly, on a ratio scale, when *A* does not affect *M*, the third and fourth components in Table 2 become 0, leaving

and

, which are also the expressions given by VanderWeele and Tchetgen Tchetgen^{24} for attributing effects to interactions on a ratio scale. When *A* affects *M*, the decomposition for the total effect of *A* on *Y* is altered, and we must use the decomposition in (7). Finally, when *A* does not affect *M*, there is an analogous individual counterfactual-level decomposition, as (6) then reduces to: TE = (*Y*_{10} − *Y*_{00}) + (*Y*_{11} − *Y*_{10} − *Y*_{01} + *Y*_{00})(*M*), since when *A* does not affect *M*, *M*_{1} = *M*_{0} = *M*. All this also follows from the 4-way decomposition, which encompasses all the prior decompositions. These decompositions are summarized in Table 6.

In the more general setting, when *A* affects *M*, it is possible to estimate the portion due to interaction on the average level using the 3-way decomposition in (7). But there is no need to use only a 3-way decomposition; we can instead use the 4-way decomposition in (1) and the empirical expressions in (2) to further divide the portion due to interaction into that which is due to interaction but not mediation (the reference interaction, *E*[INT_{ref}]) and the portion due to interaction and mediation (the mediated interaction *E*[INT_{med}]). Such a 4-way decomposition, in which the portion attributable to interaction is itself further divided, may give additional insight.
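
To illustrate how the portion due to interaction can be split empirically, the sketch below simulates a simple data set and estimates the 4 components by plugging sample means into the binary-case expressions. It is not the software from the eAppendix: the data-generating process, function names, and parameter values are all hypothetical, there are no covariates, and the exposure is randomized so that the confounding assumptions hold by construction.

```python
import random


def simulate(n, seed=0):
    """Hypothetical binary A, M, Y with an additive A-by-M interaction."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = int(rng.random() < 0.5)
        m = int(rng.random() < 0.2 + 0.4 * a)             # A raises P(M = 1)
        p_y = 0.05 + 0.10 * a + 0.05 * m + 0.20 * a * m   # risk of Y = 1
        rows.append((a, m, int(rng.random() < p_y)))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def four_way_empirical(rows):
    """Plug-in estimates of the 4 average components (binary A and M)."""
    p = {(a, m): mean(y for (ai, mi, y) in rows if (ai, mi) == (a, m))
         for a in (0, 1) for m in (0, 1)}
    pm1 = {a: mean(mi for (ai, mi, _) in rows if ai == a) for a in (0, 1)}
    interaction = p[(1, 1)] - p[(1, 0)] - p[(0, 1)] + p[(0, 0)]
    return {
        "CDE": p[(1, 0)] - p[(0, 0)],
        "INTref": interaction * pm1[0],             # interaction, not mediation
        "INTmed": interaction * (pm1[1] - pm1[0]),  # interaction and mediation
        "PIE": (p[(0, 1)] - p[(0, 0)]) * (pm1[1] - pm1[0]),
    }


rows = simulate(20000)
est = four_way_empirical(rows)
observed = (mean(y for (a, _, y) in rows if a == 1)
            - mean(y for (a, _, y) in rows if a == 0))

# The 4 estimated components add up to the observed difference in means.
assert abs(sum(est.values()) - observed) < 1e-9
```

In this simulation the sum of the two interaction components is the estimated portion attributable to interaction, so reporting them separately loses nothing relative to the 3-way decomposition.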

Perhaps most importantly, this 4-way decomposition, which helps better understand the portions of a total effect due to interaction, is exactly the same decomposition that was used above to give insight into which portions of the total effect were mediated and which portions were direct. The same 4-way decomposition was useful in assessing both mediation and interaction. The same 4 components are used in assessing mediation and interaction, but the components are combined in different ways to assess these different phenomena. The 4-way decomposition itself provides a unification of mediation and interaction. The 4-fold decomposition underlies the various more specific decompositions in assessing both mediation and interaction. As illustrated in Figure 3, the 4 components form the backbone of both the various mediation decompositions (Figures 3–5) and the interaction decomposition (Figure 3). Once again, however, the greatest insight is gained when the 4-fold approach is used to assess simultaneously the portions of the total effect that are due only to mediation, only to interaction, to both mediation and interaction, and to neither mediation nor interaction.

## DISCUSSION

A 4-way decomposition has been given here that encompasses and unites previous decompositions in the literature, both concerning mediation and concerning interaction. The results provide a mechanistic interpretation of the difference between a total effect and a controlled direct effect. This contrast has been used to assess policy implications, and it is more easily identified than many other causal quantities concerning mediation; the results here show that it has a mechanistic interpretation as well. The 4-way decomposition can be carried out on a difference scale and on a ratio scale; its components can be related to standard statistical models; and software code is provided in the eAppendix (http://links.lww.com/EDE/A797) to estimate the various components of the decomposition using regression models. In addition to reporting the 4 components, an investigator can easily report the overall proportion attributable to interaction, the overall proportion mediated, and the proportion of the effect that would be eliminated if the mediator were removed. As seen in the empirical example in genetic epidemiology, this approach can give considerable insight into the relationships of an exposure and a mediator with an outcome and into the role of both mediation and interaction in these relationships.

The focus here has been on a binary exposure and binary mediator, with the controlled direct effect being that in which the mediator is fixed to be absent. More general results are given in the Appendix, and the approach in fact applies to arbitrary exposures and mediators. Moreover, instead of focusing on a controlled direct effect with the mediator absent, one can consider controlled direct effects that fix the mediator to some other level, *m**. Similar 4-way decompositions can be carried out, wherein the first component is the controlled direct effect with the mediator fixed to level *m**. When this is done, the reference interaction term changes because, with the mediator fixed to *m** (rather than 0), the controlled direct effect picks up some of the effect of the interaction between the exposure and the mediator. With a controlled direct effect in which the mediator is fixed to *m**, the interpretation of the reference interaction is then the portion of the effect due to the interaction between the exposure and the mediator that is not mediated and also not captured by the controlled direct effect. Again, the results in the Appendix cover more general settings and will thus likely be of use in a variety of contexts. The code in the eAppendix (http://links.lww.com/EDE/A797) likewise provides practical and relatively easy-to-use software tools to implement the approaches here in a wide range of settings.
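
The dependence of the first two components on the chosen reference level can be seen in a small numerical sketch. The code below is illustrative only: the function name and values are hypothetical, and it assumes that the general components take the form CDE(*m**) = Y_{am*} − Y_{a*m*}, with the reference interaction being the remainder of the pure direct effect once CDE(*m**) is removed, a form chosen because it reduces to the binary-case expressions when *m** = 0.

```python
def four_way_general(y, m_a, m_astar, a, a_star, m_ref):
    """Individual-level components with the mediator reference level m_ref.

    y[(a, m)] is the potential outcome Y_{am}; m_a and m_astar are the
    mediator values under exposure a and a_star. Each indicator 1(M = m)
    selects a single level, so sums over m reduce to direct look-ups.
    """
    def exposure_effect(m):
        # Effect of a versus a_star with the mediator held at m.
        return y[(a, m)] - y[(a_star, m)]

    cde = exposure_effect(m_ref)
    return {
        "CDE": cde,
        "INTref": exposure_effect(m_astar) - cde,
        "INTmed": exposure_effect(m_a) - exposure_effect(m_astar),
        "PIE": y[(a_star, m_a)] - y[(a_star, m_astar)],
    }


# Hypothetical potential outcomes for one individual with M_0 = 0 and M_1 = 1.
y = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 3}
ref0 = four_way_general(y, m_a=1, m_astar=0, a=1, a_star=0, m_ref=0)
ref1 = four_way_general(y, m_a=1, m_astar=0, a=1, a_star=0, m_ref=1)

# Changing the reference level shifts effect between CDE and INTref only.
assert ref0["INTmed"] == ref1["INTmed"] and ref0["PIE"] == ref1["PIE"]
assert sum(ref0.values()) == sum(ref1.values()) == y[(1, 1)] - y[(0, 0)]
```

For this hypothetical individual, fixing the mediator at 1 rather than 0 moves one unit of effect from the reference interaction into the controlled direct effect, while the mediated interaction, the pure indirect effect, and the total are unchanged.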

The central limitations of the approach developed here are the strong assumptions about confounding and the absence of measurement error. These assumptions are, however, the same assumptions as those made in the literature on mediation that focuses only on simpler decompositions. Future research could examine the robustness of each of the 4 components to confounding and measurement error. For example, recent work indicates that interaction terms may be more robust to confounding,^{36} but that interaction terms when the 2 exposures are correlated may be particularly sensitive to measurement error^{37},^{38}; different components may be robust to different forms of bias. Future work could also extend existing sensitivity analysis techniques for mediation and interaction^{7},^{8},^{36},^{38} to each of the 4 components.

Prior work on mediation within the counterfactual framework has accommodated potential interaction. The present approach clarifies the role of interaction, and its separate contribution beyond mediation, and unites, within a single framework, the phenomena of mediation and interaction.

## ACKNOWLEDGMENTS

*The author thanks the reviewers for helpful comments.*

## APPENDIX

In the Appendix, we will no longer restrict attention to binary exposure and mediator and will consider an arbitrary exposure and mediator. We will assume we are comparing 2 exposure levels *a* and *a**. We give the general 4-way decomposition result in Proposition 1.

*Proposition 1*. For any level *m** of *M* we have Y_{a} – Y_{a*}

*Proof*. We have that Y_{a} – Y_{a*}

This completes the proof.

The 4 components of the decomposition in general form are thus

Note we can also rewrite

and we can rewrite

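. Doing so with binary *A* and *M* and setting *a* = 1, *a** = 0, *m** = 0 gives us the decomposition in (1) in the text:

$$Y_{1} - Y_{0} = (Y_{10} - Y_{00}) + (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{0}) + (Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_{1} - M_{0}) + (Y_{01} - Y_{00})(M_{1} - M_{0}).$$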
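
For reference, one way to write the general decomposition that is consistent with the binary case is sketched below. This is a reconstruction rather than the typeset statement of Proposition 1: it assumes the composition property Y_a = Y_{aM_a} and a discrete mediator, and the indicator notation 1(·) is introduced here for illustration.

```latex
\begin{align*}
Y_a - Y_{a^*}
  &= Y_{aM_a} - Y_{a^*M_{a^*}} \\
  &= \underbrace{(Y_{am^*} - Y_{a^*m^*})}_{\mathrm{CDE}(m^*)}
   + \underbrace{\sum_m (Y_{am} - Y_{a^*m} - Y_{am^*} + Y_{a^*m^*})\,
       1(M_{a^*} = m)}_{\mathrm{INT}_{\mathrm{ref}}(m^*)} \\
  &\quad
   + \underbrace{\sum_m (Y_{am} - Y_{a^*m} - Y_{am^*} + Y_{a^*m^*})\,
       \{1(M_a = m) - 1(M_{a^*} = m)\}}_{\mathrm{INT}_{\mathrm{med}}}
   + \underbrace{\sum_m (Y_{a^*m} - Y_{a^*m^*})\,
       \{1(M_a = m) - 1(M_{a^*} = m)\}}_{\mathrm{PIE}}
\end{align*}
```

Because the indicator differences sum to 0 over *m*, the terms involving *m** drop out of the last two sums, which therefore equal (Y_{aM_a} − Y_{a*M_a}) − (Y_{aM_{a*}} − Y_{a*M_{a*}}) and Y_{a*M_a} − Y_{a*M_{a*}}, respectively; the first two terms sum to Y_{aM_{a*}} − Y_{a*M_{a*}}, so the four terms telescope to Y_{aM_a} − Y_{a*M_{a*}}.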

The decomposition also has an empirical analog given in the next Proposition.

*Proposition 2*. For any level *m** of *M*, we have

*Proof*. We have that E[Y | a, c] – E[Y | a*, c]

This completes the proof.

Note we can also rewrite the third term as

and the fourth term as

Doing so with binary *A* and *M*, and setting *a* = 1, *a** = 0, *m** = 0 gives decomposition (2) in the text:
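
The purely algebraic nature of the empirical decomposition can be checked numerically for a mediator with more than 2 levels. The sketch below is illustrative: the function names and input values are hypothetical, and the four terms are written in an assumed form (conditional means of *Y* weighted by conditional mediator probabilities, within a single stratum *c*) that reduces to the binary-case expressions when *m** = 0.

```python
def empirical_four_way(ey, pm, a, a_star, m_ref):
    """Four empirical terms within one covariate stratum c.

    ey[(a, m)] = E[Y | a, m, c]; pm[a][m] = P(M = m | a, c).
    """
    levels = list(pm[a])

    def interaction(m):
        return (ey[(a, m)] - ey[(a_star, m)]
                - ey[(a, m_ref)] + ey[(a_star, m_ref)])

    # Change in the mediator distribution when moving from a_star to a.
    shift = {m: pm[a][m] - pm[a_star][m] for m in levels}
    return {
        "CDE": ey[(a, m_ref)] - ey[(a_star, m_ref)],
        "INTref": sum(interaction(m) * pm[a_star][m] for m in levels),
        "INTmed": sum(interaction(m) * shift[m] for m in levels),
        "PIE": sum((ey[(a_star, m)] - ey[(a_star, m_ref)]) * shift[m]
                   for m in levels),
    }


def total_association(ey, pm, a, a_star):
    """E[Y | a, c] - E[Y | a*, c], by the law of total expectation."""
    return (sum(ey[(a, m)] * pm[a][m] for m in pm[a])
            - sum(ey[(a_star, m)] * pm[a_star][m] for m in pm[a_star]))


# Hypothetical conditional means and mediator distributions (3 levels).
ey = {(0, 0): 1.0, (0, 1): 1.5, (0, 2): 2.0,
      (1, 0): 1.2, (1, 1): 2.1, (1, 2): 3.4}
pm = {0: {0: 0.6, 1: 0.3, 2: 0.1}, 1: {0: 0.3, 1: 0.4, 2: 0.3}}

for m_ref in (0, 1, 2):
    terms = empirical_four_way(ey, pm, a=1, a_star=0, m_ref=m_ref)
    assert abs(sum(terms.values()) - total_association(ey, pm, 1, 0)) < 1e-12
```

The identity holds for every reference level and requires only that each conditional mediator distribution sums to 1; no causal assumption enters the calculation.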

Note that the decomposition in Proposition 2 is a property of the expectations and probabilities. It does not require confounding assumptions. However, to interpret the components as causal effects, confounding assumptions are required. We will begin our discussion of confounding by first considering nonparametric structural equations.^{18} Consider the following 4 confounding assumptions: (1) the effect of the exposure *A* on the outcome *Y* is unconfounded conditional on *C*; (2) the effect of the mediator *M* on the outcome *Y* is unconfounded conditional on (*C*, *A*); (3) the effect of the exposure *A* on the mediator *M* is unconfounded conditional on *C*; and (4) none of the mediator-outcome confounders are themselves affected by the exposure. If we let *X* ⊥⊥ *Y* | *Z* denote that *X* is independent of *Y* conditional on *Z*, then these 4 assumptions stated formally in terms of counterfactual independence are: (1) *Y*_{am} ⊥⊥ *A* | *C*, (2) *Y*_{am} ⊥⊥ *M* | {*A*, *C*}, (3) *M*_{a} ⊥⊥ *A* | *C*, and (4) *Y*_{am} ⊥⊥ *M*_{a*} | *C*.

*Proposition 3.* Under assumptions (1)–(4) we have:

*Proof*. The first equality is established by Robins,^{39} the fourth by Pearl,^{2} and the third by VanderWeele.^{34} For the second equality, we have *E*[INT_{ref}(*m**) | *C*]

where the second equality follows by assumption (4) and the fourth by assumptions (1)–(3). In fact, the other 3 equalities in Proposition 3 can be established in much the same way. This completes the proof.

We can also interpret the terms in the decomposition in Proposition 2 causally under assumptions (1)–(3) alone, although the causal interpretation is slightly weaker, as shown in the next proposition.

*Proposition 4*. Under assumptions (1)–(3) we have:

*Proof*. The first equality is established by Robins,^{39} the second in the final 4 lines of the proof of Proposition 3 above, the third in VanderWeele,^{34} and the fourth, using slightly different notation, by Didelez et al.^{40} This completes the proof.

Note we can also rewrite the right side of the third equality as

and the right side of the fourth equality as

. The right sides of the equalities in Proposition 4 are causal quantities, but rather than directly taking population averages of the 4 components of the decomposition, the effects of *A* and *M* on *Y* are integrated over the distribution of *M* under different exposure settings. As discussed further in the eAppendix, these effects can be interpreted as randomized interventional analogs of the 4 components of the decomposition. They require only assumptions (1)–(3) for identification (ie, they do not require the more controversial cross-world independence assumption (4)), but the causal interpretation of these randomized interventional analogs is somewhat weaker.

Finally, we note that some earlier literature on interaction with binary exposures made use of different response types, where individuals were classified according to their joint counterfactual outcomes (*Y*_{00}, *Y*_{01}, *Y*_{10}, *Y*_{11}). Under the response-type classification for mediation given by Hafeman and VanderWeele^{41} (in which the monotonicity assumption that *Y*_{am} is nondecreasing in *a* and in *m* is assumed to hold), the 4 components of the 4-way decomposition could be written as CDE(0) = *Y*_{(2)} + *Y*_{(4)}, INT_{ref}(0) = *M*_{(1)}(*Y*_{(8)} − *Y*_{(2)}), INT_{med} = *M*_{(2)}(*Y*_{(8)} − *Y*_{(2)}), and PIE = *M*_{(2)}(*Y*_{(2)} + *Y*_{(6)}), where *M*_{(1)} is a binary indicator such that *M*_{(1)} = 1 if *M*_{0} = *M*_{1} = 1 and *M*_{(1)} = 0 otherwise; *M*_{(2)} is a binary indicator such that *M*_{(2)} = 1 if *M*_{0} = 0, *M*_{1} = 1, and *M*_{(2)} = 0 otherwise; *Y*_{(2)} is a binary indicator such that *Y*_{(2)} = 1 if (*Y*_{00} = 0, *Y*_{01} = 1, *Y*_{10} = 1, *Y*_{11} = 1) and *Y*_{(2)} = 0 otherwise; *Y*_{(4)} is a binary indicator such that *Y*_{(4)} = 1 if (*Y*_{00} = 0, *Y*_{01} = 0, *Y*_{10} = 1, *Y*_{11} = 1) and *Y*_{(4)} = 0 otherwise; *Y*_{(6)} is a binary indicator such that *Y*_{(6)} = 1 if (*Y*_{00} = 0, *Y*_{01} = 1, *Y*_{10} = 0, *Y*_{11} = 1) and *Y*_{(6)} = 0 otherwise; and *Y*_{(8)} is a binary indicator such that *Y*_{(8)} = 1 if (*Y*_{00} = 0, *Y*_{01} = 0, *Y*_{10} = 0, *Y*_{11} = 1) and *Y*_{(8)} = 0 otherwise.