The assessment of mediation is important for testing research hypotheses in epidemiology. Often an investigator is interested not only in the total effect of exposure on a given outcome, but also in the effect mediated through a particular pathway. For example, one might be interested in whether an observed relationship between elevated body mass index (BMI) and breast cancer is explained by increased estrogen levels (Fig. 1).^{1} Demonstration of such a pathway increases the biologic plausibility of the observed relationship, because it provides a test of a specified mechanism.^{2}

One way to evaluate the mechanisms underlying an observed relationship is to use methods to distinguish between direct and indirect effects. From the above example, we might wish to identify the effect of BMI on breast cancer risk that is mediated by increases in estrogen; this is the indirect effect. Alternatively, we might want to quantify the effect of BMI on breast cancer that is not due to the pathway through estrogen; this is the direct effect.

Two general categories of direct and indirect effects have been described in the literature: “natural” and “controlled” effects.^{2–8} In the absence of statistical interaction between exposure and mediator (ie, perfect additivity), the “natural” and “controlled” effects are identical; however, in the presence of interaction, these effects differ. The “natural” direct and indirect effects are partitioned based on naturally occurring values of the intermediate variable; these effects are therefore considered descriptive.^{5} There are 2 subdivisions of the natural effects, referred to as “pure” and “total,” which differ in terms of how interaction between the exposure and mediator is divided.^{2,3} The natural effects are the pure direct effect, the total direct effect, the pure indirect effect, and the total indirect effect.

In contrast, the “controlled” effects quantify what would happen if the investigator conducted an experiment that fixed the mediator at a certain value^{3–5,8}; these effects are thus considered prescriptive.^{5} In the setting of a dichotomous mediator, there are 2 controlled direct effects. The “blocked” direct effect is the effect of exposure that would be observed if the mediator were eliminated from the entire population (M = 0); this is abbreviated as the CDE(m=0) (controlled direct effect when setting M = 0). The “assigned” direct effect is the effect of exposure if everyone in the population were given the mediator (M = 1); this is abbreviated as CDE(m=1) (controlled direct effect when setting M = 1).

If there is no interaction between the effects of exposure and mediator on the outcome (ie, perfect additivity) then we can also define a blocked indirect effect, CIE(m=0), and an assigned indirect effect, CIE(m=1), by subtracting controlled direct effects, either blocked or assigned, from the total effect. Kaufman et al^{9} have shown that, if there is interaction between the exposure and the mediator, this approach of subtracting a controlled direct effect from a total effect does not generally give a quantity that can be interpreted as an indirect effect. However, under certain monotonicity assumptions, the difference between a total effect and a controlled direct effect can be interpreted as the portion of the total effect that would be eliminated by intervening to fix the mediator to a particular value.^{3}

It has been shown that the identification of direct and indirect effects, controlled or natural, requires assumptions beyond those necessary for the identification of the total effect of exposure on disease.^{3–8,10,11} A common cause of the mediator and the outcome, if not adjusted for properly, can lead to inaccurate assessment of direct and indirect effects from the observed data; such a variable will not bias the assessment of the total effect.^{11} Thus, the valid estimation of direct and indirect effects depends on the general assumption that there is no uncontrolled confounding of the mediator-disease relationship.

The exact form of this assumption differs across the described direct and indirect effects.^{3–5,9} Each direct and indirect effect has slightly different exchangeability requirements, with regard to which exposure- and mediator-defined subgroups must be exchangeable and the degree of exchangeability (full vs. partial) that is necessary. Assumptions have been previously articulated for both the controlled effects^{3} and natural effects.^{4,5} We build on this work by proposing an alternative, and in some cases less stringent, set of assumptions for the unbiased assessment of these quantities.

The remainder of this paper is organized as follows. We first review the relevant definitions concerning mediation, using both potential outcomes and response types. Next, we review the standard estimators of each direct and indirect effect; the quantities represented by these estimators are stated both in terms of conditional probabilities and in terms of the contributing response types (conditional on the observed subsets of the population). We then examine the assumptions that suffice to identify direct and indirect effects. This procedure is applied to each direct and indirect effect, controlled and natural. To facilitate comparison with prior literature, we translate our assumptions for response types into potential outcomes. The distinctions between current and previous assumptions are described. Finally, we discuss some of the implications of our assumptions for confounders of the mediator-outcome relationship.

## POTENTIAL OUTCOMES AND RESPONSE TYPES FOR MEDIATION

We apply an approach based on response types to derive assumptions for the identification of the natural and controlled effects. Response types have been useful for understanding confounding,^{12} interaction,^{13–15} and direct and indirect effects.^{3,7,9} Here we show that they provide further insight into the assumptions necessary for the valid estimation of direct and indirect effects. This method yields new assumptions for natural and controlled effects that we compare with other assumptions in the literature.

To focus on biases that are specific to mediation analysis (and do not bias the total effect estimate), we assume that there is no confounding of the exposure-mediator or exposure-disease relationship. Specifically, we assume full exchangeability of response types between the exposed and unexposed. In addition, we assume that all variables (exposure, mediator, and disease) are perfectly measured. To make use of response types we furthermore assume that all variables of interest (exposure [X], mediator [M], and disease [Y]) are dichotomous.

Finally, we make the simplifying assumption that effects are monotonic; that is, a given exposure cannot cause and prevent a given outcome at the individual level. This assumption is justifiable in many settings; for example, it is difficult to imagine that a harmful substance (eg, cigarette smoking) would both cause lung cancer in some individuals and prevent it in others. Although there are clearly situations where exposure would have both beneficial and harmful effects (eg, a drug with side-effects), the assumption of monotonicity is commonly made when developing response type frameworks for mediation.^{3,9} These simplifying assumptions will facilitate our analysis. However, in the Appendix of this paper we generalize the results to allow for nonbinary exposure and outcome, without the assumption of monotonicity.

The current approach is based on the observation that mediation is a 2-stage process.^{2,7} In the M-stage, the exposure (X) causes the mediator (M); in the Y-stage, the mediator (M), and exposure (X) cause disease (Y). Each stage has a separate set of potential outcomes and response types. Like prior work on response types and causal effects,^{3,9,12} we assume that each individual is characterized by a single response type at a given point in time; in this way, our model is deterministic.

There are 2 relevant potential outcomes for the M-stage: M_{1} (the value of the mediator, either 0 or 1, when setting X = 1) and M_{0} (the value of the mediator, either 0 or 1, when setting X = 0). Each person has a set of values for the above potential outcomes (eg, M_{1} = 1 and M_{0} = 1), that correspond to his or her response type for this stage (M-type, denoted by M^{T}) (Table 1). To label the M-types, we adopt the response types described by Rothman and Greenland^{12,16} for a simple exposure-disease relationship, excluding the preventive type based on the assumption of monotonicity. This yields the following definition for the variable M^{T}: doomed (M^{T} = 1), X-causal (M^{T} = 2), and immune (M^{T} = 4). See Table 1 for the relationship between the M-stage response types and potential outcomes; note M^{T} = 3 is excluded by monotonicity.
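The M-types can be written down as a small lookup table. The sketch below is illustrative only: Table 1 is not reproduced here, so the mapping from each M-type to its pair of potential outcomes (M_{1}, M_{0}) is inferred from the text, and the helper names `mediator_value` and `m_types_with` are hypothetical.

```python
# M-stage response types (M-types) under monotonicity.
# Each code maps to a label and the pair (M_1, M_0): the mediator value
# when X is set to 1 and when X is set to 0. The mapping is an assumption
# inferred from the text, since Table 1 is not shown.
M_TYPES = {
    1: ("doomed", (1, 1)),    # mediator present regardless of exposure
    2: ("X-causal", (1, 0)),  # mediator present only if exposed
    4: ("immune", (0, 0)),    # mediator absent regardless of exposure
}


def mediator_value(m_type, x):
    """Return M_x, the mediator value for this M-type when X is set to x."""
    _, (m1, m0) = M_TYPES[m_type]
    return m1 if x == 1 else m0


def m_types_with(x, m):
    """Return the set of M-types whose mediator equals m when X is set to x."""
    return {t for t in M_TYPES if mediator_value(t, x) == m}


print(m_types_with(1, 0))  # M-types that are mediator-negative when exposed
print(m_types_with(0, 0))  # M-types that are mediator-negative when unexposed
```

Read this way, an exposed, mediator-negative person must be M-type 4, whereas an unexposed, mediator-negative person may be M-type 2 or 4.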

We refer to the second stage of mediation (in which the exposure and mediator directly cause the disease) as the Y-stage. There are 4 relevant potential outcomes for this stage: Y_{11} (disease status when setting X = 1 and M = 1), Y_{01} (disease status when setting X = 0 and M = 1), Y_{10} (disease status when setting X = 1 and M = 0), and Y_{00} (disease status when setting X = 0 and M = 0). Each individual has a set of values for all of the above potential outcomes, which fixes the response type for this stage (Y-type, denoted Y^{T}) (Table 2). To label the Y-types, we use the response types for 2 exposures of interest, delineated in Rothman and Greenland, to define the variable Y^{T}.^{13,16,17} Again, preventive types are omitted, and labels are changed to reflect the outcome of interest (Y). This generates the following Y-types: doomed (Y^{T} = 1), parallel (Y^{T} = 2), X-causal (Y^{T} = 4), M-causal (Y^{T} = 6), synergistic (Y^{T} = 8), and immune (Y^{T} = 16); other types are excluded by monotonicity.
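The six nonpreventive Y-types can be sketched in the same way. Table 2 is not reproduced here, so the assignment of potential outcomes to each type below is an assumption based on the type labels and on the standard response-type scheme for 2 binary causes; the function names are hypothetical.

```python
# Y-stage response types (Y-types) under monotonicity.
# Each code maps to a label and the tuple (Y_11, Y_10, Y_01, Y_00),
# where Y_xm is disease status when X is set to x and M is set to m.
# The tuples are an assumption inferred from the type labels.
Y_TYPES = {
    1: ("doomed", (1, 1, 1, 1)),
    2: ("parallel", (1, 1, 1, 0)),     # either X or M alone suffices
    4: ("X-causal", (1, 1, 0, 0)),     # disease only if exposed
    6: ("M-causal", (1, 0, 1, 0)),     # disease only if mediator present
    8: ("synergistic", (1, 0, 0, 0)),  # disease only if both present
    16: ("immune", (0, 0, 0, 0)),
}

_INDEX = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}


def y_value(y_type, x, m):
    """Return Y_xm for a person of the given Y-type."""
    return Y_TYPES[y_type][1][_INDEX[(x, m)]]


def y_types_with_disease(x, m):
    """Return the Y-types that develop disease when X = x and M = m are set."""
    return {t for t in Y_TYPES if y_value(t, x, m) == 1}


def is_monotone(y_type):
    """Check that raising X or M never lowers Y for this type."""
    x_ok = all(y_value(y_type, 1, m) >= y_value(y_type, 0, m) for m in (0, 1))
    m_ok = all(y_value(y_type, x, 1) >= y_value(y_type, x, 0) for x in (0, 1))
    return x_ok and m_ok


print(y_types_with_disease(1, 0))  # types diseased if exposed, mediator-negative
print(all(is_monotone(t) for t in Y_TYPES))
```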

Combining the M-stage and Y-stage yields a more complex set of potential outcomes and response types. There are 4 relevant potential outcomes, which integrate both the M-stage and the Y-stage: Y_{1M_{1}} (disease status when setting X = 1 and M = M_{1}), Y_{1M_{0}} (disease status when setting X = 1 and M = M_{0}), Y_{0M_{1}} (disease status when setting X = 0 and M = M_{1}), and Y_{0M_{0}} (disease status when setting X = 0 and M = M_{0}). An individual's values for the above potential outcomes correspond to his or her response type with regard to both the mediator and disease (MY-type). Based on the combination of 3 M-types and 6 Y-types, there are 18 nonpreventive MY-types (Table 3). This is consistent with the number of response types given by Kaufman et al for mediation.^{9}
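The combination of the 2 stages can be enumerated mechanically. The following sketch assumes the same potential-outcome assignments for M-types and Y-types as the standard response-type scheme (Tables 1 to 3 are not reproduced here), and the function names are hypothetical. For each MY-type it returns the 4 nested potential outcomes, in which the mediator takes the value it would have under a given exposure level.

```python
from itertools import product

# Assumed potential outcomes per type (inferred from the type labels).
M_TYPES = {1: (1, 1), 2: (1, 0), 4: (0, 0)}  # (M_1, M_0)
Y_TYPES = {                                   # (Y_11, Y_10, Y_01, Y_00)
    1: (1, 1, 1, 1),
    2: (1, 1, 1, 0),
    4: (1, 1, 0, 0),
    6: (1, 0, 1, 0),
    8: (1, 0, 0, 0),
    16: (0, 0, 0, 0),
}
_INDEX = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}


def nested_outcome(m_type, y_type, x, x_for_m):
    """Disease status when X is set to x and M is set to M_{x_for_m}."""
    m1, m0 = M_TYPES[m_type]
    m = m1 if x_for_m == 1 else m0
    return Y_TYPES[y_type][_INDEX[(x, m)]]


def my_type_table():
    """Map each (M-type, Y-type) pair to (Y_1M1, Y_1M0, Y_0M1, Y_0M0)."""
    settings = ((1, 1), (1, 0), (0, 1), (0, 0))
    return {
        (mt, yt): tuple(nested_outcome(mt, yt, x, xm) for x, xm in settings)
        for mt, yt in product(M_TYPES, Y_TYPES)
    }


table = my_type_table()
print(len(table))     # number of nonpreventive MY-types
print(table[(2, 6)])  # X-causal on M, M-causal on Y
```

An individual who is X-causal on the mediator and M-causal on the disease develops disease only when the mediator takes the value it would have under exposure, which is the pattern of a purely indirect effect.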

## DEFINITIONS OF DIRECT AND INDIRECT EFFECTS IN TERMS OF RESPONSE TYPES

We first define each of the direct and indirect effects using the potential outcomes and response types for mediation introduced in the previous section. We will present our results through the use of tables. To illustrate these results we will specifically focus on the blocked direct effect CDE(m=0).

Notation used for the algebraic manipulation of response types is given in eAppendix A (http://links.lww.com/EDE/A353). Briefly, the probability that an individual is of Y-type d (Y^{T} = d) is denoted by P(Y_{d} ^{T}); the joint probability that an individual is a given M-type b (M^{T} = b) and Y-type d (Y^{T} = d) is denoted by P(M_{b} ^{T} Y_{d} ^{T}); the probability that an individual is either Y-type d or e is denoted by P(Y_{de} ^{T}). Thus, for example, P(M_{2} ^{T} Y_{26} ^{T}) represents the proportion of the population that is both M-type 2 and either Y-type 2 or 6. Quantities such as P(Y_{1} ^{T} + M_{1} ^{T} Y_{26} ^{T}) indicate the probability that either an individual is of Y-type 1 or that the individual is of M-type 1 and Y-type 2 or 6. Y-types can be made conditional on exposure status and/or M-type; for example, the probability that an individual is doomed on Y (Y^{T} = 1), given X = 1 and M^{T} = 4 is denoted by P(Y_{1} ^{T}|X = 1,M_{4} ^{T}).

The definition of the blocked direct effect is given in the first line of Table 4. The blocked direct effect [CDE(m=0)] is the effect that exposure would have if the mediator were blocked (M = 0) for the entire population. It is a comparison between (1) the disease risk if everyone were exposed and mediator-negative [P(Y_{10} = 1)] and (2) the risk if everyone were unexposed and mediator-negative [P(Y_{00} = 1)].^{3,5,18} Using Table 2, this quantity can be translated into MY-types. The proportion who would get disease if everyone were exposed and mediator-negative (Y_{10} = 1) consists of those who are doomed (Y^{T} = 1), parallel (Y^{T} = 2), or X-causal (Y^{T} = 4). The proportion who would get disease if everyone were unexposed and mediator-negative (Y_{00} = 1) includes only those who are doomed (Y^{T} = 1). This yields:

CDE(m=0) = P(Y_{10} = 1) − P(Y_{00} = 1) = P(Y_{124} ^{T}) − P(Y_{1} ^{T}) = P(Y_{24} ^{T})

Table 4 lists each of the direct and indirect effects in terms of both potential outcomes and response types.
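As a numerical illustration of the controlled direct effects, the sketch below computes CDE(m=0) and CDE(m=1) from a distribution over Y-types. The Y-type probabilities are invented for illustration, the potential outcomes assigned to each type are an assumption based on the type labels (Table 2 is not reproduced here), and the function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed potential outcomes per Y-type: (Y_11, Y_10, Y_01, Y_00).
Y_TYPES = {
    1: (1, 1, 1, 1),   # doomed
    2: (1, 1, 1, 0),   # parallel
    4: (1, 1, 0, 0),   # X-causal
    6: (1, 0, 1, 0),   # M-causal
    8: (1, 0, 0, 0),   # synergistic
    16: (0, 0, 0, 0),  # immune
}
_INDEX = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}


def risk_if_set(p_y, x, m):
    """P(Y_xm = 1): risk if everyone had X set to x and M set to m."""
    return sum(p for t, p in p_y.items() if Y_TYPES[t][_INDEX[(x, m)]] == 1)


def controlled_direct_effect(p_y, m):
    """CDE(m) = P(Y_1m = 1) - P(Y_0m = 1)."""
    return risk_if_set(p_y, 1, m) - risk_if_set(p_y, 0, m)


# Hypothetical distribution of Y-types in a population (sums to 1).
p_y = {1: F(1, 10), 2: F(1, 10), 4: F(2, 10),
       6: F(1, 10), 8: F(2, 10), 16: F(3, 10)}

print(controlled_direct_effect(p_y, 0))  # blocked direct effect
print(controlled_direct_effect(p_y, 1))  # assigned direct effect
```

In this example the blocked direct effect equals the combined proportion of parallel and X-causal types, and the assigned direct effect equals the combined proportion of X-causal and synergistic types; the 2 differ because the hypothetical population includes types for which exposure and mediator interact.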

## ESTIMATORS OF DIRECT AND INDIRECT EFFECTS IN TERMS OF RESPONSE TYPES

Several investigators have described estimators for direct and indirect effects, both natural and controlled, based on observable data.^{3–5,19} Estimators for each of the direct and indirect effects are presented in Table 5. These estimators presuppose that the exposure X is randomized. Estimators are also available when X is not randomized but data are available on some set of covariates that suffice to control for confounding^{4–6} (see also the Appendix to this paper). The formulas given in Table 5 constitute estimators and do not necessarily equal the actual effects of interest. To indicate that they are estimators we will place a “hat” symbol (ˆ) above them; for example, the estimator in Table 5 of the true blocked direct effect, CDE(m=0), we will denote by ĈDE(m=0). The blocked direct effect [CDE(m=0)] is estimated by the difference between the risk in the exposed, mediator-negative individuals [P(Y = 1|X = 1,M = 0)] and the risk in the unexposed, mediator-negative individuals [P(Y = 1|X = 0,M = 0)].

The observed probabilities given in Table 5 can be translated into the MY-types that comprise these probabilities. Just as the probabilities are conditional on the subset of the population with a particular exposure, the MY-types are also conditional on observed subsets. For example, consider the first term: [P(Y = 1|X = 1,M = 0)]. The subset who are exposed and mediator-negative (X = 1 and M = 0) are those persons who do not have the mediator in the presence of exposure (X = 1 and M_{1} = 0). The proportion who have disease in this group is simply the risk in this group if exposed and mediator-negative [P(Y_{10} = 1|X = 1,M_{1} = 0)]. This term can be translated into response types. The response types that contribute to the probability [P(M_{1} = 0)] are immune types (M^{T} = 4); thus the observed probability is conditional on the subset who are exposed and immune on the mediator (X = 1,M^{T} = 4). The response types that comprise the probability [P(Y_{10} = 1)] are doomed (Y^{T} = 1), X-causal (Y^{T} = 4), and parallel (Y^{T} = 2). This yields the probability [P(Y_{124} ^{T}|X = 1,M_{4} ^{T})]. The same procedure is followed for the second term, in which the unexposed, mediator-negative subset (X = 0 and M_{0} = 0) consists of X-causal and immune M-types (M^{T} = 2 or 4), yielding the following translation into potential outcomes and response types.

ĈDE(m=0) = P(Y = 1|X = 1,M = 0) − P(Y = 1|X = 0,M = 0) = P(Y_{10} = 1|X = 1,M_{1} = 0) − P(Y_{00} = 1|X = 0,M_{0} = 0) = P(Y_{124} ^{T}|X = 1,M_{4} ^{T}) − P(Y_{1} ^{T}|X = 0,M_{24} ^{T})

In eAppendix B (http://links.lww.com/EDE/A353) we give derivations to express all of the estimators given in Table 5 in terms of response types.
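The translation of the observed risks into response types can be checked numerically. The sketch below is a hypothetical illustration, not the derivation in eAppendix B: it assumes a randomized exposure, assumes the potential outcomes attached to each response type (Tables 1 and 2 are not reproduced here), and uses an invented population in which M-type and Y-type are independent, so that no mediator-outcome confounding is present.

```python
from fractions import Fraction as F

# Assumed potential outcomes per type (inferred from the type labels).
M_TYPES = {1: (1, 1), 2: (1, 0), 4: (0, 0)}  # (M_1, M_0)
Y_TYPES = {                                   # (Y_11, Y_10, Y_01, Y_00)
    1: (1, 1, 1, 1), 2: (1, 1, 1, 0), 4: (1, 1, 0, 0),
    6: (1, 0, 1, 0), 8: (1, 0, 0, 0), 16: (0, 0, 0, 0),
}
_INDEX = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}


def observed_risk(joint, x, m):
    """P(Y = 1 | X = x, M = m) under randomized X.

    joint maps (M-type, Y-type) to its population probability.
    """
    num = den = F(0)
    for (mt, yt), p in joint.items():
        m_obs = M_TYPES[mt][0] if x == 1 else M_TYPES[mt][1]
        if m_obs == m:  # this type is observed with M = m when X = x
            den += p
            num += p * Y_TYPES[yt][_INDEX[(x, m)]]
    return num / den


def cde0_estimator(joint):
    """Observed-data contrast used to estimate the blocked direct effect."""
    return observed_risk(joint, 1, 0) - observed_risk(joint, 0, 0)


def true_cde0(joint):
    """True blocked direct effect: proportion of Y-types 2 and 4."""
    return sum(p for (_, yt), p in joint.items() if yt in (2, 4))


# Hypothetical population with M-type independent of Y-type.
p_m = {1: F(1, 5), 2: F(3, 10), 4: F(1, 2)}
p_y = {1: F(1, 10), 2: F(1, 10), 4: F(2, 10),
       6: F(1, 10), 8: F(2, 10), 16: F(3, 10)}
joint = {(mt, yt): pm * py for mt, pm in p_m.items() for yt, py in p_y.items()}

print(cde0_estimator(joint), true_cde0(joint))
```

Because Y-types are balanced across M-types in this invented population, the observed contrast and the true blocked direct effect coincide; an imbalance of Y-types across M-types is what would allow them to diverge.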

## DERIVATION OF ALTERNATIVE ASSUMPTIONS FOR THE IDENTIFICATION OF DIRECT AND INDIRECT EFFECTS

For each direct and indirect effect, we can now derive assumptions that suffice for the estimators in Table 5 to identify the direct and indirect effects of interest. In eAppendix B, we show that the assumptions stated in the rows labeled “current” in Tables 6–9 suffice to identify the various direct and indirect effects of interest. The second of each pair of rows states the assumptions already present in the literature.^{3–5,9}

We once again illustrate our results by considering the blocked direct effect. We wish to determine the assumptions required for the estimated blocked direct effect ĈDE(m=0), translated into MY-types above as [P(Y_{124} ^{T}|X = 1,M_{4} ^{T}) − P(Y_{1} ^{T}|X = 0,M_{24} ^{T})], to equal the true blocked direct effect [CDE(m=0) = P(Y_{24} ^{T})].

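If we make Assumption 1 in the first line of Table 6, that the proportion doomed on Y is the same in the unexposed, mediator-negative and the exposed, mediator-negative groups [P(Y_{1} ^{T}|X = 0,M_{24} ^{T}) = P(Y_{1} ^{T}|X = 1,M_{4} ^{T})], the observable estimate of the blocked direct effect simplifies to:

ĈDE(m=0) = P(Y_{124} ^{T}|X = 1,M_{4} ^{T}) − P(Y_{1} ^{T}|X = 1,M_{4} ^{T}) = P(Y_{24} ^{T}|X = 1,M_{4} ^{T})

If we make Assumption 2 in the first line of Table 6, that the effect of exposure in the absence of the mediator is the same in the exposed, mediator-negative and the exposed, mediator-positive groups [P(Y_{24} ^{T}|X = 1,M_{4} ^{T}) = P(Y_{24} ^{T}|X = 1,M_{12} ^{T})], then this yields the following simplification:

ĈDE(m=0) = P(Y_{24} ^{T}|X = 1)

Under full exchangeability (randomization of X), we have:

ĈDE(m=0) = P(Y_{24} ^{T}|X = 1) = P(Y_{24} ^{T}) = CDE(m=0)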

The first of each pair of rows in Tables 6–9 gives our new alternative assumptions, which suffice to identify each of the direct and indirect effects. Derivations are given in eAppendix B. Note that the assumptions for several pairs of direct and indirect effects are identical. For example, the blocked direct [CDE(m=0)] and blocked indirect [CIE(m=0)] effects require identical assumptions for valid estimation. This is because the total effect (of exposure on outcome) is the sum of the blocked direct and blocked indirect effects; if the total effect and blocked direct effect are estimated without bias, it follows that the estimate of the blocked indirect effect is also valid. Similarly, the pure direct effect and the total indirect effect sum to the total effect, as do the pure indirect effect and the total direct effect.^{5} This implies that the same assumptions are sufficient for unbiased estimation of these effects, as illustrated in eAppendix B.
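The additive relationships among these effects can be illustrated with a short calculation. The sketch below computes the total effect and the 4 natural effects from nested potential outcomes, using the counterfactual contrasts given in the Appendix with x = 1 and x* = 0. The population is invented, the potential outcomes attached to each response type are assumptions based on the type labels, and the function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed potential outcomes per type (inferred from the type labels).
M_TYPES = {1: (1, 1), 2: (1, 0), 4: (0, 0)}  # (M_1, M_0)
Y_TYPES = {                                   # (Y_11, Y_10, Y_01, Y_00)
    1: (1, 1, 1, 1), 2: (1, 1, 1, 0), 4: (1, 1, 0, 0),
    6: (1, 0, 1, 0), 8: (1, 0, 0, 0), 16: (0, 0, 0, 0),
}
_INDEX = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}


def nested_risk(joint, x, x_for_m):
    """P(Y = 1) if X were set to x and M to the value M_{x_for_m}."""
    total = F(0)
    for (mt, yt), p in joint.items():
        m = M_TYPES[mt][0] if x_for_m == 1 else M_TYPES[mt][1]
        total += p * Y_TYPES[yt][_INDEX[(x, m)]]
    return total


def natural_and_total_effects(joint):
    """Total effect and the pure/total natural direct and indirect effects."""
    def r(x, xm):
        return nested_risk(joint, x, xm)
    return {
        "total": r(1, 1) - r(0, 0),
        "pure_direct": r(1, 0) - r(0, 0),
        "total_indirect": r(1, 1) - r(1, 0),
        "total_direct": r(1, 1) - r(0, 1),
        "pure_indirect": r(0, 1) - r(0, 0),
    }


# Hypothetical population with M-type independent of Y-type.
p_m = {1: F(1, 5), 2: F(3, 10), 4: F(1, 2)}
p_y = {1: F(1, 10), 2: F(1, 10), 4: F(2, 10),
       6: F(1, 10), 8: F(2, 10), 16: F(3, 10)}
joint = {(mt, yt): pm * py for mt, pm in p_m.items() for yt, py in p_y.items()}

e = natural_and_total_effects(joint)
print(e["pure_direct"] + e["total_indirect"] == e["total"])
print(e["total_direct"] + e["pure_indirect"] == e["total"])
```

Both decompositions hold by construction for any population, which is why an unbiased estimate of the total effect and of one member of a pair implies an unbiased estimate of the other member.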

The assumptions we have derived in terms of response types can also be stated in terms of potential outcomes (ie, using counterfactual notation). Thus Tables 6–9 present the assumptions we have derived not only in terms of response types but, equivalently, in terms of counterfactuals. In the Appendix of this paper we show that the assumptions given in Tables 6–9, stated in terms of counterfactuals, suffice to identify the direct and indirect effects of interest. The results presented in the Appendix also generalize the results in Tables 6–9 by (1) not requiring X and Y to be binary, (2) not assuming monotonicity, and (3) allowing for control of measured confounding variables C when exposure X is not randomized. The translation of our assumptions into counterfactual notation facilitates comparison of our current assumption with those previously proposed. We turn to this comparison shortly.

It is interesting to note the symmetry of assumptions across direct and indirect effects. All direct and indirect effects require 2 assumptions. First, the true risk of disease in a specific subgroup (defined by exposure and mediator status) must be equal to the risk for a different subgroup, if they were assigned to identical conditions. Specifically, the controlled effects require that a given risk is the same across exposure status (given a specified mediator value), while the natural effects require that a given risk is the same across mediator status (given a specified exposure value). Second, the true effect of either the exposure or mediator in one subgroup must be equal to the same effect in a second subgroup, if they were assigned to identical conditions. Specifically, the controlled effects require that the direct effect of exposure (given a specified mediator value) be the same across mediator status, while the natural effects require that the effect of the mediator (given a specified exposure value) be the same regardless of whether the mediator was caused by exposure.

## COMPARISON WITH PREVIOUSLY DERIVED ASSUMPTIONS

Tables 6–9 present not only the assumptions we have derived for the identification of direct and indirect effects but also those previously available in the literature.^{3–5,9} Assumptions are listed in terms of MY-types and in terms of potential outcomes. Current assumptions (found in the first of each pair of rows) can be directly compared with previously derived assumptions (found in the second row). Derivations relevant to these comparisons can be found in eAppendix C (http://links.lww.com/EDE/A353). Note that all assumptions (both current and prior) can be made conditional on one or more measured covariates; for simplicity, these additional covariates are not illustrated here. A general version of our results that allows for these measured covariates is given in the Appendix to this paper.

For illustration, we again consider the blocked direct effect. The current assumptions for the blocked direct effect differ substantially from those previously proposed, and thus represent an alternative set of assumptions for the unbiased estimation of these effects (Table 6). Our first assumption requires that the proportion who would get disease in the absence of exposure and mediator [P(Y_{00} = 1)] be equal in the unexposed, mediator-negative (X = 0,M = 0) and the exposed, mediator-negative (X = 1,M = 0) populations. In contrast, the assumption given by Robins and Greenland^{3} is that this proportion [P(Y_{00} = 1)] is the same in the unexposed, mediator-negative (X = 0,M = 0) and unexposed, mediator-positive (X = 0,M = 1) populations.

Our second assumption requires that the effect of exposure in the absence of the mediator [P(Y_{10} − Y_{00}=1)] be the same in the exposed, mediator-negative (X = 1,M = 0) and exposed, mediator-positive (X = 1,M = 1) populations. Robins and Greenland^{3} require this equality, not just for the effect of exposure [P(Y_{10} − Y_{00} = 1)], but for the entire risk of disease in this group [P(Y_{10} = 1)].

## IMPLICATIONS FOR VARIABLES THAT CONFOUND THE MEDIATOR-OUTCOME RELATIONSHIP

The differences between our assumptions and those previously derived have practical implications regarding the impact of confounding variables on the estimation of direct and indirect effects. In general, a common cause of the mediator and outcome will violate assumptions listed in Tables 6–9, thus biasing results. However, our new assumptions indicate that there are exceptions to this statement, depending on the degree to which the confounder interacts with the exposure (to cause the mediator and disease) and with the mediator (to cause disease). Again, we illustrate this phenomenon, using the blocked direct effect.

Consider the impact of a confounder of the mediator-disease relationship that does not interact with the exposure to cause the mediator. By this, we mean that there is no synergy in the sufficient-cause sense^{14} between exposure and confounder to cause the mediator. In the context of our initial example of mediation analysis (BMI → increased estrogen levels → breast cancer), an example of such a confounder might be a hypothetical genetic polymorphism. We postulate that this genetic polymorphism is associated with increased estrogen levels, while independently leading to an increased risk of breast cancer (Fig. 2). Assume that this genetic polymorphism and BMI act through entirely different and independent mechanisms to affect estrogen levels; in this case, there would be no synergy between the exposure and confounder to cause the mediator. In the absence of a true direct effect of exposure, such a confounder would lead to a violation of assumptions previously proposed for the CDE(m=0), but, as shown in eAppendix D (http://links.lww.com/EDE/A353), not to a violation of the current assumptions. This is most clearly illustrated using response types.

A confounder of the mediator-disease relationship that does not interact synergistically with the exposure to cause the mediator will lead to an imbalance of Y-types (Y^{T} _{1}, Y^{T} _{2}, or Y^{T} _{4}) across the M-type doomed (M^{T} _{1}) and other M-types (M^{T} _{24}). From Table 6, this violates the previously proposed Assumption 1. However, such a confounder would not lead to an imbalance of Y-types across M-type causal (M^{T} _{2}) and immune (M^{T} _{4}) types, and thus does not violate the current Assumption 1. This confounder may violate both versions of Assumption 2, depending on the degree of interaction between confounder and exposure to cause the disease. However, in the absence of a true direct effect of exposure [P(Y_{10} − Y_{00}) = 0 across subgroups], the current Assumption 2 will not be violated. In the absence of a true direct effect, our assumptions indicate that a common cause of the mediator and disease will not necessarily bias our estimate of the CDE(m=0). Further proof of these assertions is given in eAppendix D.
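A small numerical example may make this scenario concrete. The sketch below is hypothetical and is not the proof in eAppendix D: it builds an invented population with randomized exposure and no direct effect (no Y-types 2 or 4), in which doomed Y-types are over-represented among doomed M-types but balanced between X-causal and immune M-types. The two versions of Assumption 1 are written as described in the comparison section, and the function names are assumptions.

```python
from fractions import Fraction as F

# Hypothetical M-type distribution and Y-type distribution within M-type.
# Y-types 2 and 4 are absent, so there is no direct effect when M = 0.
p_m = {1: F(1, 5), 2: F(3, 10), 4: F(1, 2)}
y_given_m = {
    1: {1: F(1, 2), 6: F(1, 4), 16: F(1, 4)},    # imbalanced vs. other M-types
    2: {1: F(1, 10), 6: F(3, 10), 16: F(3, 5)},  # same as M-type 4
    4: {1: F(1, 10), 6: F(3, 10), 16: F(3, 5)},
}
joint = {(mt, yt): p_m[mt] * py
         for mt, dist in y_given_m.items() for yt, py in dist.items()}


def prob_y(joint, y_types, m_types):
    """P(Y-type in y_types | M-type in m_types); X is randomized, so
    conditioning on exposure does not change this proportion."""
    den = sum(p for (mt, _), p in joint.items() if mt in m_types)
    num = sum(p for (mt, yt), p in joint.items()
              if mt in m_types and yt in y_types)
    return num / den


# Observed contrast: P(Y_124 | X=1, M-type 4) - P(Y_1 | X=0, M-types 2, 4).
estimate = prob_y(joint, {1, 2, 4}, {4}) - prob_y(joint, {1}, {2, 4})
true_cde0 = sum(p for (_, yt), p in joint.items() if yt in (2, 4))

# Prior Assumption 1: P(Y_00 = 1) equal in (X=0, M=0) and (X=0, M=1).
prior_a1 = prob_y(joint, {1}, {2, 4}) == prob_y(joint, {1}, {1})
# Current Assumption 1: P(Y_00 = 1) equal in (X=0, M=0) and (X=1, M=0).
current_a1 = prob_y(joint, {1}, {2, 4}) == prob_y(joint, {1}, {4})

print(estimate, true_cde0, prior_a1, current_a1)
```

In this invented population the prior Assumption 1 fails while the current Assumption 1 holds, and the observed contrast equals the true blocked direct effect of zero.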

The assumptions we have derived can further facilitate the consideration of the magnitude and direction of bias that results in the presence of unmeasured confounders. In other work,^{18} using the assumptions we have derived, we have developed a sensitivity-analysis technique to assess the degree to which the violation of these assumptions might bias the observed results.

## SEQUENTIAL IGNORABILITY

Our results contrast with those previously derived in the literature in another interesting way. For natural direct and indirect effects, Assumption 2 of Pearl^{5} and Petersen et al^{4} (as stated in counterfactual notation in Tables 8 and 9) requires conditioning on a counterfactual. In contrast, our results make assumptions only about probabilities that are conditional on observed variables.

Our results imply that a particular version of “sequential ignorability” suffices to identify natural direct and indirect effects. Sequential ignorability is said to hold when data are available on a sufficiently rich set of covariates so that at each “treatment stage,” conditional on the covariates, there is no unmeasured confounding. If we make the sequential ignorability assumption here that, given a measured set of covariates, the value of M_{x} is independent of X, and Y_{xm} is independent of both X and M jointly, then it would follow that our newly derived assumptions would all be satisfied. In contrast, from the assumptions of Pearl^{5} and Petersen et al,^{4} it is not clear that sequential ignorability alone suffices to identify natural effects, since their assumptions are articulated as requiring independence statements that condition on counterfactual quantities. In the Appendix of this paper we show that our results presented in Tables 6–9 and our comments about sequential ignorability generalize so as to allow for nonbinary X and Y, so as to not require monotonicity, and so as to allow for control of confounding variables rather than assuming randomization.

However, the results stated in this paper do not identify natural direct and indirect effects when there is a consequence of the exposure that confounds the mediator-outcome relationship, even if data are available on the confounder. In such cases, additional assumptions may be necessary for identification. Indeed, Avin et al^{20} have shown that in this setting, although controlled direct effects may be identified, natural direct and indirect effects in general will not be identified.
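When sequential ignorability is assumed to hold given measured covariates C, natural effects are commonly computed from observed quantities by averaging stratum-specific contrasts over the mediator distribution. The sketch below uses the standard form of this calculation from the mediation literature; it is offered as an assumption, since the estimators in Table 5 and the Appendix formulas are not reproduced here, and the function and argument names are hypothetical.

```python
from fractions import Fraction as F


def natural_effects(mean_y, p_m, p_c, x=1, x_star=0):
    """Pure direct and total indirect effects from observed quantities.

    mean_y[(x, m, c)] = E[Y | X = x, M = m, C = c]
    p_m[(m, x, c)]    = P(M = m | X = x, C = c)
    p_c[c]            = P(C = c)
    Valid only under the assumed sequential ignorability given C.
    """
    mediators = sorted({m for (m, _, _) in p_m})
    pde = tie = 0
    for c, pc in p_c.items():
        for m in mediators:
            # Direct contrast, weighted by the mediator distribution under x*.
            pde += ((mean_y[(x, m, c)] - mean_y[(x_star, m, c)])
                    * p_m[(m, x_star, c)] * pc)
            # Shift in the mediator distribution, holding exposure at x.
            tie += (mean_y[(x, m, c)]
                    * (p_m[(m, x, c)] - p_m[(m, x_star, c)]) * pc)
    return pde, tie


# Hypothetical observed quantities in a single covariate stratum (c = 0).
mean_y = {(1, 1, 0): F(7, 10), (1, 0, 0): F(4, 10),
          (0, 1, 0): F(3, 10), (0, 0, 0): F(1, 10)}
p_m = {(1, 1, 0): F(1, 2), (0, 1, 0): F(1, 2),
       (1, 0, 0): F(1, 5), (0, 0, 0): F(4, 5)}
p_c = {0: F(1)}

pde, tie = natural_effects(mean_y, p_m, p_c)
print(pde, tie, pde + tie)
```

The sum of the 2 quantities is the total effect of exposure in this example, E[Y|X = 1] − E[Y|X = 0], computed from the same hypothetical inputs.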

## CONCLUSION

We have used response types for mediation to derive an alternative set of assumptions for the valid estimation of indirect and direct effects. While response types are mathematically equivalent to potential outcomes, using response types gives insight into the particular subgroups that, if comparable, allow for the valid estimation of direct and indirect effects. This has allowed us to derive alternative assumptions for the identification of direct and indirect effects. Analogous work on response types has been extremely valuable for understanding the assessment of main effects,^{12} interaction,^{13–15} and mediation.^{3,7,9} We have shown that this approach yields further insight into mediation.

The assumptions we have derived are sometimes less stringent than those previously described. We have shown (in the Appendix) that our results generalize to settings in which only the mediator is binary. Our results imply that sequential ignorability suffices to identify not only controlled direct effects but also natural direct effects. Our results may also allow for a more nuanced reasoning about the bias arising from unmeasured confounding of the mediator-outcome relationship.

## ACKNOWLEDGMENTS

We thank Sharon Schwartz for her helpful insights and comments on earlier versions of this manuscript. We also thank the referees for their constructive comments.

## REFERENCES

1. Endogenous Hormones Breast Cancer Collaborative Group. Body mass index, serum sex hormones, and breast cancer risk in postmenopausal women. *J Natl Cancer Inst*. 2003;95:1218–1226.

2. Hafeman DM, Schwartz S. Opening the black box: a motivation for the assessment of mediation. *Int J Epidemiol*. 2009;38:838–845.

3. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. *Epidemiology*. 1992;3:143–155.

4. Petersen ML, Sinisi SE, van der Laan MJ. Estimation of direct causal effects. *Epidemiology*. 2006;17:276–284.

5. Pearl J. Direct and indirect effects. In: *Proceedings of the American Statistical Association Joint Statistical Meetings*. Minneapolis, MN: MIRA Digital Publishing; 2001.

6. VanderWeele TJ. Marginal structural models for the estimation of direct and indirect effects. *Epidemiology*. 2009;20:18–26.

7. Hafeman DM. A sufficient cause based approach to the assessment of mediation. *Eur J Epidemiol*. 2008;23:711–721.

8. Robins JM. Semantics of causal DAG models and the identification of direct and indirect effects. In: Green P, Hjort N, Richardson S, eds. *Highly Structured Stochastic Systems*. London: Oxford University Press; 2003:70–81.

9. Kaufman JS, Maclehose RF, Kaufman S. A further critique of the analytic strategy of adjusting for covariates to identify biologic mediation. *Epidemiol Perspect Innov*. 2004;1:

10. Judd CM, Kenny DA. Process analysis: Estimating mediation in treatment evaluations. *Evaluation Review*. 1981;5:602–619.

11. Cole SR, Hernan MA. Fallibility in estimating direct effects. *Int J Epidemiol*. 2002;31:163–165.

12. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. *Int J Epidemiol*. 1986;15:413–419.

13. Greenland S, Poole C. Invariants and noninvariants in the concept of interdependent effects. *Scand J Work Environ Health*. 1988;14:125–129.

14. VanderWeele TJ, Robins JM. The identification of synergism in the sufficient-component-cause framework. *Epidemiology*. 2007;18:329–339.

15. Darroch J. Biologic synergism and parallelism. *Am J Epidemiol*. 1997;145:661–668.

16. Rothman KJ, Greenland S. *Modern Epidemiology*. 2nd ed. Philadelphia: Lippincott Williams and Wilkins; 1998.

17. Miettinen OS. Causal and preventive interdependence. Elementary principles. *Scand J Work Environ Health*. 1982;8:159–168.

18. Hafeman D. *Opening the Black Box: A Reassessment Of Mediation From A Counterfactual Perspective* [dissertation]. New York: Columbia University; 2008.

19. Taylor JM, Wang Y, Thiebaut R. Counterfactual links to the proportion of treatment effect explained by a surrogate marker. *Biometrics*. 2005;61:1102–1111.

20. Avin C, Shpitser I, Pearl J. Identifiability of path-specific effects. In: Proceedings of the International Joint Conference; 2005; Edinburgh, Scotland.

21. Imai K, Keele L, Yamamoto T. Identification, inference, and sensitivity analysis for causal mediation effects. *Statistical Science*. In Press.

## APPENDIX. GENERAL ALTERNATIVE IDENTIFICATION RESULTS FOR DIRECT AND INDIRECT EFFECTS.

Let *Y*_{x,m} denote a subject's potential outcome *Y* if, possibly contrary to fact, *X* were set to *x* and *M* were set to *m*, and let *M*_{x} denote a subject's potential value of *M* if, possibly contrary to fact, *X* were set to *x*. We assume that the mediator *M* is binary but we do not require *X* and *Y* to be binary. We furthermore make no monotonicity assumptions. We do not assume that *X* is randomized but rather assume that there is some set of variables *C* such that the effects of *X* on *M* and *Y* are unconfounded. Our first theorem allows for the identification of controlled direct effects using alternative assumptions and generalizes the results we presented in Tables 6 and 7.

Theorem 1: Suppose for all *x*, *x**, and *m*,

$$\{Y_{x,m},\, M_{x^*}\} \perp\!\!\!\perp X \mid C \qquad \text{(A1)}$$

and for some specific values *x*, *x**, and *m*,

then

(See Proof of Theorem 1 at the end of the Appendix)

Assumption (A1), that {Y_{x,m}, M_{x*}} ⫫ X | C, is a generalization of our randomization assumption. Assumptions (A2) and (A3) are generalizations of the assumptions presented in Tables 6 and 7. Under these assumptions, controlled direct effects are identified.
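As a concrete illustration, the following is a minimal Python sketch, not the authors' procedure, of how a controlled direct effect could be estimated once it is identified. It assumes the standard standardization formula that holds under sequential ignorability, Σ_c {E[Y | X = x, M = m, C = c] − E[Y | X = x*, M = m, C = c]} P(C = c), with a discrete *C*. The function name, the record layout, and the toy records are hypothetical.

```python
from collections import defaultdict


def controlled_direct_effect(rows, x, x_star, m):
    """Standardized contrast sum_c {E[Y|x,m,c] - E[Y|x*,m,c]} P(c).

    rows: iterable of (c, x, m, y) records with discrete c (hypothetical layout).
    """
    rows = list(rows)
    n = len(rows)
    stratum_size = defaultdict(int)   # number of records with C = c
    y_sum = defaultdict(float)        # sum of Y within each (c, x, m) cell
    cell_size = defaultdict(int)      # number of records in each (c, x, m) cell
    for c, xi, mi, y in rows:
        stratum_size[c] += 1
        y_sum[(c, xi, mi)] += y
        cell_size[(c, xi, mi)] += 1

    def mean_y(c, xv, mv):
        # Empirical E[Y | X = xv, M = mv, C = c]; fails loudly on empty cells.
        key = (c, xv, mv)
        if cell_size[key] == 0:
            raise ValueError(f"no records with C={c}, X={xv}, M={mv}")
        return y_sum[key] / cell_size[key]

    # Weight each stratum-specific contrast by the empirical P(C = c).
    return sum(
        (mean_y(c, x, m) - mean_y(c, x_star, m)) * size / n
        for c, size in stratum_size.items()
    )


# Toy records (c, x, m, y); purely illustrative, not data from the paper.
toy_rows = [
    (0, 1, 0, 1), (0, 1, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0),
    (0, 1, 1, 1), (0, 1, 1, 1), (0, 0, 1, 1), (0, 0, 1, 0),
    (1, 1, 0, 1), (1, 1, 0, 1), (1, 0, 0, 1), (1, 0, 0, 0),
    (1, 1, 1, 1), (1, 1, 1, 1), (1, 0, 1, 1), (1, 0, 1, 1),
]

cde_blocked = controlled_direct_effect(toy_rows, x=1, x_star=0, m=0)   # CDE(m=0)
cde_assigned = controlled_direct_effect(toy_rows, x=1, x_star=0, m=1)  # CDE(m=1)
```

In these toy records the blocked and assigned contrasts differ, which is the pattern one expects when exposure and mediator interact.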

Our second theorem allows for the identification of natural direct and indirect effects using alternative assumptions and generalizes the results we presented in Tables 8 and 9.

Theorem 2: Suppose for all *x*, *x**, and *m*,

$$\{Y_{x,m},\, M_{x^*}\} \perp\!\!\!\perp X \mid C \qquad \text{(A1)}$$

and for some specific values *x*, *x**, and *m*,

then

(See Proof of Theorem 2 at the end of the Appendix)

Assumptions (A4) and (A5) are generalizations of the assumptions presented in Tables 8 and 9. If (A1), (A4), and (A5) all hold then it follows from Theorem 2 that the pure natural direct effect is identified by

and the total natural indirect effect is identified by

To identify the total direct effect E[Y_{x,M_{x}}]−E[Y_{x*,M_{x}}] and the pure indirect effect E[Y_{x*,M_{x}}]−E[Y_{x*,M_{x*}}], we could reverse the roles of *x* and *x** in (A4) and (A5) of Theorem 2; that is, instead of (A4) and (A5) we could require

Then under (A1), (A4*), and (A5*), the total direct effect E[Y_{x,M_{x}}]−E[Y_{x*,M_{x}}] and the pure indirect effect E[Y_{x*,M_{x}}]−E[Y_{x*,M_{x*}}] are identified. The total direct effect is then given by

and the pure indirect effect is identified by

Note that (A4) and (A5) may hold without (A4*) and (A5*) holding, or vice versa. Consider also the assumption

Assumptions (A1) and (A6) together imply Y_{x,m} ⫫ {X,M} | C, and this implies (A4), (A5), (A4*), (A5*), and also (A2) and (A3). Thus (A1) and (A6) together suffice to identify all natural direct and indirect effects. Assumptions (A1) and (A6) together constitute a version of the assumption sometimes referred to as “sequential ignorability.”
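To make the identification claim concrete, here is a hedged Python sketch that evaluates the four natural effects from observed-data quantities. It assumes the standard mediation formula, E[Y_{x,M_{x*}}] = Σ_c Σ_m E[Y | X = x, M = m, C = c] P(M = m | X = x*, C = c) P(C = c), which is the usual identifying expression under sequential ignorability. The function names, dictionary layout, and numbers are hypothetical and are not taken from the paper's tables.

```python
def nested_mean(mu, p_m1, p_c, x_outcome, x_mediator):
    """Mediation-formula value of E[Y_{x_outcome, M_{x_mediator}}] for binary M.

    mu[(x, m, c)]  : E[Y | X = x, M = m, C = c]
    p_m1[(x, c)]   : P(M = 1 | X = x, C = c)
    p_c[c]         : P(C = c)
    """
    total = 0.0
    for c, pc in p_c.items():
        p1 = p_m1[(x_mediator, c)]
        total += pc * (
            mu[(x_outcome, 1, c)] * p1 + mu[(x_outcome, 0, c)] * (1.0 - p1)
        )
    return total


def natural_effects(mu, p_m1, p_c, x, x_star):
    """Return the pure/total direct and indirect effects and the total effect."""
    e_x_mx = nested_mean(mu, p_m1, p_c, x, x)                # E[Y_{x, M_x}]
    e_x_mxs = nested_mean(mu, p_m1, p_c, x, x_star)          # E[Y_{x, M_{x*}}]
    e_xs_mx = nested_mean(mu, p_m1, p_c, x_star, x)          # E[Y_{x*, M_x}]
    e_xs_mxs = nested_mean(mu, p_m1, p_c, x_star, x_star)    # E[Y_{x*, M_{x*}}]
    return {
        "pure_direct": e_x_mxs - e_xs_mxs,
        "total_indirect": e_x_mx - e_x_mxs,
        "total_direct": e_x_mx - e_xs_mx,
        "pure_indirect": e_xs_mx - e_xs_mxs,
        "total_effect": e_x_mx - e_xs_mxs,
    }


# Hypothetical inputs with two strata of C and an exposure-mediator interaction.
p_c = {0: 0.5, 1: 0.5}
p_m1 = {(0, 0): 0.2, (1, 0): 0.6, (0, 1): 0.4, (1, 1): 0.8}
mu = {
    (0, 0, 0): 0.1, (0, 1, 0): 0.2, (1, 0, 0): 0.2, (1, 1, 0): 0.5,
    (0, 0, 1): 0.2, (0, 1, 1): 0.3, (1, 0, 1): 0.3, (1, 1, 1): 0.7,
}

effects = natural_effects(mu, p_m1, p_c, x=1, x_star=0)
```

By construction, the pure direct and total indirect effects sum to the total effect, as do the total direct and pure indirect effects; the two direct effects differ here only because the hypothetical outcome means include an exposure-mediator interaction.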

We note that Imai et al^{21} have recently obtained a result that also implies that natural direct and indirect effects are identified under (A1) and (A6). Imai et al use exchangeability conditions that differ from those given in Theorem 2. In particular, Imai et al assume (A1) but instead of

their results hold if

Note that (A7) is somewhat similar in form to (A5); condition (A7) for *m* = 1 implies (A5), but instead of requiring (A4), Imai et al require that (A7) hold not just for *m* = 1 but for all *m*. The result of Imai et al does not require the mediator *M* to be binary; however, neither our result nor Imai et al's result encompasses the other because the precise exchangeability conditions are different. We note that (A4), (A5), and (A7) are all implied by (A1) and (A6). Thus our result and Imai et al's result each imply that sequential ignorability (ie, (A1) and (A6) together) is sufficient to identify natural direct and indirect effects.

We note, however, that neither our assumptions nor Imai et al's assumptions allow for the identification of natural direct and indirect effects in settings in which there is an effect of the exposure that confounds the mediator-outcome relationship, even if data are available on this confounder. In other words, if instead of (A1) and (A6) it is assumed that (A1) holds and that

holds, where *Q* is an effect of *X*, then neither our assumptions nor Imai et al's assumptions will suffice to identify natural direct and indirect effects. In fact, Avin et al^{20} have shown that if there is an effect of the exposure that confounds the mediator-outcome relationship, then natural direct and indirect effects are not in general identified. In these cases, assumptions such as the no-interaction assumption^{10} or those made by Petersen et al^{4} will be needed if natural direct and indirect effects are to be identified.

Proof of Theorem 1

Under (A1)–(A3) we have that

Thus we have that

Proof of Theorem 2

Under (A1), (A4), and (A5) we have that
