It is often pointed out that many different biologic pathways or mechanisms can be consistent with the results of a statistical model1,2; in such cases it is not possible to draw conclusions about causal biologic mechanisms through statistical analysis. In certain simple settings, however, such as when the outcome and all exposures of interest are binary, it is possible to draw conclusions about mechanisms from empirical analyses provided there are no unmeasured confounding variables.3,4 In particular, VanderWeele and Robins4,5 recently derived empirical tests for synergism in the sufficient cause sense of Rothman.6 This form of synergism essentially implies joint presence of 2 causes in the same causal mechanism or sufficient cause. In this article we relate the empirical conditions of VanderWeele and Robins4,5 to interaction terms arising in standard statistical models. Linear, log-linear, and logistic models are all considered. In each case we use the empirical conditions of VanderWeele and Robins4,5 to derive conditions on model coefficients that suffice to conclude the presence of a sufficient cause interaction and we provide additional conditions under which the interactions in statistical models can be interpreted as the presence of a sufficient cause interaction. The remainder of the paper is organized as follows. We first summarize the sufficient component cause framework as conceptualized by Rothman6 and formalized by VanderWeele and Robins4,5 and give the empirical conditions of VanderWeele and Robins4,5 that suffice to conclude the presence of a sufficient cause interaction. We then relate sufficient cause interactions first to linear statistical models, and then in the following section to interaction terms in log-linear and logistic models; 2-way sufficient cause interactions are discussed explicitly in the text whereas extensions to 3-way interactions are given in the Appendix. 
We next consider the implications of the presence of confounding variables in statistical models for inference about sufficient cause interactions and we close with some general discussion.
Sufficient Causes and Sufficient Cause Interactions
Rothman6 conceptualized causation as a collection of different causal mechanisms, each sufficient to bring about the outcome. These causal mechanisms Rothman called “sufficient causes” and conceived of them as minimal sets of actions, events, or states of nature that together initiated a process that inevitably resulted in the outcome. For a particular outcome there would likely be many different sufficient causes, that is, many different causal mechanisms by which the outcome could come about. Each sufficient cause involved various component causes. Whenever all components of a particular sufficient cause were present, the outcome would inevitably occur; within every sufficient cause, each component would be necessary for that sufficient cause to lead to the outcome. If 2 distinct causes are both components of the same sufficient cause, then the causes participate together in the same causal mechanism and synergism is said to be present. Often there will be several primary causes of interest and other background causes will be necessary to complete the sufficient causes. We use Ai to denote the background causes for the ith sufficient cause. Consider, for example, the case of 2 binary causes X1 and X2 for some outcome D. Each sufficient cause may involve background causes as well as either or both of X1 and X2 or the complements of X1 and X2, which we will denote by X̄1 and X̄2. In the case of 2 binary causes, Greenland and Poole7 thus enumerate 9 different sufficient causes: A1, A2X1, A3X̄1, A4X2, A5X̄2, A6X1X2, A7X̄1X2, A8X1X̄2, and A9X̄1X̄2. For a particular outcome D, only some of these sufficient causes might be present; for example, if the presence of X1 or X2 can never prevent the outcome, then none of the sufficient causes with X̄1 or X̄2 will be present, that is, none of A3X̄1, A5X̄2, A7X̄1X2, A8X1X̄2, and A9X̄1X̄2 will be present and the only possible sufficient causes will be A1, A2X1, A4X2, and A6X1X2.
When none of the causes of interest X1 and X2 can ever prevent the outcome, we will say that the effects of X1 and X2 on D are monotonic. If we let Dx1x2 denote the counterfactual value of D after intervening to set X1 = x1 and X2 = x2, then the effects of X1 and X2 on D are monotonic if Dx1x2 is nondecreasing in x1 and x2. The equivalence of the definitions of monotonicity based on counterfactuals and on sufficient causes is discussed elsewhere.8
If, for the ith sufficient cause, no background causes are necessary, then Ai = 1. If for example the outcome D always occurred whenever X1 = 1 and X2 = 1, then the 6th sufficient cause in the list would be X1X2 rather than A6X1X2. Now instead suppose that for all individuals D = 1 if and only if either X1 = 1 or X2 = 1. Greenland and Brumback9 note that several different sets of sufficient causes could represent this response pattern. For example, if there were 3 sufficient causes, X1X2, X̄1X2, and X1X̄2, this would replicate the response pattern. However the 2 sufficient causes, X1 and X2, would also replicate the response pattern. VanderWeele and Robins5 formally defined a sufficient cause interaction to be present between X1 and X2 (or more generally between X1,…,Xk) if for every set of sufficient causes that replicates the response patterns there is a sufficient cause in which X1 and X2 are both present (or more generally in which X1,…,Xk are all present). Thus if a sufficient cause interaction between X1 and X2 is present then there must be some mechanism, which the sufficient cause represents, which requires the presence of both X1 and X2 to operate.
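The response-pattern example can be checked mechanically. The following is a minimal sketch, assuming all background causes equal 1 as in the example; the dictionary encoding of a sufficient cause and the function names are illustrative choices, not notation from the cited papers.

```python
from itertools import product

# A sufficient cause is encoded as {exposure index: required value}, where
# 1 means the exposure itself and 0 means its complement is a component.
# Background causes A_i are assumed to equal 1 (hypothetical simplification).

def outcome(sufficient_causes, x):
    """D = 1 if all components of at least one sufficient cause are present."""
    return int(any(all(x[i] == v for i, v in sc.items())
                   for sc in sufficient_causes))

def response_pattern(sufficient_causes):
    """Outcome for every exposure combination (x1, x2)."""
    return {x: outcome(sufficient_causes, x) for x in product((0, 1), repeat=2)}

# X1X2, (not X1)X2, X1(not X2)
three_causes = [{0: 1, 1: 1}, {0: 0, 1: 1}, {0: 1, 1: 0}]
# X1 alone, X2 alone
two_causes = [{0: 1}, {1: 1}]

# Target pattern: D = 1 if and only if X1 = 1 or X2 = 1
target = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}

print(response_pattern(three_causes) == target)
print(response_pattern(two_causes) == target)
```

Both sets reproduce the same response pattern, yet only the first contains a sufficient cause with X1 and X2 both present, which is why the definition quantifies over every set of sufficient causes that replicates the pattern.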
VanderWeele and Robins4,5 furthermore derived empirical conditions that were sufficient to conclude that a sufficient cause interaction was present. Let px1x2 = E(D|X1 = x1, X2 = x2). It was shown that for a binary outcome D and 2 binary exposures X1 and X2, if the effects of X1 and X2 on D are unconfounded then if
$$p_{11} - p_{10} - p_{01} > 0 \tag{1}$$
then a sufficient cause interaction must be present between X1 and X2. It was further shown that if the effects of X1 and X2 on D are monotonic (ie, if neither X1 nor X2 can ever prevent the outcome) then if
$$p_{11} - p_{10} - p_{01} + p_{00} > 0 \tag{2}$$
then a sufficient cause interaction must be present between X1 and X2. If we let
$$RR_{x_1 x_2} = \frac{p_{x_1 x_2}}{p_{00}}$$
denote the relative risk that D = 1 when X1 = x1 and X2 = x2 then provided p00 > 0, condition 1 can be rewritten as RR11 − RR10 − RR01 > 0 and condition 2 can be rewritten as RR11 − RR10 − RR01 + 1 > 0. Condition 2 is simply a condition for “superadditivity” or positive interaction on the risk difference scale; however, it is only applicable for sufficient cause interactions when the effects of both X1 and X2 on D are monotonic. Condition 1 is a stronger condition and applies without making assumptions about monotonicity. The intuition for condition 1 is that, if it holds, then there must be some individuals for whom D11 = 1 but D10 = D01 = 0 and, if this is the case, then there must be a sufficient cause with both X1 and X2 present.5
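As an illustration, conditions 1 and 2 (p11 − p10 − p01 > 0 and p11 − p10 − p01 + p00 > 0) can be evaluated directly from the 4 conditional risks. This is a sketch only; the risk values below are hypothetical and the function names are not from the source.

```python
def condition_1(p):
    """Condition 1: p11 - p10 - p01 > 0 (no monotonicity assumption)."""
    return p[1, 1] - p[1, 0] - p[0, 1] > 0

def condition_2(p):
    """Condition 2: p11 - p10 - p01 + p00 > 0 (requires monotonic effects)."""
    return p[1, 1] - p[1, 0] - p[0, 1] + p[0, 0] > 0

def rr_contrast(p):
    """RR11 - RR10 - RR01, the relative-risk form of condition 1."""
    if p[0, 0] <= 0:
        raise ValueError("the relative-risk form requires p00 > 0")
    rr = {k: v / p[0, 0] for k, v in p.items()}
    return rr[1, 1] - rr[1, 0] - rr[0, 1]

# Hypothetical risks keyed by (x1, x2)
risks_a = {(0, 0): 0.05, (1, 0): 0.10, (0, 1): 0.15, (1, 1): 0.40}
risks_b = {(0, 0): 0.10, (1, 0): 0.20, (0, 1): 0.20, (1, 1): 0.35}

for name, p in (("a", risks_a), ("b", risks_b)):
    print(name, condition_1(p), condition_2(p), round(rr_contrast(p), 6))
```

In the second hypothetical data set only condition 2 holds, so a sufficient cause interaction could be concluded there only if monotonicity were also assumed.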
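Extensions to 3-way sufficient cause interactions were also given by VanderWeele and Robins.5 For 3 binary exposures X1, X2, and X3, let px1x2x3 = E(D|X1 = x1, X2 = x2, X3 = x3). Suppose that the effects of X1, X2, and X3 on D are unconfounded; then if
$$p_{111} - p_{110} - p_{101} - p_{011} > 0 \tag{3}$$
then a sufficient cause interaction must be present between X1, X2, and X3. Finally if the effects of X1, X2, and X3 on D are monotonic then any of the following 3 conditions imply that a sufficient cause interaction is present between X1, X2, and X3:
$$\begin{aligned}
p_{111} - p_{110} - p_{101} - p_{011} + p_{100} + p_{010} &> 0 \\
p_{111} - p_{110} - p_{101} - p_{011} + p_{100} + p_{001} &> 0 \\
p_{111} - p_{110} - p_{101} - p_{011} + p_{010} + p_{001} &> 0
\end{aligned} \tag{4}$$
If we let
$$RR_{x_1 x_2 x_3} = \frac{p_{x_1 x_2 x_3}}{p_{000}}$$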
denote the relative risk that D = 1 when X1 = x1, X2 = x2 and X3 = x3 then provided p000 > 0, condition 3 can be rewritten as RR111 − RR110 − RR101 − RR011 > 0 and similarly the conditions given in 4 can also be rewritten in terms of relative risks. If the effects of {X1, X2} or {X1, X2, X3} on D are unconfounded conditional on some set of covariates C, then conditions 1–4 can also be made conditional on C. A sufficient cause interaction is then present if the conditions hold in any stratum of the confounding variables C. In the context of no confounding factors, the fact that condition 2 was sufficient to conclude the presence of a sufficient cause interaction was stated explicitly and proved by Rothman and Greenland3 and it was also anticipated elsewhere.10,11 Theory concerning sufficient causes developed in VanderWeele and Robins5 was necessary to derive conditions 1, 3, and 4. Note that these are sufficient conditions for a sufficient cause interaction, that is, if they hold then a sufficient cause interaction must be present. They are not, however, necessary conditions; a sufficient cause interaction might be present even if they do not hold. See VanderWeele and Robins4,5 for further discussion.
Sufficient Cause Interaction and Linear Statistical Models
In this section we will relate sufficient cause interactions to interactions arising in linear statistical models. The discussion follows from that given in VanderWeele and Robins.5 For simplicity we will assume that the causal effects of {X1, X2} or alternatively {X1, X2, X3} are unconfounded. Below we will consider also settings in which the effects of the exposures of interest are confounded by some set of measured confounding variables C. Consider first the setting of 2 causes of interest, X1 and X2. To model disease with a linear statistical model one could use a saturated Bernoulli regression model:
$$P(D = 1 \mid X_1 = x_1, X_2 = x_2) = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_1 x_2 \tag{5}$$
Because X1 and X2 are binary and 4 coefficients are in the model, the model will fit the conditional probabilities of the outcomes perfectly and the model is said to be saturated. In this linear model, one would test for a statistical interaction by testing the hypothesis α3 = 0. We will consider first the case of monotonic effects. Condition 2 states that if the effects of X1 and X2 on D are monotonic and p11 − p10 − p01 + p00 > 0, then there is a sufficient cause interaction between X1 and X2. We may write this condition in terms of the coefficients in the linear statistical model 5 as follows:
$$p_{11} - p_{10} - p_{01} + p_{00} = (\alpha_0 + \alpha_1 + \alpha_2 + \alpha_3) - (\alpha_0 + \alpha_1) - (\alpha_0 + \alpha_2) + \alpha_0 = \alpha_3 > 0$$
Thus in the case of monotonic effects, if α3 > 0 then a sufficient cause interaction is necessarily present between X1 and X2. When the effects of X1 and X2 on D are not monotonic we need condition 1, that p11 − p10 − p01 > 0, to be able to conclude the presence of a sufficient cause interaction. We may also rewrite this condition in terms of the coefficients in the linear statistical model 5:
$$p_{11} - p_{10} - p_{01} = (\alpha_0 + \alpha_1 + \alpha_2 + \alpha_3) - (\alpha_0 + \alpha_1) - (\alpha_0 + \alpha_2) = \alpha_3 - \alpha_0 > 0$$
In this case, we need that α3 > α0 to be able to conclude the presence of a sufficient cause interaction. Note that these statements concern the true parameters. In practice, of course, the parameters will have to be estimated from data, and inference concerning the true parameters drawn from the estimates and their confidence intervals.
We see then that a test for a statistical interaction only implies a test for a sufficient cause interaction in the case of monotonic effects, not in general, and that even with monotonic effects a statistical interaction implies a sufficient cause interaction only in the case of a positive interaction (α3 > 0); a negative interaction (α3 < 0) does not suffice. Without the assumption of monotonic effects, a test for a sufficient cause interaction, α3 > α0, will only be implied by a test for a statistical interaction, α3 > 0, if α0 = p00 = 0, that is, if the baseline risk for the outcome when both exposures X1 and X2 are absent is 0. In the Appendix we also relate conditions for 3-way sufficient cause interactions, conditions 3 and 4, to 3-way statistical interactions in linear models. There it is shown that for 3-way sufficient cause interactions neither the condition with monotonicity, condition 4, nor the condition without monotonicity, condition 3, is implied by a test for the presence of a 3-way statistical interaction in a linear model for the probability of the outcome.
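The correspondence between the coefficients of the saturated linear model P(D = 1|X1 = x1, X2 = x2) = α0 + α1x1 + α2x2 + α3x1x2 and the 2 conditions can be illustrated numerically. The sketch below uses hypothetical risks; because the model is saturated, the coefficients are obtained by differencing the risks rather than by fitting.

```python
import math

def saturated_linear_coefficients(p):
    """Coefficients of the saturated linear model for risks keyed by (x1, x2)."""
    a0 = p[0, 0]
    a1 = p[1, 0] - p[0, 0]
    a2 = p[0, 1] - p[0, 0]
    a3 = p[1, 1] - p[1, 0] - p[0, 1] + p[0, 0]
    return a0, a1, a2, a3

# Hypothetical risks: positive additive interaction, nonzero baseline risk
risks = {(0, 0): 0.10, (1, 0): 0.20, (0, 1): 0.20, (1, 1): 0.35}
a0, a1, a2, a3 = saturated_linear_coefficients(risks)

# Condition 2 (needs monotonicity) is a3 > 0;
# condition 1 (no monotonicity) is a3 > a0.
statistical_interaction = a3 > 0
condition_without_monotonicity = a3 > a0

print(round(a0, 4), round(a3, 4))
print(statistical_interaction, condition_without_monotonicity)
```

Here α3 is positive but smaller than α0, so the statistical interaction supports a sufficient cause interaction only if monotonicity is assumed.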
Sufficient Cause Interaction and Log-Linear and Logistic Models
In this section we will relate sufficient cause interactions to interaction terms arising in log-linear and logistic models. There is a substantial epidemiologic literature on using log-linear and logistic models in tests for the additivity of effects.12–15 Here we consider some of the implications of this literature for testing for sufficient cause interactions. Consider the following saturated log-linear model for the probability of the outcome:
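$$\log P(D = 1 \mid X_1 = x_1, X_2 = x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 \tag{6}$$
From this model it follows that p11 = exp(β0 + β1 + β2 + β3), p10 = exp(β0 + β1), p01 = exp(β0 + β2), and p00 = exp(β0). Condition 2, which suffices to conclude the presence of a sufficient cause interaction under the monotonicity assumption, can be rewritten in terms of the coefficients in model 6 as follows:
$$e^{\beta_0 + \beta_1 + \beta_2 + \beta_3} - e^{\beta_0 + \beta_1} - e^{\beta_0 + \beta_2} + e^{\beta_0} > 0$$
which can be rewritten as
$$e^{\beta_1 + \beta_2 + \beta_3} - e^{\beta_1} - e^{\beta_2} + 1 > 0 \tag{7}$$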
This condition can be used to test for a sufficient cause interaction between X1 and X2 by using model 6 provided the effects of X1 and X2 on D are monotonic. Note that the quantity given in 7 is what Rothman12 defined as the relative excess risk due to interaction (RERI) and which Rothman and Greenland3 call the interaction contrast or ICR. Note that the condition RERI > 0 implies that there is a sufficient cause interaction only if the effects of X1 and X2 on D are monotonic.
We will now characterize the conditions such that a test for a statistical interaction, β3 > 0, in the log-linear model 6 corresponds to a test for a sufficient cause interaction. Suppose that β1 ≥ 0. We can rewrite condition 7 as
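$$e^{\beta_1}\left(e^{\beta_2 + \beta_3} - 1\right) > e^{\beta_2} - 1 \tag{8}$$
Suppose β3 > 0; then because β1 ≥ 0 we have that exp(β1)(exp(β2 + β3) − 1) ≥ exp(β2 + β3) − 1 > exp(β2) − 1 (the first inequality also uses exp(β2 + β3) ≥ 1, which holds when β2 ≥ 0) and thus condition 8 must be satisfied and a sufficient cause interaction between X1 and X2 must be present. By symmetry the conclusion would also hold if β2 ≥ 0 and β3 > 0. Note that because
$$\beta_1 = \log\left(\frac{p_{10}}{p_{00}}\right)$$
and
$$\beta_2 = \log\left(\frac{p_{01}}{p_{00}}\right)$$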
the conditions β1 ≥ 0 and β2 ≥ 0 are necessarily satisfied, as we have assumed that the effects of X1 and X2 on D are monotonic. We have thus established the following result.
Result 1. Suppose the effects of X1 and X2 on D are monotonic and unconfounded. If in model 6, β3 > 0 then there is a sufficient cause interaction between X1 and X2.
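Result 1 can be spot-checked numerically. The sketch below draws hypothetical coefficients with β1 ≥ 0, β2 ≥ 0 (as implied by monotonicity) and β3 > 0, and evaluates the RERI expression exp(β1 + β2 + β3) − exp(β1) − exp(β2) + 1. The sampling ranges and seed are arbitrary assumptions, and a random search is an illustration, not a proof.

```python
import math
import random

def reri(b1, b2, b3):
    """RERI written in terms of log-linear coefficients."""
    return math.exp(b1 + b2 + b3) - math.exp(b1) - math.exp(b2) + 1

random.seed(1)
# Nonnegative main effects and a strictly positive interaction term
draws = [(random.uniform(0, 3), random.uniform(0, 3), random.uniform(1e-6, 3))
         for _ in range(10_000)]
min_reri = min(reri(*d) for d in draws)
print(min_reri > 0)

# With a preventive main effect (b2 < 0) a positive b3 no longer guarantees
# RERI > 0, which is why the nonnegativity of both main effects matters.
print(reri(2.0, -2.0, 0.1))
```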
Thus, under monotonicity, a test for a statistical interaction in model 6, β3 > 0, implies a test for a sufficient cause interaction. It can similarly be verified that if β1 > 0 and β2 > 0 then β3 ≥ 0 implies that there is a sufficient cause interaction between X1 and X2. Thus if β1 > 0 and β2 > 0 then a sufficient cause interaction will be present even if β3 = 0.16 It is thus also clear that a sufficient cause interaction can be present even under a multiplicative model, that is, if β3 = 0 so that log(P(D = 1|X1 = x1, X2 = x2)) = β0 + β1x1 + β2x2 and
$$\frac{p_{11}}{p_{00}} = \frac{p_{10}}{p_{00}} \cdot \frac{p_{01}}{p_{00}}$$
that is, p11p00 = p10p01. Following VanderWeele and Robins,4 suppose that in some study it is found that the relative risk of lung cancer with only an asbestos exposure is 3, the relative risk of lung cancer with only the smoking exposure is 10, and the relative risk of lung cancer with both the asbestos and the smoking exposure is 30; then the risks are multiplicative and thus β3 = 0. But from this information alone, provided the effects of asbestos and smoking on lung cancer are monotonic, one can conclude that a sufficient cause interaction between asbestos and smoking must be present because
$$\beta_1 = \log(3) > 0$$
and similarly
$$\beta_2 = \log(10) > 0$$
Thus, assuming that our estimates are unconfounded and that the effects of asbestos and smoking on lung cancer are monotonic we could conclude that there must be a causal mechanism that requires both the exposure to smoking and to asbestos to operate.
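The arithmetic of the asbestos and smoking example can be reproduced in a few lines. This sketch takes the 3 relative risks from the text; treating asbestos as X1 and smoking as X2 is an arbitrary labeling assumption.

```python
import math

rr10, rr01, rr11 = 3.0, 10.0, 30.0   # asbestos only, smoking only, both

beta1 = math.log(rr10)
beta2 = math.log(rr01)
beta3 = math.log(rr11) - beta1 - beta2   # interaction on the log scale

# RERI = RR11 - RR10 - RR01 + 1 (relative-risk form of condition 2)
reri = rr11 - rr10 - rr01 + 1
# RR11 - RR10 - RR01 (relative-risk form of condition 1)
contrast_1 = rr11 - rr10 - rr01

print(abs(beta3) < 1e-12)   # multiplicative model: no statistical interaction
print(reri, contrast_1)
```

Both contrasts are positive even though the log-scale interaction term is 0, so in this example the conclusion does not actually hinge on the monotonicity assumption.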
We now consider the case when it cannot be assumed that the effects of X1 and X2 on D are monotonic. Condition 1, which suffices to conclude the presence of a sufficient cause interaction without the monotonicity assumption, can be rewritten in terms of the coefficients in model 6 as follows:
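$$e^{\beta_0 + \beta_1 + \beta_2 + \beta_3} - e^{\beta_0 + \beta_1} - e^{\beta_0 + \beta_2} > 0$$
which can be rewritten as
$$e^{\beta_1 + \beta_2 + \beta_3} - e^{\beta_1} - e^{\beta_2} > 0 \tag{9}$$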
This condition can be used to test for a sufficient cause interaction between X1 and X2 by using model 6 even when the effects of X1 and X2 on D are not monotonic. Note that condition 9 can be rewritten as RERI > 1.
We will now characterize the conditions such that a test for a statistical interaction, β3 > 0, in the log-linear model in 6 corresponds to a test for a sufficient cause interaction. We can rewrite condition 9 as
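$$\tfrac{1}{2} e^{\beta_1 + \beta_2 + \beta_3} - e^{\beta_1} + \tfrac{1}{2} e^{\beta_1 + \beta_2 + \beta_3} - e^{\beta_2} > 0$$
or as
$$e^{\beta_1}\left(\tfrac{1}{2} e^{\beta_2 + \beta_3} - 1\right) + e^{\beta_2}\left(\tfrac{1}{2} e^{\beta_1 + \beta_3} - 1\right) > 0 \tag{10}$$
Clearly if (½)exp(β1 + β3) − 1 > 0 and (½)exp(β2 + β3) − 1 > 0 then condition 10 will be satisfied. These 2 conditions we can rewrite as exp(β3) > 2exp(−β1) and exp(β3) > 2exp(−β2) or as β3 > log(2) − β1 and β3 > log(2) − β2. We have thus established the following result.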
Result 2. Suppose the effects of X1 and X2 on D are unconfounded. If in model 6 both β3 > log(2) − β1 and β3 > log(2) − β2 then there must be a sufficient cause interaction between X1 and X2.
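If β1 ≥ log(2) and β2 ≥ log(2) then the conditions in result 2 will be satisfied if β3 > 0. Let
$$RR_{10} = \frac{p_{10}}{p_{00}}$$
and
$$RR_{01} = \frac{p_{01}}{p_{00}}$$
denote the relative risks when either only exposure X1 or X2, respectively, is present. Note that
$$\beta_1 = \log(RR_{10})$$
and
$$\beta_2 = \log(RR_{01})$$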
and so the conditions β1 ≥ log(2) and β2 ≥ log(2) are simply that RR10 ≥ 2 and RR01 ≥ 2. Thus if both RR10 ≥ 2 and RR01 ≥ 2 then a test for a statistical interaction in model 6, β3 > 0, corresponds to a test for a sufficient cause interaction. If either of the conditions RR10 ≥ 2 and RR01 ≥ 2 does not hold then the condition for a statistical interaction, β3 > 0, does not in general imply the presence of a sufficient cause interaction.
Also, note that by Result 2 if β1 ≥ 0 and β2 ≥ 0 then β3 > log(2) implies the presence of a sufficient cause interaction. Note that Rothman et al16 use the results of VanderWeele and Robins4,5 to show that if RR10 > 2 and RR01 > 2, that is, if the inequalities are strict, then a sufficient cause interaction must be present even if β3 = 0. Result 2 generalizes this observation of Rothman et al.16 Note also that a sufficient cause interaction might be present even if β3 < 0. For example, β3 might be negative and yet we might have both β3 > log(2) − β1 and β3 > log(2) − β2 if β1 and β2 are sufficiently large.
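Result 2 can likewise be spot-checked. The sketch below tests, over hypothetical randomly drawn coefficients, that β3 > log(2) − β1 and β3 > log(2) − β2 together imply exp(β1 + β2 + β3) − exp(β1) − exp(β2) > 0, and it includes one case with a negative β3. The sampling range and seed are arbitrary assumptions.

```python
import math
import random

def condition_9(b1, b2, b3):
    """exp(b1 + b2 + b3) - exp(b1) - exp(b2) > 0, no monotonicity assumed."""
    return math.exp(b1 + b2 + b3) - math.exp(b1) - math.exp(b2) > 0

def result_2_premise(b1, b2, b3):
    """Both inequalities on b3 required by Result 2."""
    return b3 > math.log(2) - b1 and b3 > math.log(2) - b2

random.seed(0)
draws = [tuple(random.uniform(-3, 3) for _ in range(3)) for _ in range(10_000)]
# Draws that satisfy the premise but fail the condition (none are expected)
violations = [d for d in draws if result_2_premise(*d) and not condition_9(*d)]
print(len(violations))

# Negative interaction term with large main effects (RR10 = RR01 = 5)
b1 = b2 = math.log(5)
b3 = -0.5
print(result_2_premise(b1, b2, b3), condition_9(b1, b2, b3))
```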
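In the smoking and asbestos example, if we were unwilling to assume that the effects of smoking and asbestos on lung cancer were monotonic we could still conclude the presence of a sufficient cause interaction by result 2 because
$$\beta_3 = 0 > \log(2) - \log(3) = \log(2) - \beta_1 \quad \text{and} \quad \beta_3 = 0 > \log(2) - \log(10) = \log(2) - \beta_2$$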
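Consider now the case of logistic regression. A saturated logistic model for the probability of the outcome is given by:
$$\operatorname{logit} P(D = 1 \mid X_1 = x_1, X_2 = x_2) = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_1 x_2 \tag{11}$$
Conditions 1 and 2 can be respectively written in terms of the coefficients in model 11 as follows:
$$\frac{e^{\gamma_0 + \gamma_1 + \gamma_2 + \gamma_3}}{1 + e^{\gamma_0 + \gamma_1 + \gamma_2 + \gamma_3}} - \frac{e^{\gamma_0 + \gamma_1}}{1 + e^{\gamma_0 + \gamma_1}} - \frac{e^{\gamma_0 + \gamma_2}}{1 + e^{\gamma_0 + \gamma_2}} > 0 \tag{12}$$
and
$$\frac{e^{\gamma_0 + \gamma_1 + \gamma_2 + \gamma_3}}{1 + e^{\gamma_0 + \gamma_1 + \gamma_2 + \gamma_3}} - \frac{e^{\gamma_0 + \gamma_1}}{1 + e^{\gamma_0 + \gamma_1}} - \frac{e^{\gamma_0 + \gamma_2}}{1 + e^{\gamma_0 + \gamma_2}} + \frac{e^{\gamma_0}}{1 + e^{\gamma_0}} > 0 \tag{13}$$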
These conditions can be used to test for a sufficient cause interaction between X1 and X2 by using the logistic model given in 11. However, there appear to be no simple conditions on the coefficients γ0, γ1, and γ2 that imply that whenever γ3 > 0 condition 12 or 13 is satisfied. Nevertheless, if the outcome is sufficiently rare so that the odds ratio closely approximates the risk ratio, so that
$$\frac{P(D = 1 \mid X_1 = x_1, X_2 = x_2)}{P(D = 1 \mid X_1 = 0, X_2 = 0)} \approx e^{\gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_1 x_2}$$
then the discussion above will apply also to the coefficients in model 11. That is, if the outcome is sufficiently rare, then
$$e^{\gamma_1 + \gamma_2 + \gamma_3} - e^{\gamma_1} - e^{\gamma_2} + 1 > 0$$
implies the presence of a sufficient cause interaction if the effects of X1 and X2 on D are monotonic, and then a test of γ3 > 0 implies a test for a sufficient cause interaction. Also, if the effects of X1 and X2 on D are not monotonic, then
$$e^{\gamma_1 + \gamma_2 + \gamma_3} - e^{\gamma_1} - e^{\gamma_2} > 0$$
implies the presence of a sufficient cause interaction and if γ1 ≥ log(2) and γ2 ≥ log(2) then a test of γ3 > 0 implies a test for a sufficient cause interaction; if γ1 ≥ 0 and γ2 ≥ 0 then a test of γ3 > log(2) implies a test for a sufficient cause interaction. Provided the outcome is rare these tests for the coefficients of a logistic regression model could be used to test for sufficient cause interactions by using data arising from case-control studies. In the Appendix we also relate conditions for 3-way sufficient cause interactions, that is, conditions 3 and 4, to 3-way statistical interactions in log-linear and logistic models.
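How much the rare-outcome approximation matters can be seen by comparing the RERI computed from odds ratios with the RERI computed from the risks implied by a saturated logistic model. The following sketch uses hypothetical coefficients (odds ratios of 3 and 10 with no interaction on the logit scale) and 2 hypothetical baseline log-odds.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def risks_from_logistic(g0, g1, g2, g3):
    """Risks keyed by (x1, x2) under a saturated logistic model."""
    return {(x1, x2): expit(g0 + g1 * x1 + g2 * x2 + g3 * x1 * x2)
            for x1 in (0, 1) for x2 in (0, 1)}

def reri_from_risks(p):
    """RERI on the risk-ratio scale."""
    return (p[1, 1] - p[1, 0] - p[0, 1] + p[0, 0]) / p[0, 0]

def reri_from_odds_ratios(g1, g2, g3):
    """RERI with odds ratios substituted for risk ratios."""
    return math.exp(g1 + g2 + g3) - math.exp(g1) - math.exp(g2) + 1

g1, g2, g3 = math.log(3), math.log(10), 0.0
reri_or = reri_from_odds_ratios(g1, g2, g3)
reri_rare = reri_from_risks(risks_from_logistic(-7.0, g1, g2, g3))   # rare outcome
reri_common = reri_from_risks(risks_from_logistic(0.0, g1, g2, g3))  # common outcome

print(round(reri_or, 3), round(reri_rare, 3), round(reri_common, 3))
```

With the rare baseline the 2 versions of the RERI are close; with the common baseline they differ even in sign, which illustrates why the logistic coefficients can stand in for the log-linear ones only when the outcome is rare.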
Implications of Confounding Variables
Our discussion thus far has assumed that no confounding variables are present, that is, that the exposures X1 and X2 are effectively randomized. As noted above, if the effects of X1 and X2 are unconfounded given some set of variables C the conditions derived in VanderWeele and Robins4,5 can be made conditional on C. For example, let px1x2c = E(D|X1 = x1, X2 = x2, C = c), then condition 1 becomes
$$p_{11c} - p_{10c} - p_{01c} > 0 \tag{14}$$
and condition 2 becomes
$$p_{11c} - p_{10c} - p_{01c} + p_{00c} > 0 \tag{15}$$
If these conditions hold in any stratum C = c, this suffices to conclude the presence of a sufficient cause interaction. However, if 1 or more of the confounding variables in C is continuous, this can raise difficulties in the models we have been considering. The models we have considered thus far have all had only binary variables and have all been saturated models so that they fit the conditional probabilities of the outcomes perfectly. With a continuous covariate this will not be possible, and one will need to specify a model that will impose certain additional distributional assumptions. Tests for sufficient cause interactions will be valid only if the assumption of no unmeasured confounders holds true (at least approximately) and if the model is correctly specified. With saturated models there was no danger of misspecification.
It is also well known that when a Bernoulli model with linear or log-linear link is used with 1 or more continuous covariates C, the convergence properties of maximum likelihood estimators are generally poor17 and fitted probabilities can lie outside the range of [0, 1]. Even if the parameter estimates do converge and if the fitted probabilities are all between 0 and 1, the tests for sufficient cause interaction in certain cases may be quite sensitive to model specification.
Let C denote a set of confounding variables. Suppose that the following linear model for the probability of the outcome is used:
$$P(D = 1 \mid X_1 = x_1, X_2 = x_2, C = c) = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_1 x_2 + f(\alpha_4, c) \tag{16}$$
where f is some function of the parameter α4 and the confounding variables c. Note that C may be multivariate and α4 may denote a vector of coefficients. Condition 15 can be written in terms of the regression coefficients in model 16 as follows:
$$\alpha_3 > 0 \tag{17}$$
Condition 14 can be written in terms of the regression coefficients in model 16 as:
$$\alpha_3 - \alpha_0 - f(\alpha_4, c) > 0 \tag{18}$$
Note that unlike condition 17, condition 18 in fact depends on the value of c and is thus sensitive to the specification of f(α4, c).
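Consider now a log-linear model for the probability of the outcome that includes the confounding variables C:
$$\log P(D = 1 \mid X_1 = x_1, X_2 = x_2, C = c) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + g(\beta_4, c) \tag{19}$$
It can be shown that model 19 effectively implies that a change in C from c to c* multiplies the likelihood of each of the background causes (the Ai variables) being present by a factor of exp(g(β4, c*) − g(β4, c)). Condition 15 can be written in terms of the regression coefficients in model 19 as:
$$e^{\beta_0 + \beta_1 + \beta_2 + \beta_3 + g(\beta_4, c)} - e^{\beta_0 + \beta_1 + g(\beta_4, c)} - e^{\beta_0 + \beta_2 + g(\beta_4, c)} + e^{\beta_0 + g(\beta_4, c)} > 0$$
which can be rewritten as exp(β0 + β1 + β2 + β3) − exp(β0 + β1) − exp(β0 + β2) + exp(β0) > 0. Similarly, condition 14 can be written in terms of the regression coefficients in model 19 as:
$$e^{\beta_0 + \beta_1 + \beta_2 + \beta_3 + g(\beta_4, c)} - e^{\beta_0 + \beta_1 + g(\beta_4, c)} - e^{\beta_0 + \beta_2 + g(\beta_4, c)} > 0$$
which can be rewritten as exp(β0 + β1 + β2 + β3) − exp(β0 + β1) − exp(β0 + β2) > 0. Neither of these 2 expressions depends on c nor on the specification of g(β4, c). If, however, C served also as an effect modifier for the effect of X1 or X2 on D on the log-linear scale so that the model for the conditional probabilities of the outcome was in fact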
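$$\log P(D = 1 \mid X_1 = x_1, X_2 = x_2, C = c) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + g(\beta_4, c) + g_1(\beta_5, c)\, x_1 + g_2(\beta_6, c)\, x_2$$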
then it is easy to verify that conditions 14 and 15 will once again depend on c and on the specification of g1(β5, c) and g2(β6, c). Clearly, with continuous confounding variables, the conclusions drawn about sufficient cause interaction will be sensitive to model specification. The issues of confounding and model specification are common to almost all research with observational data. Further work, however, remains to be done in determining how sensitive tests for sufficient cause interactions are to unmeasured confounding and model specification. In recent work, multiply robust tests have been derived for sufficient cause interactions that will be valid if either a model for the outcome is correctly specified or if a model for the joint probability of the exposures is correctly specified.18
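The contrast between the linear and log-linear specifications can be made concrete. In this sketch all coefficients, the linear forms chosen for the covariate terms (0.02c and 0.2c), and the covariate values are hypothetical; the point is only that the conditional version of condition 1 changes sign across strata under the linear model but not under the log-linear model without effect modification.

```python
import math

def linear_risk(x1, x2, c, a=(0.05, 0.05, 0.05, 0.10), slope=0.02):
    """Linear risk model with an additive covariate term slope * c."""
    a0, a1, a2, a3 = a
    return a0 + a1 * x1 + a2 * x2 + a3 * x1 * x2 + slope * c

def loglinear_risk(x1, x2, c, b=(-4.0, 0.5, 0.7, 0.3), slope=0.2):
    """Log-linear risk model with covariate term slope * c on the log scale."""
    b0, b1, b2, b3 = b
    return math.exp(b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2 + slope * c)

def conditional_condition_1(risk, c):
    """p11c - p10c - p01c > 0 within stratum c."""
    return risk(1, 1, c) - risk(1, 0, c) - risk(0, 1, c) > 0

def conditional_condition_2(risk, c):
    """p11c - p10c - p01c + p00c > 0 within stratum c."""
    return risk(1, 1, c) - risk(1, 0, c) - risk(0, 1, c) + risk(0, 0, c) > 0

strata = [0, 1, 2, 3]
linear_c1 = [conditional_condition_1(linear_risk, c) for c in strata]
linear_c2 = [conditional_condition_2(linear_risk, c) for c in strata]
loglinear_c1 = [conditional_condition_1(loglinear_risk, c) for c in strata]

print(linear_c1)
print(linear_c2)
print(loglinear_c1)
```

Under the linear model the verdict from the conditional version of condition 1 depends on which stratum is examined, so a misspecified covariate term could change the conclusion.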
DISCUSSION
We have focused here on linear, log-linear, and logistic models and have compared and contrasted tests for sufficient cause interactions to tests for interactions in statistical models. Only when the effects of X1 and X2 on D were monotonic did a test for a statistical interaction in linear and log-linear models correspond to a test for a sufficient cause interaction. In other cases, certain additional conditions on other regression coefficients were necessary for a test for a statistical interaction in a linear, log-linear, or logistic model to correspond to a test for a sufficient cause interaction.
Although our focus in this paper has been on the relationship between statistical interactions and sufficient cause interactions and has not been on estimation or modeling, the models we have discussed could be fit to data and used in testing for sufficient cause interactions. Logistic regression models are of course routinely used in epidemiologic research. As noted above, when the outcome is rare, the relative excess risk due to interaction (RERI) can be used to test for sufficient cause interactions. If the effects of the exposures on the outcome are monotonic then RERI > 0 implies the presence of a sufficient cause interaction. If it cannot be assumed that the effects of the exposures on the outcome are monotonic then a stronger condition, RERI > 1, can still be used to test for the presence of a sufficient cause interaction. This approach to testing for sufficient cause interactions by using logistic models could also be employed with case-control data. Linear and log-linear models are used less frequently for binary outcomes because of the convergence and fitting issues previously mentioned. Nevertheless, several authors discuss how such linear and log-linear Bernoulli regression models can be fit.17,19–21 Other approaches to estimating risk differences and relative risks are also available and could be used in the context of tests for sufficient cause interactions. Zou22 discusses using modified Poisson regression models to estimate risk ratios while controlling for covariates. Skrondal23 discusses the advantages of using linear odds models rather than logistic regression models to assess departures from additivity in case-control studies. Cheung24 discusses using a modified least squares regression approach to estimating risk differences while controlling for covariates. Lumley et al25 have recently compared several different approaches to estimating relative risks.
Moving from conclusions about statistical models to conclusions about causation and mechanisms always requires certain assumptions. To move from conclusions about association to conclusions about causation we first need that the assumption of no unmeasured confounding holds and second that the statistical model that is used is correctly specified. To draw conclusions about mechanisms, further conditions are needed. In this paper, we have shown that even if the assumption of no unmeasured confounding is met and if the statistical model is correctly specified, interaction terms in statistical models do not in general correspond to interaction or synergism in the sufficient cause sense, that is, to the joint presence of 2 causes in the same causal mechanism. We have, however, provided the appropriate conditions to allow one to conclude the presence of sufficient cause interactions from the coefficients of a statistical model.
REFERENCES
1. Siemiatycki J, Thomas DC. Biological models and statistical interactions: an example from multistage carcinogenesis.
Int J Epidemiol. 1981;10:383–387.
2. Thompson WD. Effect modification and the limits of biological inference from epidemiologic data.
J Clin Epidemiol. 1991;44:221–232.
3. Rothman KJ, Greenland S.
Modern Epidemiology. 2nd ed. Philadelphia: Lippincott-Raven; 1998.
4. VanderWeele TJ, Robins JM. The identification of synergism in the sufficient-component cause framework.
Epidemiology. 2007;18:329–339.
5. VanderWeele TJ, Robins JM. Empirical and counterfactual conditions for sufficient cause interactions.
Biometrika. 2008;95:49–61.
6. Rothman KJ. Causes.
Am J Epidemiol. 1976;104:587–592.
7. Greenland S, Poole C. Invariants and noninvariants in the concept of interdependent effects.
Scand J Work Environ Health. 1988;14:125–129.
8. VanderWeele TJ, Robins JM. Minimal sufficient causation and directed acyclic graphs.
Ann Statist. In press.
9. Greenland S, Brumback B. An overview of relations among causal modelling methods.
Int J Epidemiol. 2002;31:1030–1037.
10. Koopman JS. Interaction between discrete causes.
Am J Epidemiol. 1981;113:716–724.
11. Darroch JN, Borkent M. Synergism, attributable risk and interaction for two binary exposure factors.
Biometrika. 1994;81:259–270.
12. Rothman KJ.
Modern Epidemiology. 1st ed. Boston, MA: Little, Brown and Company; 1986.
13. Hosmer DW, Lemeshow S. Confidence interval estimation of interaction.
Epidemiology. 1992;3:452–456.
14. Assman SF, Hosmer DW, Lemeshow S, et al. Confidence intervals for measures of interaction.
Epidemiology. 1996;7:286–290.
15. Knol MJ, van der Tweel I, Grobbee DE, et al. Estimating interaction on an additive scale between continuous determinants in a logistic regression model.
Int J Epidemiol. 2007;36:1111–1118.
16. Rothman KJ, Greenland S, Lash TL.
Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008.
17. Wacholder S. Binomial regression in GLIM: estimating risk ratios and risk differences.
Am J Epidemiol. 1986;123:174–184.
18. Vansteelandt S, VanderWeele TJ, Robins JM. Multiply robust inference for statistical interactions.
J Am Statist Assoc. In press.
19. Greenland S. Estimating standardized parameters from generalized linear models.
Stat Med. 1991;10:1069–1074.
20. Greenland S. Model-based estimation of relative risks and other epidemiologic measures in studies of common outcomes and in case-control studies.
Am J Epidemiol. 2004;160:301–305.
21. Spiegelman D, Hertzmark E. Easy SAS calculations for risk or prevalence ratios and differences.
Am J Epidemiol. 2005;162:199–200.
22. Zou G. A modified Poisson regression approach to prospective studies with binary data.
Am J Epidemiol. 2004;159:702–706.
23. Skrondal A. Interaction as departure from additivity in case-control studies: a cautionary note.
Am J Epidemiol. 2003;158:251–258.
24. Cheung YB. A modified least-squares regression approach to the estimation of risk difference.
Am J Epidemiol. 2007;166:1337–1344.
25. Lumley T, Kronmal R, Ma S. Relative risk regression in medical research: models, contrasts, estimators, and algorithms. UW Biostatistics Working Paper Series. Working Paper 293. Available at:
http://www.bepress.com/uwbiostat/paper293.
APPENDIX
Three-Way Sufficient Cause Interactions in Linear Models
Here we relate 3-way sufficient cause interactions to interaction terms in linear statistical models following the discussion in VanderWeele and Robins.5 Consider the following saturated linear model for the probability of the outcome with 3 binary variables X1, X2, and X3:
$$P(D = 1 \mid X_1 = x_1, X_2 = x_2, X_3 = x_3) = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_1 x_2 + \alpha_5 x_1 x_3 + \alpha_6 x_2 x_3 + \alpha_7 x_1 x_2 x_3 \tag{20}$$
Under the assumption that the effects of X1, X2, and X3 on D are monotonic we can rewrite the conditions given in 4 in terms of the coefficients of the linear probability model given in 20. The 3 conditions become
$$\alpha_7 > \alpha_3, \qquad \alpha_7 > \alpha_2, \qquad \alpha_7 > \alpha_1$$
If the effects of X1, X2, and X3 on D are monotonic and any of these 3 conditions are satisfied then there must be a sufficient cause interaction between X1, X2, and X3. In this case, a test for a 3-way statistical interaction, α7 > 0, will imply a test for a 3-way sufficient cause interaction only if one of α1, α2, or α3 is 0. If it cannot be assumed that the effects of X1, X2, and X3 on D are monotonic we may use condition 3 which can be rewritten in terms of the coefficients of the linear probability model given in 20 as
$$\alpha_7 > 2\alpha_0 + \alpha_1 + \alpha_2 + \alpha_3$$
If this condition is satisfied then there must be a sufficient cause interaction between X1, X2, and X3 even if it cannot be assumed that the effects of X1, X2, and X3 on D are monotonic. In this case, a test for a 3-way statistical interaction, α7 > 0, will imply a test for a 3-way sufficient cause interaction only if 2α0 + α1 + α2 + α3 ≤ 0. Thus, for 3-way sufficient cause interactions neither the condition with monotonicity, condition 4, nor the condition without monotonicity, condition 3, is in general implied by a test for the presence of a 3-way statistical interaction, α7 > 0, in a linear model for the probability of the outcome. Stronger conditions are needed both with and without monotonicity.
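The 3-way identities can be checked numerically for a saturated linear model with main effects α1, α2, α3 and 3-way term α7. The sketch below recovers the coefficients from hypothetical risks by inclusion-exclusion; the risk values and helper names are illustrative assumptions.

```python
import math
from itertools import product

def saturated_coefficients(p):
    """Coefficient for each subset s of exposures, by inclusion-exclusion:
    sum over subsets t of s of (-1)^(|s| - |t|) * p[t]."""
    coef = {}
    for s in product((0, 1), repeat=3):
        total = 0.0
        for t in product((0, 1), repeat=3):
            if all(ti <= si for ti, si in zip(t, s)):
                total += (-1) ** (sum(s) - sum(t)) * p[t]
        coef[s] = total
    return coef

# Hypothetical risks keyed by (x1, x2, x3)
p = {(0, 0, 0): 0.02, (1, 0, 0): 0.05, (0, 1, 0): 0.06, (0, 0, 1): 0.04,
     (1, 1, 0): 0.12, (1, 0, 1): 0.10, (0, 1, 1): 0.11, (1, 1, 1): 0.40}

c = saturated_coefficients(p)
a0, a1, a2, a3 = c[0, 0, 0], c[1, 0, 0], c[0, 1, 0], c[0, 0, 1]
a7 = c[1, 1, 1]

# Contrast used without monotonicity: p111 - p110 - p101 - p011
contrast_3 = p[1, 1, 1] - p[1, 1, 0] - p[1, 0, 1] - p[0, 1, 1]
# First contrast used with monotonicity adds p100 + p010
contrast_4a = contrast_3 + p[1, 0, 0] + p[0, 1, 0]

print(round(a7, 4), round(contrast_3, 4), round(contrast_4a, 4))
```

In this hypothetical example α7 = 0.20, and the 2 contrasts equal α7 − (2α0 + α1 + α2 + α3) and α7 − α3 respectively.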
Three-Way Sufficient Cause Interactions in Log-Linear and Logistic Models
Here we relate 3-way sufficient cause interactions to interaction terms in log-linear and logistic models. Consider the following saturated log-linear model for the probability of the outcome:
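$$\log P(D = 1 \mid X_1 = x_1, X_2 = x_2, X_3 = x_3) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3 + \beta_6 x_2 x_3 + \beta_7 x_1 x_2 x_3 \tag{21}$$
Consider first the condition for a 3-way sufficient cause interaction without the assumption that the effects of X1, X2, and X3 on D are monotonic. Condition 3 can be written in terms of the coefficients in model 21 as
$$e^{\beta_0 + \beta_1 + \beta_2 + \beta_3 + \beta_4 + \beta_5 + \beta_6 + \beta_7} - e^{\beta_0 + \beta_1 + \beta_2 + \beta_4} - e^{\beta_0 + \beta_1 + \beta_3 + \beta_5} - e^{\beta_0 + \beta_2 + \beta_3 + \beta_6} > 0$$
which can be rewritten as
$$e^{\beta_1 + \beta_2 + \beta_4}\left(\tfrac{1}{3} e^{\beta_3 + \beta_5 + \beta_6 + \beta_7} - 1\right) + e^{\beta_1 + \beta_3 + \beta_5}\left(\tfrac{1}{3} e^{\beta_2 + \beta_4 + \beta_6 + \beta_7} - 1\right) + e^{\beta_2 + \beta_3 + \beta_6}\left(\tfrac{1}{3} e^{\beta_1 + \beta_4 + \beta_5 + \beta_7} - 1\right) > 0 \tag{22}$$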
It is then easily verified that if β3 + β5 + β6 > log(3) and β2 + β4 + β6 > log(3) and β1 + β4 + β5 > log(3) and β7 > 0 then condition 22 is satisfied. Thus if β3 + β5 + β6 > log(3) and β2 + β4 + β6 > log(3) and β1 + β4 + β5 > log(3), then a test for a 3-way statistical interaction, β7 > 0, implies a 3-way sufficient cause interaction.
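Now suppose that the effects of X1, X2, and X3 on D are monotonic. Consider the first condition in 4, p111 – p110 – p101 – p011 + p100 + p010 > 0. This can be written in terms of the coefficients in model 21 as
$$e^{\beta_0 + \beta_1 + \beta_2 + \beta_3 + \beta_4 + \beta_5 + \beta_6 + \beta_7} - e^{\beta_0 + \beta_1 + \beta_2 + \beta_4} - e^{\beta_0 + \beta_1 + \beta_3 + \beta_5} - e^{\beta_0 + \beta_2 + \beta_3 + \beta_6} + e^{\beta_0 + \beta_1} + e^{\beta_0 + \beta_2} > 0$$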
which can in turn be rewritten as
or as
where
and
. It can then be verified that if β3 + β5 + β6 > log(3 – c1) and β2 + β4 + β6 > log(3 – c2) and β1 + β4 + β5 > log(3 – c3) and β7 > 0 then condition 23 is satisfied. Thus, if the effects of X1, X2, and X3 on D are monotonic and if β3 + β5 + β6 > log(3 – c1) and β2 + β4 + β6 > log(3 – c2) and β1 + β4 + β5 > log(3 – c3) then a test for a 3-way statistical interaction, β7 > 0, implies a test for a 3-way sufficient cause interaction. Note that since c1, c2, and c3 are all positive quantities, the conditions required under monotonicity for a statistical interaction to correspond to a sufficient cause interaction are weaker than when monotonicity cannot be assumed. Similar implications hold for the other 2 conditions in 4.
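As a sketch only, the following shows one way to obtain thresholds of the form log(3 − ci) for the first condition in 4. The particular choice of c1, c2, and c3 below is an assumption made for illustration and may differ from the definitions intended above; it is simply one set of positive quantities for which the stated inequalities are sufficient.

```latex
% Assumed (illustrative) choice of the positive quantities c_1, c_2, c_3:
\[
c_1 = \frac{p_{100} + p_{010}}{p_{110}}, \qquad
c_2 = \frac{p_{100} + p_{010}}{p_{101}}, \qquad
c_3 = \frac{p_{100} + p_{010}}{p_{011}} .
\]
% With (q_1, q_2, q_3) = (p_{110}, p_{101}, p_{011}), the contrast splits into 3 terms:
\[
p_{111} - p_{110} - p_{101} - p_{011} + p_{100} + p_{010}
  = \sum_{i=1}^{3} q_i \left( \frac{1}{3}\,\frac{p_{111}}{q_i} - 1 + \frac{c_i}{3} \right).
\]
% The i-th term is positive whenever p_{111}/q_i > 3 - c_i, and under the
% saturated log-linear model
\[
\frac{p_{111}}{p_{110}} = e^{\beta_3 + \beta_5 + \beta_6 + \beta_7}, \qquad
\frac{p_{111}}{p_{101}} = e^{\beta_2 + \beta_4 + \beta_6 + \beta_7}, \qquad
\frac{p_{111}}{p_{011}} = e^{\beta_1 + \beta_4 + \beta_5 + \beta_7}.
\]
```

Under this choice, β7 > 0 together with β3 + β5 + β6 > log(3 − c1), β2 + β4 + β6 > log(3 − c2), and β1 + β4 + β5 > log(3 − c3) makes each of the 3 terms positive; if some 3 − ci is not positive, the corresponding term is positive automatically.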
As noted previously, if the outcome is sufficiently rare so that the odds ratio closely approximates the risk ratio then these remarks concerning 3-way sufficient cause interactions and statistical interactions in log-linear models apply also to logistic models.