A rose by any other name may smell as sweet, but this is no license to ignore the importance of precise language in scientific writing. Many of the technical terms used in epidemiology have alternate meanings in common parlance. Statisticians are particularly clever at transforming mundane words into apt metaphors (eg, “bootstrap,” “kernel,” “jackknife”). But multiple meanings can also create confusion. Our use of the word “bias” frequently raises eyebrows among nonepidemiologists. To “confound” means something different in ordinary English than in the context of epidemiology.
The term “interaction” is a minefield of potential misunderstanding. Its use in statistical modeling is thoroughly entrenched and yet distinct from its connotations both in epidemiology and in common English conversation. What statisticians mean is simply a deviation from the specified model form, a deviation that can be accommodated by a product term between 2 or more right-hand-side variables. These variables do not “interact” in any of the meanings suggested by English dictionaries.
It is at the boundary of statistics and epidemiology—that is, in our interpretation of statistical models as causal or mechanistic statements—that this miscommunication is most dangerous. And yet it is causal interpretation that motivates most epidemiologic investigation. As Knol et al point out in this issue of the journal,1 the presentation and discussion of interaction in the medical and epidemiologic literature is woefully inadequate. What can be done to improve this?
Greenland has recently recommended that we abandon the word “interaction” altogether.2 This proposal is not unreasonable (and can be adopted by any sympathetic author). It is not, however, a solution that the editors of Epidemiology are inclined to advocate. Statisticians are unlikely to alter their deep-rooted usage, and the potential for misunderstanding will persist.
We prefer instead that authors clarify their use of that ambiguous word “interaction” by adding a qualifier. Interactions can be “additive” or “multiplicative” or, better yet, described as deviations from additive or multiplicative joint effects. Deviations from additivity are of interest from the perspective of public health or health services, because they identify subpopulations that would most benefit from potential interventions.2 Deviation from multiplicativity is convenient to compute but not as easy to defend substantively. Although multiplicative interaction can have important epidemiologic implications,3 such interaction has little direct bearing on “synergism” in the biologic sense, or on interdependence of latent causal types, at least not without important additional assumptions.4,5 Where a mechanistic interpretation is intended, as in the phrase “gene–environment interaction,” Greenland's more radical suggestion to dump the word entirely is perhaps justified.
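The two scales can be written out explicitly. The following is an illustrative sketch only; the notation is assumed here and is not taken from the cited articles. Let RR_{10} and RR_{01} denote the relative risks for each exposure alone and RR_{11} the relative risk for both together, each relative to the group exposed to neither factor.

```latex
\begin{align*}
\text{deviation from additivity:} \quad
  & RR_{11} - RR_{10} - RR_{01} + 1 \neq 0 \\
\text{deviation from multiplicativity:} \quad
  & \frac{RR_{11}}{RR_{10}\, RR_{01}} \neq 1
\end{align*}
```

The first quantity is often called the relative excess risk due to interaction. Setting both deviations to zero at once gives (RR_{10} - 1)(RR_{01} - 1) = 0, so the joint effect can be exactly additive and exactly multiplicative only when at least one exposure has no effect. This is why an unqualified report of “no interaction” is ambiguous.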
Beyond simply qualifying the type of interaction being discussed, there are ways to improve practice along the lines suggested by Knol et al.1 The somewhat arbitrary choice of model form determines the default scale for joint effects.6 Analysts with limited power and a passion for parsimony will tend toward the joint-effects model suggested by their selected link function rather than by their data.2 Greenland's suggestion to prune our models less zealously would give data a greater role in determining the fit. The modeled values can then be reported more completely, allowing readers to examine joint effects on both common scales.1 At Epidemiology, we encourage authors who report “interaction” to present the individual effects of both exposures and their joint effect, each relative to the group not exposed to either factor. An equally welcome approach would be to report the relevant parameters from a regression model (ie, the individual coefficients for both exposures and their product term).
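A minimal sketch of such a presentation follows, assuming that risks are available for the four exposure groups. The function name, dictionary keys, and input risks are hypothetical and chosen only for illustration. It reports each effect relative to the doubly unexposed group, then the deviation from joint effects on both scales. In a saturated log-linear risk model, the coefficient on the product term equals the logarithm of the multiplicative ratio computed here.

```python
import math


def joint_effects(risk00, risk10, risk01, risk11):
    """Summarize two binary exposures against a single reference group.

    risk00 is the risk in the group exposed to neither factor, risk10 and
    risk01 the risks with one exposure only, and risk11 the risk with both.
    """
    rr10 = risk10 / risk00  # first exposure alone
    rr01 = risk01 / risk00  # second exposure alone
    rr11 = risk11 / risk00  # both exposures together
    ratio = rr11 / (rr10 * rr01)
    return {
        "rr10": rr10,
        "rr01": rr01,
        "rr11": rr11,
        # Zero when the joint effect is exactly additive.
        "additive_deviation": rr11 - rr10 - rr01 + 1,
        # One when the joint effect is exactly multiplicative.
        "multiplicative_ratio": ratio,
        # Product-term coefficient in a saturated log-linear risk model.
        "log_product_term": math.log(ratio),
    }


# Hypothetical risks, not taken from any study.
result = joint_effects(risk00=0.01, risk10=0.02, risk01=0.03, risk11=0.06)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these made-up inputs the joint effect is multiplicative (relative risks of about 2, 3, and 6) and yet exceeds additivity. A reader shown only one scale would miss half the picture.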
The most egregious error described by Knol et al is the declaration of effect heterogeneity when the estimated effect is “statistically significant” in one subgroup but not “statistically significant” in another. Although Knol et al found few articles in the selected journals that made this inference overtly, this illogic is not uncommon in the wider literature.7 Our long-standing policy of discouraging significance tests helps avoid this error (among others).8 Nonetheless, we began to allow P values for interaction on the basis of several arguments, one being that there is no relevant parameter of interest to report.9
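The flaw in that inference can be shown with a short sketch. It assumes two independent subgroups, normal approximations, and estimates on the log scale; the function names and all numbers are hypothetical. The interval for one subgroup excludes the null and the other does not, yet the interval for the difference between them comfortably includes zero.

```python
import math


def interval(estimate, se, z_crit=1.96):
    """Approximate 95% interval for an estimate with standard error se."""
    return (estimate - z_crit * se, estimate + z_crit * se)


def subgroup_difference(est_a, se_a, est_b, se_b):
    """Difference between two independent subgroup estimates and its SE."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se_diff


# Hypothetical log relative risks and standard errors in two subgroups.
est_a, se_a = 0.50, 0.20
est_b, se_b = 0.20, 0.20

ci_a = interval(est_a, se_a)  # lies entirely above zero
ci_b = interval(est_b, se_b)  # spans zero

diff, se_diff = subgroup_difference(est_a, se_a, est_b, se_b)
ci_diff = interval(diff, se_diff)  # spans zero

print("subgroup A:", ci_a)
print("subgroup B:", ci_b)
print("difference:", diff, ci_diff)
```

The quantity relevant to heterogeneity is the difference and its interval, not the pair of subgroup verdicts.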
Recent articles on interaction now lead us to conclude that we can do better. VanderWeele4 has demonstrated that the coefficient on the product term is indeed a parameter of interest when considering joint effects. Its sign and magnitude can have important implications beyond its (statistical) “significance.” Authors can expect to see evidence of our evolving perspective on these issues in their (editorial) interactions with the journal.
1. Knol MJ, Egger M, Scott P, et al. When one depends on the other: reporting of interaction in case-control and cohort studies. Epidemiology
2. Greenland S. Interactions in epidemiology: relevance, identification, and estimation. Epidemiology
3. Thompson WD. Effect modification and the limits of biological inference from epidemiologic data. J Clin Epidemiol
4. VanderWeele TJ. Sufficient cause interactions and statistical interactions. Epidemiology
5. VanderWeele TJ, Robins JM. The identification of synergism in the sufficient-component-cause framework. Epidemiology
6. Vandenbroucke JP. Should we abandon statistical modeling altogether? Am J Epidemiol
7. Gelman A, Stern H. The difference between “significant” and “not significant” is not itself statistically significant. Amer Statistician
8. Lang JM, Rothman KJ, Cann CI. That confounded P-value. Epidemiology
9. Weinberg CR. It's time to rehabilitate the P-value. Epidemiology