Sample Size Calculations for Additive Interactions

Lee, Wen-Chung

doi: 10.1097/EDE.0b013e31829ef812
Author Information

Research Center for Genes, Environment and Human Health, Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan.

This article is partly supported by grants from the National Science Council, Taiwan.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article. This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.

To the Editor:

There has been growing interest in assessing interactions on the additive scale using the index of “relative excess risk due to interaction” (RERI).1 For two dichotomous factors, RERI is defined as RR11 - RR10 - RR01 + 1, where RRij is the risk ratio comparing those with 1st factor = i and 2nd factor = j with those for whom both factors = 0. RERI is related to the potential outcome and the sufficient cause models2: RERI = 0 implies no interaction on the additive scale; RERI > 1 implies “sufficient cause interaction” (there are persons for whom the outcome would occur if both factors are present but not if just one factor is present); and RERI > 2 implies “singular interaction” (there are persons for whom the outcome would occur if and only if both factors are present).
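As a minimal illustration of the definition (written in Python rather than SAS), the index and the three thresholds can be sketched as follows. The function names `reri` and `interpret` are hypothetical, and RR10 and RR01 denote the risk ratios for exposure to the 1st factor only and the 2nd factor only. The interpretation applies to the true RERI, not to a noisy sample estimate.

```python
def reri(rr10, rr01, rr11):
    """Relative excess risk due to interaction for two dichotomous factors."""
    return rr11 - rr10 - rr01 + 1.0


def interpret(true_reri):
    """Map a true (population) RERI to the strongest conclusion it supports."""
    if true_reri > 2.0:
        return "singular interaction"
    if true_reri > 1.0:
        return "sufficient cause interaction"
    if true_reri != 0.0:
        return "interaction on the additive scale"
    return "no interaction on the additive scale"


# Example: RR10 = 2, RR01 = 3, RR11 = 7 gives RERI = 7 - 2 - 3 + 1 = 3.
print(reri(2.0, 3.0, 7.0))             # 3.0
print(interpret(reri(2.0, 3.0, 7.0)))  # singular interaction
```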

Relying on standard asymptotic (large sample) theory, VanderWeele3 recently presented a spreadsheet program to calculate sample sizes for additive interactions. Still, the sampling distribution of RERI is difficult to approximate with a small-to-moderate sample size.4–8 For better accuracy, I propose brute-force Monte-Carlo simulations for sample size calculations.

SAS code is presented in the eAppendix. For cohort studies, one can specify a total of ten parameters (three parameters related to the study population, four parameters related to risks, and three parameters related to the hypothesis test to be conducted). The three population parameters are the prevalence of the 1st factor, the prevalence of the 2nd factor, and the odds ratio between the two factors (1.0 if the two factors are independent of each other). The four risk parameters are: (1) the background disease risk (the disease risk for those exposed to neither factor), (2) the risk ratio comparing those exposed to the 1st but not the 2nd factor with those exposed to neither, (3) the risk ratio comparing those exposed to the 2nd but not the 1st factor with those exposed to neither, and (4) the risk ratio comparing those exposed to both factors with those exposed to neither. The three test parameters are the level of significance (α level), the target power, and the threshold of the test (zero for additive interaction, one for sufficient cause interaction, and two for singular interaction).
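To make the role of the population and risk parameters concrete, the following Python sketch (not the SAS code of the eAppendix) turns the two prevalences and the exposure odds ratio into four joint exposure probabilities, builds the four cell risks, and simulates one cohort. All function names are hypothetical, and the plug-in RERI estimate from observed cell risks is an assumption made for illustration; the eAppendix program may estimate and test RERI differently.

```python
import math
import random


def joint_exposure(p1, p2, odds_ratio=1.0):
    """Cell probabilities {(i, j): P(1st factor = i, 2nd factor = j)}."""
    if odds_ratio == 1.0:
        p11 = p1 * p2  # independent factors
    else:
        # Solve OR = p11 * p00 / (p10 * p01) for p11 (a quadratic in p11).
        k = odds_ratio - 1.0
        s = 1.0 + (p1 + p2) * k
        p11 = (s - math.sqrt(s * s - 4.0 * odds_ratio * k * p1 * p2)) / (2.0 * k)
    return {(0, 0): 1.0 - p1 - p2 + p11, (1, 0): p1 - p11,
            (0, 1): p2 - p11, (1, 1): p11}


def cell_risks(background_risk, rr10, rr01, rr11):
    """Disease risk in each exposure cell from the four risk parameters."""
    return {(0, 0): background_risk, (1, 0): background_risk * rr10,
            (0, 1): background_risk * rr01, (1, 1): background_risk * rr11}


def simulate_cohort_reri(n, cells, risks, rng):
    """Simulate one cohort of n subjects and return the plug-in RERI estimate.

    Assumes every cell ends up non-empty and the unexposed cell has cases;
    a fuller version would guard against division by zero.
    """
    keys = list(cells)
    total = dict.fromkeys(keys, 0)
    cases = dict.fromkeys(keys, 0)
    for cell in rng.choices(keys, weights=[cells[k] for k in keys], k=n):
        total[cell] += 1
        if rng.random() < risks[cell]:
            cases[cell] += 1
    risk_hat = {k: cases[k] / total[k] for k in keys}
    return (risk_hat[(1, 1)] - risk_hat[(1, 0)] - risk_hat[(0, 1)]) / risk_hat[(0, 0)] + 1.0


# Illustrative values only: prevalences 0.3 and 0.4, independent factors,
# background risk 0.1, risk ratios 2, 3 and 7 (true RERI = 3).
cells = joint_exposure(0.3, 0.4, 1.0)
risks = cell_risks(0.1, 2.0, 3.0, 7.0)
estimate = simulate_cohort_reri(5000, cells, risks, random.Random(2013))
```

Repeating the last step many times and recording how often the test rejects its threshold gives the simulated power for that sample size.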

For case-control studies, the rare disease assumption is invoked. One can again specify a total of ten parameters. The three population parameters and the three test parameters are the same as those for cohort studies. There is no background disease risk; one specifies the control-to-case matching ratio instead. The other three risk parameters are now expressed as odds ratios.
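The case-control setup can be sketched in Python as follows. Under the rare disease assumption, the exposure distribution among controls approximates that of the source population, and the distribution among cases is proportional to each cell probability times its odds ratio. The function names are hypothetical, and treating the matching ratio simply as the number of controls per case (without individual matching) is an assumption of this sketch.

```python
def case_exposure_distribution(control_cells, or10, or01, or11):
    """Exposure distribution among cases under the rare disease assumption."""
    odds_ratios = {(0, 0): 1.0, (1, 0): or10, (0, 1): or01, (1, 1): or11}
    weights = {k: control_cells[k] * odds_ratios[k] for k in control_cells}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def expected_counts(n_cases, controls_per_case, control_cells, case_cells):
    """Expected cell counts for cases and controls in one study."""
    n_controls = n_cases * controls_per_case
    cases = {k: n_cases * p for k, p in case_cells.items()}
    controls = {k: n_controls * p for k, p in control_cells.items()}
    return cases, controls


def odds_ratio(cases, controls, cell):
    """Odds ratio for one exposure cell against the doubly unexposed cell."""
    ref = (0, 0)
    return (cases[cell] * controls[ref]) / (cases[ref] * controls[cell])


# Illustrative values only: independent factors with prevalences 0.3 and 0.4,
# odds ratios 2, 3 and 7, and two controls per case.
control_cells = {(0, 0): 0.42, (1, 0): 0.18, (0, 1): 0.28, (1, 1): 0.12}
case_cells = case_exposure_distribution(control_cells, 2.0, 3.0, 7.0)
cases, controls = expected_counts(500, 2, control_cells, case_cells)
reri_from_ors = (odds_ratio(cases, controls, (1, 1))
                 - odds_ratio(cases, controls, (1, 0))
                 - odds_ratio(cases, controls, (0, 1)) + 1.0)
```

With expected counts the odds ratios are recovered exactly, so `reri_from_ors` equals 7 - 2 - 3 + 1 up to rounding; in a simulation, the counts would be drawn at random from these two distributions instead.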

As an initial value, the program first calculates a sample size (N0) using the asymptotic method.3 The program then performs a bisection search between 0 and 2N0 until the simulated power is within 0.005 of the target power. Each candidate sample size is evaluated with 100,000 simulations. The programs can generate a sample size in less than 1 minute on an ordinary personal computer.
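The search step can be sketched in Python as below. The name `power_at` is a hypothetical callable that would wrap the Monte-Carlo simulation for a given sample size; here it is replaced by `toy_power`, a closed-form power curve for a one-sided z test that serves only as a stand-in so the sketch runs quickly. The sketch also assumes that power increases with sample size and that 2N0 is large enough to exceed the target.

```python
import math
from statistics import NormalDist


def mc_standard_error(power, reps):
    """Standard error of a simulated power based on `reps` replications."""
    return math.sqrt(power * (1.0 - power) / reps)


def bisect_sample_size(power_at, n0, target=0.80, tol=0.005):
    """Bisection between 0 and 2*n0 until power is within tol of the target."""
    lo, hi = 0, 2 * n0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        power = power_at(mid)
        if abs(power - target) <= tol:
            return mid, power
        if power < target:
            lo = mid   # too little power: search larger sample sizes
        else:
            hi = mid   # too much power: search smaller sample sizes
    # Interval collapsed without meeting the tolerance: return the upper end.
    return hi, power_at(hi)


def toy_power(n, effect=0.1, alpha=0.05):
    """Stand-in power curve (one-sided z test); NOT the RERI simulation."""
    z = NormalDist()
    return z.cdf(effect * math.sqrt(n) - z.inv_cdf(1.0 - alpha))


# n0 = 700 plays the role of the asymptotic starting value N0.
n, power = bisect_sample_size(toy_power, n0=700)
```

With 100,000 replications, `mc_standard_error(0.8, 100000)` is roughly 0.0013, so a tolerance of 0.005 is about four standard errors wide, which keeps the search from chasing simulation noise.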

The Table presents the number of subjects required in a cohort study and the number of diseased subjects required in a case-control study to achieve 80% power at α = 0.05. As can be seen, the asymptotic method3 consistently overestimates the required sample sizes, by up to 40%. Shown in parentheses in the Table are the empirical powers (from Monte-Carlo simulation) for the sample sizes calculated using the asymptotic method.3 These values overshoot the target power of 80% considerably, sometimes reaching 95%.

Wen-Chung Lee

Research Center for Genes, Environment and Human Health

Institute of Epidemiology and Preventive Medicine

College of Public Health

National Taiwan University

Taipei, Taiwan

1. Rothman KJ. Modern Epidemiology. 1st ed. Boston, MA: Little, Brown & Co; 1986
2. VanderWeele TJ, Knol MJ. Remarks on antagonism. Am J Epidemiol. 2011;173:1140–1147
3. VanderWeele TJ. Sample size and power calculation for additive interactions. Epidemiol Meth. 2012;1:8
4. Zou GY. On the estimation of additive interaction by use of the four-by-two table and beyond. Am J Epidemiol. 2008;168:212–224
5. Richardson DB, Kaufman JS. Estimation of the relative excess risk due to interaction and associated confidence bounds. Am J Epidemiol. 2009;169:756–760
6. Nie L, Chu H, Li F, Cole SR. Relative excess risk due to interaction: resampling-based confidence intervals. Epidemiology. 2010;21:552–556
7. Chu H, Nie L, Cole SR. Estimating the relative excess risk due to interaction: a Bayesian approach. Epidemiology. 2011;22:242–248
8. VanderWeele TJ, Vansteelandt S. A weighting approach to causal effects and additive interaction in case-control studies: marginal structural linear odds models. Am J Epidemiol. 2011;174:1197–1203
© 2013 by Lippincott Williams & Wilkins, Inc