To the Editor:
Mendelian randomization (MR) is an application of instrumental variable (IV) analysis, in which genetic variants robustly related to an exposure of interest are used to infer causal associations free from confounding and reverse causality.1,2 Multivariable MR is a method that can be used to estimate the effect of two or more exposures on an outcome.3,4
It is often of interest to explore whether the effect of one exposure depends on another exposure, i.e., interaction.5 A recent study used genetic IVs to apply a factorial design akin to a factorial randomized controlled trial (RCT) in which participants are randomly allocated to one of four treatment regimens (treatment 1, treatment 2, both treatments, or neither treatment). Genetic risk scores for each exposure were split at the median, and the resultant four groups compared.6
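As a rough illustration of the factorial design described above, the sketch below dichotomizes two genetic risk scores at their medians and computes a difference-in-differences contrast of mean outcomes across the four resulting groups. The function name `factorial_mr_contrast` and the choice of contrast are our own assumptions for illustration, not the analysis code of the cited study.

```python
import numpy as np

def factorial_mr_contrast(z1, z2, y):
    """Hypothetical helper: 2 x 2 factorial contrast on dichotomized scores.

    z1, z2 : genetic risk scores for exposures 1 and 2
    y      : continuous outcome
    Returns (mean_11 - mean_10) - (mean_01 - mean_00), an additive
    interaction contrast on the scale of the dichotomized scores.
    """
    z1, z2, y = map(np.asarray, (z1, z2, y))
    high1 = z1 > np.median(z1)  # above-median score for exposure 1
    high2 = z2 > np.median(z2)  # above-median score for exposure 2
    means = {
        (a, b): y[(high1 == a) & (high2 == b)].mean()
        for a in (False, True)
        for b in (False, True)
    }
    return (means[True, True] - means[True, False]) - (
        means[False, True] - means[False, False]
    )
```

Note that this contrast is expressed per group of the dichotomized scores rather than per unit of the exposures, so it is not directly comparable in magnitude to an interaction coefficient estimated on the exposure scale.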
Using simulation, we test an extension to multivariable MR that uses two-stage least squares to estimate the additive interaction between two continuous exposures on a continuous outcome, including scenarios where one exposure has a causal effect on the other (mediation). The instrument, Z, comprises the genetic risk scores for each exposure (Z1 and Z2) and their product, i.e., Z = (Z1, Z2, Z1Z2). When the first exposure has a causal effect on the second exposure, i.e., exposure 2 mediates the effect of exposure 1 on the outcome, the instrument is Z = (Z1, Z2, Z1Z2, Z1Z1). The interaction parameters were set to one third of the main-effect parameters to limit the variance explained by the interaction terms. We compare the performance of the two-stage least squares estimator with that of a factorial MR design, in which the genetic risk score for each exposure is dichotomized to create four groups. Full methodological details are in the eAppendix; http://links.lww.com/EDE/B583.
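A minimal sketch of the two-stage least squares approach with Z = (Z1, Z2, Z1Z2) is given below, assuming a simple data-generating model of our own choosing. The function names, effect sizes, instrument strength, and confounding structure are illustrative assumptions and do not reproduce the simulation settings in the eAppendix; the only feature taken from the text is that the interaction parameter is one third of the main-effect parameters.

```python
import numpy as np

def simulate(n, beta1=0.3, beta2=0.3, beta12=0.1, seed=0):
    """Hypothetical data-generating model (all parameter values assumed)."""
    rng = np.random.default_rng(seed)
    z1 = rng.normal(size=n)  # genetic risk score for exposure 1
    z2 = rng.normal(size=n)  # genetic risk score for exposure 2
    u = rng.normal(size=n)   # unmeasured confounder
    x1 = 0.5 * z1 + u + rng.normal(size=n)
    x2 = 0.5 * z2 + u + rng.normal(size=n)
    y = beta1 * x1 + beta2 * x2 + beta12 * x1 * x2 + u + rng.normal(size=n)
    return z1, z2, x1, x2, y

def two_stage_least_squares(y, endog, instruments):
    """Point estimates only: [intercept, one coefficient per endog column]."""
    n = len(y)
    Z = np.column_stack([np.ones(n), instruments])
    X = np.column_stack([np.ones(n), endog])
    # First stage: regress every column of X on the instruments.
    first_stage = np.linalg.lstsq(Z, X, rcond=None)[0]
    X_hat = Z @ first_stage
    # Second stage: regress the outcome on the fitted exposures.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

z1, z2, x1, x2, y = simulate(200_000, seed=1)
endog = np.column_stack([x1, x2, x1 * x2])        # X1, X2, and their product
instruments = np.column_stack([z1, z2, z1 * z2])  # Z = (Z1, Z2, Z1Z2)
coef = two_stage_least_squares(y, endog, instruments)
print("main effects:", coef[1], coef[2], "interaction:", coef[3])
```

This sketch returns point estimates only; valid standard errors require the usual two-stage least squares variance formula rather than the naive second-stage residuals. For the mediation scenario, the Z1Z1 term would be appended as a further column of `instruments`.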
Our simulations demonstrate that factorial MR has very low statistical power, 5%–7% at N = 50,000 and 8%–23% at N = 500,000 across the range of parameters tested (Figure). The two-stage least squares estimators had higher power to detect interactions than factorial MR and lower type I error. For N = 500,000, the two-stage least squares estimator had power ranging from 29.7% to 92.9% and type I error ranging from 4% to 6% (Figure). In comparison, power at N = 50,000 was 7%–55%.
Ordinary least squares estimation demonstrated considerable bias in estimating the interaction term. The two-stage least squares estimates were markedly less biased, but some bias remained for some parameter combinations at N = 50,000 and N = 100,000 for both Z = (Z1, Z2, Z1Z2) and Z = (Z1, Z2, Z1Z2, Z1Z1). At N = 50,000, two-stage least squares using Z = (Z1, Z2, Z1Z2, Z1Z1) produced coefficient estimates that were generally closer to the true value and had a smaller standard error and mean standard error compared with using Z = (Z1, Z2, Z1Z2). At N = 500,000, two-stage least squares estimates were very close to the true parameter values, with minimal differences between Z = (Z1, Z2, Z1Z2, Z1Z1) and Z = (Z1, Z2, Z1Z2). Full analyses of bias, coverage, power, and type I error are presented in the eAppendix; http://links.lww.com/EDE/B583, as is an illustrative example.
Factorial MR has very low power to detect interactions.7 In contrast, an extension to multivariable MR3,4 using genetic risk scores for each exposure and their product as the instrument in two-stage least squares had greater power, was generally unbiased, and had reasonable coverage and type I error, but required large sample sizes and strong IVs. Factorial MR has an inherent simplicity, which may be attractive to clinical or nonstatistical audiences, but our results suggest that in most cases the two-stage least squares approach will be preferable. Our simulations were limited to a continuous outcome and interactions on the additive scale; further work is required to test and develop approaches for binary outcomes and multiplicative interactions. Our approach uses individual participant data; assessing interactions within two-sample MR would be challenging.
The authors would like to thank Dr Rhian Daniel for helpful comments during the earlier stages of the analysis. All authors contributed to study design, analysis, and interpretation of results, and critically revised the manuscript. T.-L.N. performed the majority of the analysis and wrote the first draft of the manuscript.
Neil M. Davies
Alice R. Carter
Laura D. Howe
MRC Integrative Epidemiology Unit, Population Health Sciences, University of Bristol, Bristol, United Kingdom
1. Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27:1133–1163.
2. Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 2006;17:360–372.
3. Sanderson E, Davey Smith G, Windmeijer F, Bowden J. An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int J Epidemiol. 2018;48:713–727.
4. Burgess S, Thompson SG. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol. 2015;181:251–260.
5. VanderWeele TJ. A unification of mediation and interaction: a 4-way decomposition. Epidemiology. 2014;25:749–761.
6. Ference BA, Majeed F, Penumetcha R, Flack JM, Brook RD. Effect of naturally random allocation to lower low-density lipoprotein cholesterol on the risk of coronary heart disease mediated by polymorphisms in NPC1L1, HMGCR, or both: a 2 × 2 factorial Mendelian randomization study. J Am Coll Cardiol. 2015;65:1552–1561.
7. Rees JMB, Foley CN, Burgess S. Factorial Mendelian randomization: using genetic variants to assess interactions. Int J Epidemiol. dyz161. https://doi.org/10.1093/ije/dyz161 [epub ahead of print].