Robust Discovery of Genetic Associations Incorporating Gene-Environment Interaction and Independence

Tchetgen Tchetgen, Eric*†

Genetics: Original Article

This article considers the detection and evaluation of genetic effects incorporating gene-environment interaction and independence. Whereas ordinary logistic regression cannot exploit the assumption of gene-environment independence, the proposed approach makes explicit use of the independence assumption to improve estimation efficiency. This method, which uses both cases and controls, fits a constrained retrospective regression in which the genetic variant plays the role of the response variable, and the disease indicator and the environmental exposure are the independent variables. The regression model constrains the association of the environmental exposure with the genetic variant among the controls to be null, thus explicitly encoding the gene-environment independence assumption, which yields a substantial gain in accuracy in the evaluation of genetic effects. The proposed retrospective regression approach has several advantages. It is easy to implement with standard software, and it readily accounts for multiple polytomous or continuous environmental exposures, while easily incorporating extraneous covariates. Unlike the profile likelihood approach of Chatterjee and Carroll (Biometrika. 2005;92:399–418), the proposed method does not require a model for the association of a polytomous or continuous exposure with the disease outcome; it is therefore agnostic to the functional form of such a model and completely robust to its possible misspecification.
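The constrained retrospective regression described above can be illustrated with a minimal sketch. It is not the author's code. It assumes a binary variant G (carrier vs. non-carrier), a single continuous exposure E, and simulated data; every function name and parameter value below is hypothetical. The constraint is imposed by fitting logit P(G = 1 | D, E) = a0 + a_d·D + a_de·D·E with no main-effect term for E, so that E is unrelated to G among controls (D = 0). Under a logistic disease model, a_d and a_de would then correspond to the genetic main effect and the gene-environment interaction on the log odds ratio scale.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(A, b):
    """Solve a small linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_logistic(X, y, iters=25):
    """Plain Newton-Raphson maximum likelihood for logistic regression."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * x for b, x in zip(beta, xi)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def simulate(n_cases, n_controls, p_g, beta_g, beta_ge, rng):
    """Hypothetical case-control data with G independent of E among controls."""
    base = math.log(p_g / (1.0 - p_g))
    rows = []
    for d, n in ((0, n_controls), (1, n_cases)):
        for _ in range(n):
            # The exposure distribution may differ by disease status in any
            # way; the retrospective model does not need to describe it.
            e = rng.gauss(0.5 * d, 1.0)
            g = 1 if rng.random() < sigmoid(base + d * (beta_g + beta_ge * e)) else 0
            rows.append((g, d, e))
    return rows


rng = random.Random(2011)  # fixed seed, illustrative values only
data = simulate(4000, 4000, p_g=0.3, beta_g=0.4, beta_ge=0.5, rng=rng)

# Response: G. Covariates: intercept, D, and D*E. Omitting the E main effect
# is what encodes gene-environment independence among controls.
X = [[1.0, float(d), d * e] for g, d, e in data]
y = [g for g, d, e in data]
a0, a_d, a_de = fit_logistic(X, y)
print("intercept:", a0, "genetic main effect:", a_d, "G x E interaction:", a_de)
```

Because only the distribution of G given D and E is modeled, the sketch never specifies how E relates to disease, which mirrors the robustness property claimed above. In practice the same constrained model could be fit with any standard logistic regression routine by leaving the exposure main effect out of the design matrix.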

From the Departments of *Epidemiology and †Biostatistics, Harvard University, Boston, MA.

Submitted 6 May 2010; accepted 2 November 2010; posted 12 January 2011.

Correspondence: Eric J. Tchetgen Tchetgen, Department of Epidemiology, Harvard School of Public Health, 677 Huntington Ave, Boston, MA 02115.

