It is of great scientific interest to identify interactions between genetic variants and environmental exposures that may modify the risk of complex diseases. However, detecting gene-by-environment interactions (G × E) usually requires larger sample sizes than detecting genetic main association effects. To boost statistical power and improve understanding of the underlying molecular mechanisms, we incorporate functional genomics information, specifically expression quantitative trait loci (eQTLs), into a data-adaptive G × E test called aGEw. This test adaptively chooses the best eQTL weights from multiple tissues and provides an extra layer of weighting at the genetic variant level. Extensive simulations show that the aGEw test can control the Type 1 error rate and that its power is resilient to the inclusion of neutral variants and noninformative external weights. We applied the proposed aGEw test to data from the Pancreatic Cancer Case–Control Consortium (discovery cohort of 3,585 cases and 3,482 controls) and the PanScan II genome-wide association study (replication cohort of 2,021 cases and 2,105 controls), with smoking as the exposure of interest. Two novel putative smoking-related pancreatic cancer susceptibility genes, TRIP10 and KDM3A, were identified. The aGEw test is implemented in the R package aGE.
Bibliographical note
Funding Information:
We thank the two anonymous reviewers for their constructive comments. This study was supported by the National Institutes of Health (NIH) grant R01CA169122; P. W. was supported by NIH grants R01HL116720 and R21HL126032. S. H. O. was supported by NIH grant P30CA008748. R. E. N. and the Queensland Pancreatic Cancer Study were funded by the Australian National Health and Medical Research Council. The PanC4 genome-wide association study was supported by NIH grant R01CA154823. The authors thank Dr. Alison Klein for help with harmonization of the exposure variable, and Ms. Jessica Swann and the National Institute of Statistical Sciences writing workshop for editorial assistance and suggestions. The authors acknowledge the Texas Advanced Computing Center at The University of Texas at Austin for providing computing resources. The authors alone are responsible for the views expressed in this article and they do not necessarily represent the views, decisions or policies of the institutions with which they are affiliated.
- data-adaptive association testing
- gene-by-environment interaction
- multiple functional weights