Quantifying the extent to which index event biases influence large genetic association studies

Hanieh Yaghootkar, Michael P. Bancks, Sam E. Jones, Aaron McDaid, Robin Beaumont, Louise Donnelly, Andrew R. Wood, Archie Campbell, Jessica Tyrrell, Lynne J. Hocking, Marcus A. Tuke, Katherine S. Ruth, Ewan R. Pearson, Anna Murray, Rachel M. Freathy, Patricia B. Munroe, Caroline Hayward, Colin Palmer, Michael N. Weedon, James S. Pankow, Timothy M. Frayling, Zoltan Kutalik

Research output: Contribution to journal, Article, peer-review

30 Scopus citations


Studies attempting to functionally interpret complex-disease susceptibility loci by GWAS and eQTL integration have predominantly employed microarrays to quantify gene-expression. RNA-Seq has the potential to discover a more comprehensive set of eQTLs and illuminate the underlying molecular consequence. We examine the functional outcome of 39 variants associated with Systemic Lupus Erythematosus (SLE) through the integration of GWAS and eQTL data from the TwinsUK microarray and RNA-Seq cohort in lymphoblastoid cell lines. We use conditional analysis and a Bayesian colocalisation method to provide evidence of a shared causal-variant, then compare the ability of each quantification type to detect disease relevant eQTLs and eGenes. We discovered the greatest frequency of candidate-causal eQTLs using exon-level RNA-Seq, and identified novel SLE susceptibility genes (e.g. NADSYN1 and TCF7) that were concealed using microarrays, including four non-coding RNAs. Many of these eQTLs were found to influence the expression of several genes, supporting the notion that risk haplotypes may harbour multiple functional effects. Novel SLE associated splicing events were identified in the T-reg restricted transcription factor, IKZF2, and other candidate genes (e.g. WDFY4) through asQTL mapping using the Geuvadis cohort. We have significantly increased our understanding of the genetic control of gene-expression in SLE by maximising the leverage of RNA-Seq and performing integrative GWAS-eQTL analysis against gene, exon, and splice-junction quantifications. We conclude that to better understand the true functional consequence of regulatory variants, quantification by RNA-Seq should be performed at the exon-level as a minimum, and run in parallel with gene and splice-junction level quantification.

Original language: English (US)
Pages (from-to): 1003-1017
Number of pages: 15
Journal: Human Molecular Genetics
Issue number: 5
State: Published - Mar 1 2017

Bibliographical note

Funding Information:
This research has been conducted using the UK Biobank Resource. The authors thank University of Exeter Medical School. EXTEND data were provided by the Peninsula Research Bank, part of the NIHR Exeter Clinical Research Facility. P.B.M. wishes to acknowledge support from the NIHR Cardiovascular Biomedical Research Unit at Barts and The London, Queen Mary University of London, UK. We are grateful to all the participants who took part in the GS:SFHS study, to the general practitioners, to the Scottish School of Primary Care for their help in recruiting the participants, and to the whole team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses. The Wellcome Trust provides support for Wellcome Trust United Kingdom Type 2 Diabetes Case Control Collection (GoDARTS) and informatics support is provided by the Chief Scientist Office. The Atherosclerosis Risk in Communities Study (ARIC) is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts (HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C and HHSN268201100012C), R01HL087641, R01HL59367 and R01HL086694; National Human Genome Research Institute contract U01HG004402; and National Institutes of Health contract HHSN268200625226C. The authors thank the staff and participants of the ARIC study for their important contributions. Infrastructure was partly supported by Grant Number UL1RR025005, a component of the National Institutes of Health and NIH Roadmap for Medical Research. H.Y., A.R.W. and T.M.F. are supported by the European Research Council grant: 323195; SZ-245 50371-GLUCOSEGENES-FP7-IDEAS-ERC. S.E.J. is funded by the Medical Research Council (grant: MR/M005070/1). M.A.T., M.N.W. and A.M. are supported by the Wellcome Trust Institutional Strategic Support Award (WT097835MF). R.M.F. is a Sir Henry Dale Fellow (Wellcome Trust and Royal Society grant: 104150/Z/14/Z). R.B. is funded by the Wellcome Trust and Royal Society grant: 104150/Z/14/Z. J.T. is funded by a Diabetes Research and Wellness Foundation Fellowship. Z.K. received financial support from the Leenaards Foundation, the Swiss Institute of Bioinformatics and the Swiss National Science Foundation (31003A-143914) and SystemsX.ch (39). The work of M.P.B. was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under Award no. T32HL007779. Generation Scotland received core support from the Chief Scientist Office of the Scottish Government Health Directorates [CZD/16/6] and the Scottish Funding Council [HR03006]. E.R.P. holds a WT New investigator award 102820/Z/13/Z. We would like to thank George Davey Smith, Mark McCarthy and Joel Hirschhorn for helpful comments on the manuscript.

Publisher Copyright:
© The Author 2017. Published by Oxford University Press. All rights reserved.

