Abstract
Genomic prediction has the potential to contribute to precision medicine. However, to date, the utility of such predictors is limited by low accuracy for most traits. Here, theory and simulation studies are used to demonstrate that widespread pleiotropy among phenotypes can be utilised to improve genomic risk prediction. We show how a genetic predictor can be created as a weighted index that combines published genome-wide association study (GWAS) summary statistics across many different traits. We apply this framework to predict risk of schizophrenia and bipolar disorder in the Psychiatric Genomics Consortium data, finding that the gains in prediction accuracy vary substantially across cohorts. For six additional phenotypes in the UK Biobank data, a multi-trait predictor that combines published summary statistics from multiple traits increases prediction accuracy over a single-trait predictor by between 0.7% (height) and 47% (type 2 diabetes).
Original language | English (US) |
---|---|
Article number | 989 |
Journal | Nature Communications |
Volume | 9 |
Issue number | 1 |
State | Published - Dec 1 2018 |
Bibliographical note
Funding Information: The University of Queensland group is supported by the Australian Research Council (Discovery Project 160103860 and 160102400), the Australian National Health and Medical Research Council (NHMRC grants 1087889, 1080157, 1048853, 1050218, 1078901, and 1078037) and the National Institutes of Health (NIH grants R21ESO25052-01 and PO1GMO99568). J.Y. is supported by a Charles and Sylvia Viertel Senior Medical Research Fellowship. M.R.R. is supported by the University of Lausanne. We thank all the participants and researchers of the many cohort studies that make this work possible, as well as our colleagues within The University of Queensland’s Program for Complex Trait Genomics and the Queensland Brain Institute IT team for comments, suggestions and technical support. The UK Biobank research was conducted using the UK Biobank Resource under project 12514. Statistical analyses of PGC data were carried out on the Genetic Cluster Computer (http://www.geneticcluster.org) hosted by SURFsara and financially supported by the Netherlands Scientific Organization (NWO 480-05-003) along with a supplement from the Dutch Brain Foundation and the VU University Amsterdam. Numerous (>100) grants from government agencies along with substantial private and foundation support worldwide enabled the collection of phenotype and genotype data, without which this research would not be possible; grant numbers are listed in primary PGC publications. This study makes use of data from dbGaP (Accession Numbers: phs000090.v3.p1, phs000674.v2.p2, phs000021.v2.p1, phs000167.v1.p1 and phs000017.v3.p1). A full list of acknowledgements to these data sets can be found in Supplementary Note 1.
Publisher Copyright:
© 2018 The Author(s).