The accuracy of LD Score regression as an estimator of confounding and genetic correlations in genome-wide association studies

James J. Lee, Matt McGue, William G. Iacono, Carson C. Chow

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

To infer that a single-nucleotide polymorphism (SNP) either affects a phenotype or is in linkage disequilibrium with a causal site, we must have some assurance that any SNP-phenotype correlation is not the result of confounding with environmental variables that also affect the trait. In this study, we examine the properties of linkage disequilibrium (LD) Score regression, a recently developed method for using summary statistics from genome-wide association studies to ensure that confounding does not inflate the number of false positives. We do not treat the effects of genetic variation as a random variable and thus are able to obtain results about the unbiasedness of this method. We demonstrate that LD Score regression can produce estimates of confounding at null SNPs that are unbiased or conservative under fairly general conditions. This robustness holds in the case of the parent genotype affecting the offspring phenotype through some environmental mechanism, despite the resulting correlation over SNPs between LD Scores and the degree of confounding. Additionally, we demonstrate that LD Score regression can produce reasonably robust estimates of the genetic correlation, even when its estimates of the genetic covariance and the two univariate heritabilities are substantially biased.

Original language: English (US)
Pages (from-to): 783-795
Number of pages: 13
Journal: Genetic Epidemiology
Volume: 42
Issue number: 8
DOIs: 10.1002/gepi.22161
State: Published - Dec 2018

Fingerprint

Genome-Wide Association Study
Linkage Disequilibrium
Single Nucleotide Polymorphism
Phenotype
Genotype
Genome

Keywords

  • causal inference
  • genetic correlation
  • heritability
  • population stratification
  • quantitative genetics

Cite this

The accuracy of LD Score regression as an estimator of confounding and genetic correlations in genome-wide association studies. / Lee, James J.; McGue, Matt; Iacono, William G.; Chow, Carson C.

In: Genetic Epidemiology, Vol. 42, No. 8, 12.2018, p. 783-795.

@article{9bc1f584fc324b42a0f62fb0d049f387,
title = "The accuracy of LD Score regression as an estimator of confounding and genetic correlations in genome-wide association studies",
abstract = "To infer that a single-nucleotide polymorphism (SNP) either affects a phenotype or is in linkage disequilibrium with a causal site, we must have some assurance that any SNP-phenotype correlation is not the result of confounding with environmental variables that also affect the trait. In this study, we examine the properties of linkage disequilibrium (LD) Score regression, a recently developed method for using summary statistics from genome-wide association studies to ensure that confounding does not inflate the number of false positives. We do not treat the effects of genetic variation as a random variable and thus are able to obtain results about the unbiasedness of this method. We demonstrate that LD Score regression can produce estimates of confounding at null SNPs that are unbiased or conservative under fairly general conditions. This robustness holds in the case of the parent genotype affecting the offspring phenotype through some environmental mechanism, despite the resulting correlation over SNPs between LD Scores and the degree of confounding. Additionally, we demonstrate that LD Score regression can produce reasonably robust estimates of the genetic correlation, even when its estimates of the genetic covariance and the two univariate heritabilities are substantially biased.",
keywords = "causal inference, genetic correlation, heritability, population stratification, quantitative genetics",
author = "Lee, {James J.} and McGue, Matt and Iacono, {William G.} and Chow, {Carson C.}",
year = "2018",
month = "12",
doi = "10.1002/gepi.22161",
language = "English (US)",
volume = "42",
pages = "783--795",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "8",
}

TY - JOUR

T1 - The accuracy of LD Score regression as an estimator of confounding and genetic correlations in genome-wide association studies

AU - Lee, James J.

AU - McGue, Matt

AU - Iacono, William G.

AU - Chow, Carson C.

PY - 2018/12

Y1 - 2018/12

N2 - To infer that a single-nucleotide polymorphism (SNP) either affects a phenotype or is in linkage disequilibrium with a causal site, we must have some assurance that any SNP-phenotype correlation is not the result of confounding with environmental variables that also affect the trait. In this study, we examine the properties of linkage disequilibrium (LD) Score regression, a recently developed method for using summary statistics from genome-wide association studies to ensure that confounding does not inflate the number of false positives. We do not treat the effects of genetic variation as a random variable and thus are able to obtain results about the unbiasedness of this method. We demonstrate that LD Score regression can produce estimates of confounding at null SNPs that are unbiased or conservative under fairly general conditions. This robustness holds in the case of the parent genotype affecting the offspring phenotype through some environmental mechanism, despite the resulting correlation over SNPs between LD Scores and the degree of confounding. Additionally, we demonstrate that LD Score regression can produce reasonably robust estimates of the genetic correlation, even when its estimates of the genetic covariance and the two univariate heritabilities are substantially biased.

AB - To infer that a single-nucleotide polymorphism (SNP) either affects a phenotype or is in linkage disequilibrium with a causal site, we must have some assurance that any SNP-phenotype correlation is not the result of confounding with environmental variables that also affect the trait. In this study, we examine the properties of linkage disequilibrium (LD) Score regression, a recently developed method for using summary statistics from genome-wide association studies to ensure that confounding does not inflate the number of false positives. We do not treat the effects of genetic variation as a random variable and thus are able to obtain results about the unbiasedness of this method. We demonstrate that LD Score regression can produce estimates of confounding at null SNPs that are unbiased or conservative under fairly general conditions. This robustness holds in the case of the parent genotype affecting the offspring phenotype through some environmental mechanism, despite the resulting correlation over SNPs between LD Scores and the degree of confounding. Additionally, we demonstrate that LD Score regression can produce reasonably robust estimates of the genetic correlation, even when its estimates of the genetic covariance and the two univariate heritabilities are substantially biased.

KW - causal inference

KW - genetic correlation

KW - heritability

KW - population stratification

KW - quantitative genetics

UR - http://www.scopus.com/inward/record.url?scp=85053719114&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85053719114&partnerID=8YFLogxK

U2 - 10.1002/gepi.22161

DO - 10.1002/gepi.22161

M3 - Article

C2 - 30251275

AN - SCOPUS:85053719114

VL - 42

SP - 783

EP - 795

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 8

ER -