Estimating SNP heritability in presence of population substructure in biobank-scale datasets

Zhaotong Lin, Souvik Seal, Saonli Basu

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Single nucleotide polymorphism heritability of a trait is measured as the proportion of total variance explained by the additive effects of genome-wide single nucleotide polymorphisms. Linear mixed models are routinely used to estimate single nucleotide polymorphism heritability for many complex traits, which requires estimating a genetic relationship matrix among individuals. Heritability is usually estimated by restricted maximum likelihood or by method of moments approaches such as Haseman–Elston regression. The common practice for accounting for population substructure is to adjust for the top few principal components of the genetic relationship matrix as covariates in the linear mixed model. This can become computationally very intensive on large biobank-scale datasets. Here, we propose a method of moments approach for estimating single nucleotide polymorphism heritability in the presence of population substructure. Our proposed method is computationally scalable on biobank datasets and gives an asymptotically unbiased estimate of heritability in the presence of discrete substructures. It introduces the adjustments for population stratification in a second-order estimating equation. It allows these substructures to differ in their single nucleotide polymorphism allele frequencies and in their trait distributions (means and variances), while the heritability is assumed to be the same across substructures. Through extensive simulation studies and an application to 7 quantitative traits in the UK Biobank cohort, we demonstrate that our proposed method performs well in the presence of population substructure and is much more computationally efficient than existing approaches.
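For readers unfamiliar with the method of moments baseline named in the abstract, the following is a minimal sketch of standard Haseman–Elston regression on simulated data. It is not the authors' proposed estimator and makes no adjustment for population substructure. The function names (`simulate_standardized_genotypes`, `he_regression`), the simulation settings, and the choice of regression through the origin are illustrative assumptions, not details taken from the article.

```python
import numpy as np


def simulate_standardized_genotypes(n=1000, m=1000, h2=0.5, seed=0):
    """Simulate a homogeneous sample (no substructure); all settings are illustrative."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=m)             # allele frequency per SNP
    G = rng.binomial(2, freqs, size=(n, m)).astype(float)
    Z = (G - G.mean(axis=0)) / G.std(axis=0)          # column-standardized genotypes
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)   # additive SNP effects
    noise = rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    y = Z @ beta + noise                              # trait with variance close to 1
    return Z, y


def he_regression(Z, y):
    """Standard Haseman-Elston estimate of SNP heritability.

    Regresses the products y_i * y_j (i < j) of the standardized trait on the
    off-diagonal entries K_ij of the genetic relationship matrix, through the
    origin. The slope is the heritability estimate.
    """
    n, m = Z.shape
    K = Z @ Z.T / m                                   # genetic relationship matrix
    y = (y - y.mean()) / y.std()                      # standardize the trait
    upper = np.triu_indices(n, k=1)                   # all pairs with i < j
    k_pairs = K[upper]
    y_pairs = np.outer(y, y)[upper]
    return float(k_pairs @ y_pairs / (k_pairs @ k_pairs))


Z, y = simulate_standardized_genotypes()
h2_hat = he_regression(Z, y)
print(f"Haseman-Elston estimate of h2: {h2_hat:.3f}")
```

The sketch forms the full n × n relationship matrix, which is practical only for small samples. It is meant to show the second-order moment condition that Haseman–Elston regression relies on, not to scale to biobank data.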

Original language: English (US)
Article number: iyac015
Issue number: 4
State: Published - Apr 2022

Bibliographical note

Publisher Copyright:
© The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. All rights reserved. For permissions, please email: [email protected].


  • Biobank data
  • heritability
  • method of moments estimation
  • population substructure

