Identifying rare variants inconsistent with identity-by-descent in population-scale whole-genome sequencing data

Kelsey E. Johnson, Christopher J. Adams, Benjamin F. Voight

Research output: Contribution to journal, Article, peer-review

3 Scopus citations

Abstract

Analyses of genetic variation typically assume that rare variants within a population are inherited from a single common ancestral event, that is, that they are identical-by-descent (IBD). However, there are genetic and technical processes through which rare variants in population genetic data may deviate from this simple evolutionary model, including recurrent mutations, gene conversions and genotyping error. All of these processes can shorten the shared background haplotype surrounding a rare variant relative to the length expected if that variant had been inherited from a single event descending from a common ancestor. No method exists to computationally infer rare variants inconsistent with this simple model (denoted here as ‘IBD-inconsistent’) using unphased population sequencing data. We hypothesized that the difference in shared haplotype background length can distinguish variants consistent and inconsistent with this simple IBD transmission model in population sequencing data without pedigree information. We implemented a Bayesian hierarchical model and used Gibbs sampling to estimate the posterior probability of IBD state for rare variants, and we used simulated recurrent mutations to demonstrate that our approach accurately distinguishes rare variants consistent and inconsistent with a simple IBD inheritance model. Applying our method to whole-genome sequencing data from 3,621 human individuals in the UK10K consortium, we found that IBD-inconsistent variants correlated with higher local mutation rates and with genomic features such as replication timing. Using a heuristic to categorize IBD-inconsistent variants as gene conversions, we found that potential gene conversions had expected properties such as enriched local GC content. By identifying IBD-inconsistent variants, we can better understand the spectrum of recent mutations in human populations, a source of genetic variation driving evolution and a key factor in understanding recent demographic history.

Original language: English (US)
Pages (from-to): 2429-2442
Number of pages: 14
Journal: Methods in Ecology and Evolution
Volume: 13
Issue number: 11
State: Published - Nov 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society.

Keywords

  • Bayesian methods
  • bioinformatics
  • evolutionary biology
  • molecular evolution
  • population genetics
