AbSNP: RNA-Seq SNP calling in repetitive regions via abundance estimation

Shunfu Mao, Soheil Mohajer, Kannan Ramachandran, David Tse, Sreeram Kannan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Variant calling, and in particular calling SNPs (single nucleotide polymorphisms), is a fundamental task in genomics. While existing packages perform very well when calling SNPs covered by uniquely mapped reads, they struggle at loci where reads are multiply mapped and are unable to make reliable calls there. Variants in multiply mapped loci can arise, for example, in long segmental duplications, and can play an important role in evolution and disease. In this paper, we develop a new SNP caller named abSNP, which offers three innovations. (a) abSNP calls SNPs from RNA-Seq data. Since RNA-Seq data is primarily sampled from gene regions, this method is inexpensive. (b) abSNP is able to make calls in repetitive gene regions by carefully exploiting the quality scores of multiply mapped reads. (c) abSNP exploits a specific feature of RNA-Seq data, namely the varying abundance of different genes, to identify which repetitive copy a particular read is sampled from. We demonstrate that the proposed method offers significant performance gains on repetitive regions in simulated data. In particular, the algorithm achieves near-perfect sensitivity on high-coverage SNPs, even when they are multiply mapped.
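The idea behind innovation (c) can be illustrated with a minimal sketch. This is not the authors' implementation; it only assumes that each candidate gene copy has an estimated relative abundance and that each read has a likelihood under each copy (for instance, derived from base quality scores). The function name `origin_posterior` and its inputs are hypothetical. Under those assumptions, Bayes' rule weights each copy by abundance times likelihood:

```python
def origin_posterior(abundances, likelihoods):
    """Posterior probability that a read was sampled from each candidate copy.

    abundances[i]  -- relative expression level of copy i (assumed to be given)
    likelihoods[i] -- probability of observing the read's bases if it came
                      from copy i (assumed to be given, e.g. from quality scores)
    """
    # Unnormalized posterior weight of each copy: prior (abundance) * likelihood.
    weights = [a * l for a, l in zip(abundances, likelihoods)]
    total = sum(weights)
    if total == 0:
        raise ValueError("read is incompatible with every candidate copy")
    # Normalize so the probabilities over all copies sum to one.
    return [w / total for w in weights]


# A read that matches two copies of a duplicated gene equally well,
# where copy 0 is expressed nine times as highly as copy 1.
posterior = origin_posterior([0.9, 0.1], [0.5, 0.5])
print(posterior)
```

When the sequence alone cannot distinguish the copies, the assignment is driven entirely by the difference in abundance, which is why varying expression levels in RNA-Seq data can help place multiply mapped reads.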

Original language: English (US)
Title of host publication: 17th International Workshop on Algorithms in Bioinformatics, WABI 2017
Editors: Knut Reinert, Russell Schwartz
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959770507
DOIs: https://doi.org/10.4230/LIPIcs.WABI.2017.15
State: Published - Aug 1 2017
Event: 17th International Workshop on Algorithms in Bioinformatics, WABI 2017 - Boston, United States
Duration: Aug 21 2017 to Aug 23 2017

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 88
ISSN (Print): 1868-8969

Other

Other: 17th International Workshop on Algorithms in Bioinformatics, WABI 2017
Country: United States
City: Boston
Period: 8/21/17 to 8/23/17

Keywords

  • Abundance Estimation
  • Multiply Mapped Reads
  • RNA-Seq
  • Repetitive Region
  • SNP Calling

Cite this

Mao, S., Mohajer, S., Ramachandran, K., Tse, D., & Kannan, S. (2017). AbSNP: RNA-Seq SNP calling in repetitive regions via abundance estimation. In K. Reinert & R. Schwartz (Eds.), 17th International Workshop on Algorithms in Bioinformatics, WABI 2017 [15] (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 88). Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.WABI.2017.15