APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data

Naima Ahmed Fahmi, Khandakar Tanvir Ahmed, Jae Woong Chang, Heba Nassereddeen, Deliang Fan, Jeongsik Yong, Wei Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

Background: The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTRs. The 3′-UTR often serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3′-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3′-UTR APA events because of incomplete annotations and low analytical resolution: widely available pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3′-UTR APA using only RNA-seq read coverage, which causes false-positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3′-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations.

Methods: APA-Scan uses either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3′-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3′-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance of the events between two biological conditions; (iii) plot user-specified events with the 3′-UTR annotation and the read coverage on the 3′-UTR regions. APA-Scan is implemented in Python3. Source code and a comprehensive user’s manual are freely available at https://github.com/compbiolabucf/APA-Scan.

Results: APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3′-UTR APA identification compared to the baselines. The performance of APA-Scan was also validated by 3′-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3′-UTR APA events and improve genome annotation.

Conclusion: APA-Scan is a comprehensive computational pipeline to detect transcriptome-wide 3′-UTR APA events. The pipeline integrates RNA-seq and 3′-end-seq data, efficiently identifies significant events, and presents them in high-resolution short-read coverage plots.
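Steps (i) and (ii) of the Methods can be illustrated with a minimal sketch. It is not APA-Scan's code: the function names, the 2x2 table layout, and the choice of a Pearson chi-squared test are all assumptions made for illustration, and the abstract does not say which statistical test the tool uses. The sketch assumes per-base read depth over one 3′-UTR, oriented 5′ to 3′, and a single candidate cleavage site given as an index into that array.

```python
import math


def mean_coverage(coverage, start, end):
    """Mean per-base read depth over the half-open interval [start, end)."""
    if end <= start:
        raise ValueError("interval must be non-empty")
    return sum(coverage[start:end]) / (end - start)


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 d.f., no continuity correction) on [[a, b], [c, d]].

    Returns (statistic, p_value). For 1 d.f. the survival function of the
    chi-squared distribution is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, math.erfc(math.sqrt(stat / 2.0))


def score_site(cov_cond1, cov_cond2, site):
    """Hypothetical scoring of one candidate cleavage site.

    Rows of the table are the two conditions; columns are the mean depth
    upstream of the site (covered by short and long isoforms) and
    downstream of it (covered by the long isoform only).
    """
    row1 = (mean_coverage(cov_cond1, 0, site),
            mean_coverage(cov_cond1, site, len(cov_cond1)))
    row2 = (mean_coverage(cov_cond2, 0, site),
            mean_coverage(cov_cond2, site, len(cov_cond2)))
    stat, p_value = chi2_2x2(row1[0], row1[1], row2[0], row2[1])
    return {"site": site, "statistic": stat, "p_value": p_value}


# Toy data: the downstream segment loses depth in condition 2,
# which is what a shift toward the short 3'-UTR isoform would look like.
cond1 = [100] * 50 + [80] * 50
cond2 = [100] * 50 + [20] * 50
result = score_site(cond1, cond2, site=50)
```

In a real analysis the coverage arrays would come from aligned reads and the candidate sites from predicted or experimentally validated polyadenylation sites, as the Methods describe; testing many sites would also call for multiple-testing correction, which this sketch omits.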

Original language: English (US)
Article number: 396
Journal: BMC Bioinformatics
Volume: 23
Issue number: Suppl 3
State: Published - Mar 2022

Bibliographical note

Funding Information:
The study was supported by the National Science Foundation grant FET2003749 and National Institutes of Health 1R01GM113952-01A1 and DK097771. Publication costs are funded by the National Science Foundation grant FET2003749. The funding bodies had no role in study design, data collection, data analysis and interpretation of data and in writing the manuscript.

Publisher Copyright:
© 2022, The Author(s).

Keywords

  • 3′-End-seq
  • Alternative polyadenylation
  • RNA-seq
  • Transcriptome
  • 3' Untranslated Regions/genetics
  • RNA Precursors/metabolism
  • Protein Isoforms/genetics
  • RNA, Messenger/genetics
  • Animals
  • Fibroblasts/metabolism
  • RNA-Seq
  • Polyadenylation
  • Mice
  • MicroRNAs/metabolism

PubMed: MeSH publication types

  • Journal Article
