Long non-coding RNA transcriptome of uncharacterized samples can be accurately imputed using protein-coding genes

Aritro Nath, Paul Geeleher, R. Stephanie Huang

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Long non-coding RNAs (lncRNAs) play an important role in gene regulation and are increasingly being recognized as crucial mediators of disease pathogenesis. However, the vast majority of published transcriptome datasets lack high-quality lncRNA profiles compared to protein-coding genes (PCGs). Here we propose a framework that harnesses the correlative expression patterns between lncRNAs and PCGs to impute unknown lncRNA profiles. The lncRNA expression imputation (LEXI) framework enables characterization of the lncRNA transcriptome of samples lacking any lncRNA data using only their PCG profiles. We compare various machine learning and missing value imputation algorithms to implement LEXI and demonstrate the feasibility of this approach to impute the lncRNA transcriptome of normal and cancer tissues. Additionally, we determine the factors that influence imputation accuracy and provide guidelines for implementing this approach.
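
The general idea described in the abstract, learning the relationship between PCG and lncRNA expression on samples that have both and then predicting lncRNA levels for samples that have only PCG data, can be illustrated with a minimal sketch. This is not the LEXI implementation: the choice of ridge regression, the function names `fit_ridge` and `impute_lncrna`, the penalty `alpha` and the synthetic data are all assumptions made purely for illustration.

```python
import numpy as np


def fit_ridge(pcg_train, lnc_train, alpha=1.0):
    """Fit ridge regressions (one per lncRNA, solved jointly) on PCG expression.

    pcg_train: samples x PCGs matrix; lnc_train: samples x lncRNAs matrix.
    Ridge regression is an assumed stand-in, not the method from the paper.
    """
    # Centre features and targets so no explicit intercept column is needed.
    x_mean = pcg_train.mean(axis=0)
    y_mean = lnc_train.mean(axis=0)
    xc = pcg_train - x_mean
    yc = lnc_train - y_mean
    # Closed-form ridge solution: (X^T X + alpha * I)^-1 X^T Y
    gram = xc.T @ xc + alpha * np.eye(xc.shape[1])
    coef = np.linalg.solve(gram, xc.T @ yc)
    return coef, x_mean, y_mean


def impute_lncrna(pcg_new, coef, x_mean, y_mean):
    """Predict lncRNA expression for samples that only have PCG profiles."""
    return (pcg_new - x_mean) @ coef + y_mean


# Synthetic demonstration data (hypothetical, not from GTEx or TCGA).
rng = np.random.default_rng(0)
n_train, n_new, n_pcg, n_lnc = 200, 20, 30, 5
true_w = rng.normal(size=(n_pcg, n_lnc))
pcg_train = rng.normal(size=(n_train, n_pcg))
lnc_train = pcg_train @ true_w + 0.1 * rng.normal(size=(n_train, n_lnc))
pcg_new = rng.normal(size=(n_new, n_pcg))
lnc_true = pcg_new @ true_w + 0.1 * rng.normal(size=(n_new, n_lnc))

coef, x_mean, y_mean = fit_ridge(pcg_train, lnc_train)
lnc_imputed = impute_lncrna(pcg_new, coef, x_mean, y_mean)

# Per-lncRNA correlation between imputed and held-out "true" expression.
per_gene_r = [
    np.corrcoef(lnc_imputed[:, j], lnc_true[:, j])[0, 1] for j in range(n_lnc)
]
print(per_gene_r)
```

Scoring each lncRNA by the correlation between imputed and held-out values is one simple way to assess accuracy; the evaluation used in the article itself may differ.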

Original language: English (US)
Pages (from-to): 637-648
Number of pages: 12
Journal: Briefings in Bioinformatics
Volume: 21
Issue number: 2
State: Published - Mar 23 2020

Bibliographical note

Funding Information:
NIH/NCI (grant 1R01CA204856-01A1 and grant R21 CA139278 to R.S.H.), Avon Foundation for Women (to R.S.H.), NIH/NIGMS (grant K08GM089941 and grant UO1GM61393 to R.S.H.), Circle of Service Foundation Early Career Investigator award (to R.S.H.) and Chicago Biomedical Consortium (grant PDR-020 to P.G.). The funding agencies did not participate in the design of the study, nor do they have any influence on the collection, analysis, and interpretation of data or in writing the manuscript.

Publisher Copyright:
© 2019 The Author(s) 2019. Published by Oxford University Press. All rights reserved.

Keywords

  • GTEX
  • TCGA
  • expression
  • imputation
  • lncRNA
  • machine learning
