swCAM: Estimation of subtype-specific expressions in individual samples with unsupervised sample-wise deconvolution

Lulu Chen, Chiung Ting Wu, Chia Hsiang Lin, Rujia Dai, Chunyu Liu, Robert Clarke, Guoqiang Yu, Jennifer E. Van Eyk, David M. Herrington, Yue Wang

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

MOTIVATION: Complex biological tissues are often a heterogeneous mixture of several molecularly distinct cell subtypes. Both subtype compositions and subtype-specific expressions can vary across biological conditions. Computational deconvolution aims to dissect patterns of bulk tissue data into subtype compositions and subtype-specific expressions. Existing deconvolution methods can only estimate averaged subtype-specific expressions in a population, while many downstream analyses such as inferring co-expression networks in particular subtypes require subtype expression estimates in individual samples. However, individual-level deconvolution is a mathematically underdetermined problem because there are more variables than observations.
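
The counting argument behind "more variables than observations" can be illustrated with a toy linear mixing model. The sketch below is only an illustration under stated assumptions: the sizes `M`, `K`, `N`, the Dirichlet and gamma draws, and all variable names are hypothetical and are not taken from the paper or the swCAM scripts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical): samples, subtypes, genes.
M, K, N = 5, 3, 4

# Subtype proportions: each sample's row is non-negative and sums to 1.
A = rng.dirichlet(np.ones(K), size=M)                  # shape (M, K)

# Sample-specific subtype expression: one K x N profile matrix per sample.
S = rng.gamma(shape=2.0, scale=1.0, size=(M, K, N))    # shape (M, K, N)

# Bulk expression of sample i is the proportion-weighted sum of its own
# subtype profiles: X[i, n] = sum_k A[i, k] * S[i, k, n].
X = np.einsum("ik,ikn->in", A, S)                      # shape (M, N)

# Observed bulk entries versus unknowns in the individual-level problem.
n_observations = X.size          # M * N
n_unknowns = S.size + A.size     # M * K * N + M * K

print(f"observations: {n_observations}, unknowns: {n_unknowns}")
```

With any `K > 1` the unknowns outnumber the observed bulk entries, so extra structure such as regularization is needed to single out a solution.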

RESULTS: We report a sample-wise Convex Analysis of Mixtures (swCAM) method that can estimate subtype proportions and subtype-specific expressions in individual samples from bulk tissue transcriptomes. We extend our previous CAM framework to include a new term accounting for between-sample variations and formulate swCAM as a nuclear-norm and ℓ2,1-norm regularized matrix factorization problem. We determine hyperparameter values using cross-validation with random entry exclusion and obtain a swCAM solution using an efficient alternating direction method of multipliers. Experimental results on realistic simulation data show that swCAM can accurately estimate subtype-specific expressions in individual samples and successfully extract co-expression networks in particular subtypes that are otherwise unobtainable using bulk data. In two real-world applications, swCAM analysis of bulk RNASeq data from brain tissue of cases and controls with bipolar disorder or Alzheimer's disease identified significant changes in cell proportion, expression pattern and co-expression module in patient neurons. Comparative evaluation of swCAM versus peer methods is also provided.
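
As a rough illustration of the ingredients named above, the following sketch shows the standard proximal operators for a nuclear norm and an ℓ2,1-norm, which are the usual building blocks of alternating direction method of multipliers updates for such penalties, together with a random entry-exclusion mask for cross-validation. It is not the swCAM implementation: the function names, the choice of rows as the ℓ2,1 groups, and the `holdout_fraction` parameter are all assumptions made for this example.

```python
import numpy as np


def prox_nuclear(Z, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt


def prox_l21(Z, tau):
    """Proximal operator of tau * l2,1-norm, treating each row as one group
    (assumption: the grouping used by swCAM may differ)."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    # Shrink every row toward zero; rows with norm <= tau become exactly zero.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return Z * scale


def random_entry_mask(shape, holdout_fraction, rng):
    """Boolean mask: True marks entries kept for fitting, False marks entries
    held out at random for cross-validation scoring."""
    return rng.random(shape) >= holdout_fraction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(6, 4))
    low_rank = prox_nuclear(Z, tau=1.0)
    row_sparse = prox_l21(Z, tau=1.0)
    mask = random_entry_mask(Z.shape, holdout_fraction=0.2, rng=rng)
    print(np.linalg.matrix_rank(low_rank), int((row_sparse == 0).all(axis=1).sum()), mask.mean())
```

In a cross-validation loop of this kind, a model would be fitted only on entries where the mask is `True`, and each candidate hyperparameter value would be scored by reconstruction error on the held-out entries.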

AVAILABILITY: The R Scripts of swCAM are freely available at https://github.com/Lululuella/swCAM. A user's guide and a vignette are provided.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Original language: English (US)
Pages (from-to): 1403-1410
Number of pages: 8
Journal: Bioinformatics
Volume: 38
Issue number: 5
State: Published - Mar 1 2022

Bibliographical note

Publisher Copyright:
© The Author(s) 2021. Published by Oxford University Press. All rights reserved.

Keywords

  • Computer Simulation
  • Gene Expression Profiling/methods
  • Humans
  • Transcriptome

PubMed: MeSH publication types

  • Journal Article
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.
