Zero-inflated Poisson factor model with application to microbiome read counts

Tianchen Xu, Ryan T. Demmer, Gen Li

Research output: Contribution to journal › Article › peer-review

Abstract

Dimension reduction of high-dimensional microbiome data facilitates subsequent analyses such as regression and clustering. Most existing dimension reduction methods cannot fully accommodate the special features of such data, namely count-valued reads and an excess of zeros. We propose a zero-inflated Poisson factor analysis model in this paper. The model assumes that microbiome read counts follow zero-inflated Poisson distributions with library size as an offset and Poisson rates negatively related to the inflated zero occurrences. The latent parameters of the model form a low-rank matrix consisting of interpretable loadings and low-dimensional scores that can be used for further analyses. We develop an efficient and robust expectation-maximization algorithm for parameter estimation. We demonstrate the efficacy of the proposed method using comprehensive simulation studies. The application to the Oral Infections, Glucose Intolerance, and Insulin Resistance Study provides valuable insights into the relation between the subgingival microbiome and periodontal disease.
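
The generative side of the model described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that the Poisson rates are negatively related to the zero-inflation probability and that library size enters as an offset. The specific logistic link `logit(p) = -tau * log(lambda)`, the parameter name `tau`, and all function names below are assumptions made for illustration, not the authors' published specification. The sketch covers simulation and the probability mass function only; it does not implement the expectation-maximization algorithm.

```python
import math
import random


def zero_inflation_prob(log_lambda, tau):
    """Assumed link: logit(p) = -tau * log_lambda, so p falls as the rate rises."""
    return 1.0 / (1.0 + math.exp(tau * log_lambda))


def zip_pmf(k, mean, p_zero):
    """P(count = k) for a zero-inflated Poisson with Poisson mean `mean` > 0."""
    poisson = math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
    return p_zero * (k == 0) + (1.0 - p_zero) * poisson


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_counts(scores, loadings, library_sizes, tau, rng):
    """Draw a samples-by-taxa count matrix with low-rank log-rates.

    log_lambda[i][j] is the inner product of score row i and loading row j;
    the Poisson mean is library_sizes[i] * exp(log_lambda[i][j]) (the offset).
    """
    counts = []
    for u_i, n_i in zip(scores, library_sizes):
        row = []
        for v_j in loadings:
            log_lambda = sum(a * b for a, b in zip(u_i, v_j))
            if rng.random() < zero_inflation_prob(log_lambda, tau):
                row.append(0)  # structural (inflated) zero
            else:
                row.append(sample_poisson(n_i * math.exp(log_lambda), rng))
        counts.append(row)
    return counts


# Hypothetical toy inputs: 3 samples, 4 taxa, rank 2.
toy_scores = [[-1.0, 0.5], [0.2, -0.3], [0.0, 0.0]]
toy_loadings = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]]
toy_counts = simulate_counts(toy_scores, toy_loadings, [20, 30, 10],
                             tau=1.0, rng=random.Random(0))
```

Under this assumed link, a taxon with a low rate in a given sample is both less likely to produce reads and more likely to be a structural zero. That is one way to encode the negative relation the abstract describes.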

Original language: English (US)
Pages (from-to): 91-101
Number of pages: 11
Journal: Biometrics
Volume: 77
Issue number: 1
State: Published - Mar 2021

Bibliographical note

Funding Information:
Research reported in this publication was supported by the National Institute of Dental & Craniofacial Research of the National Institutes of Health under award number R03DE027773.

Publisher Copyright:
© 2020 The International Biometric Society

Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.

Keywords

  • 16S sequencing
  • factor analysis
  • low rank
  • microbiome data
  • zero inflation
