BAMITA: Bayesian multiple imputation for tensor arrays

Ziren Jiang, Gen Li, Eric F. Lock

Research output: Contribution to journal › Article › peer-review

Abstract

Data increasingly take the form of a multi-way array, or tensor, in several biomedical domains. Such tensors are often incompletely observed. For example, we are motivated by longitudinal microbiome studies in which several timepoints are missing for several subjects. There is a growing literature on missing data imputation for tensors. However, existing methods give a point estimate for missing values without capturing uncertainty. We propose a multiple imputation approach for tensors in a flexible Bayesian framework that yields realistic simulated values for missing entries and can propagate uncertainty through subsequent analyses. Our model uses efficient and widely applicable conjugate priors for a CANDECOMP/PARAFAC (CP) factorization, with a separable residual covariance structure. This approach is shown to perform well with respect to both imputation accuracy and uncertainty calibration, for scenarios in which either single entries or entire fibers of the tensor are missing. For two microbiome applications, it is shown to accurately capture uncertainty in the full microbiome profile at missing timepoints and is used to infer trends in species diversity for the population.
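As a rough illustration of the ideas named in the abstract, the following minimal Python sketch shows a rank-R CP reconstruction of a three-way tensor, the generation of several completed tensors, and the pooling of a downstream estimate across them with Rubin's rules. It is not the BAMITA implementation: the function names (`cp_reconstruct`, `draw_imputations`, `pool_rubin`) are hypothetical, the factor matrices are treated as fixed rather than sampled from a posterior, and the residual noise is assumed i.i.d. Gaussian rather than following the separable covariance structure described above.

```python
import numpy as np


def cp_reconstruct(A, B, C):
    """Rank-R CP reconstruction: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def draw_imputations(X_obs, mask, mean, sigma, n_draws, rng):
    """Return n_draws completed tensors.

    Observed entries (mask == True) are kept as-is. Missing entries are
    filled with mean + N(0, sigma^2) noise. Assumption: a fully Bayesian
    approach would redraw `mean` from posterior samples on every draw;
    here it is held fixed for simplicity.
    """
    completed = []
    for _ in range(n_draws):
        noise = rng.normal(0.0, sigma, size=X_obs.shape)
        completed.append(np.where(mask, X_obs, mean + noise))
    return completed


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean()           # pooled point estimate
    within = variances.mean()          # average within-imputation variance
    between = estimates.var(ddof=1)    # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total
```

In use, one would compute a statistic of interest (for example, a diversity summary) on each completed tensor and pass the resulting estimates and their variances to `pool_rubin`, so that the spread between imputations contributes to the reported uncertainty.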

Original language: English (US)
Article number: kxae047
Journal: Biostatistics
Volume: 26
Issue number: 1
State: Published - 2025

Bibliographical note

Publisher Copyright:
© 2024 The Author. Published by Oxford University Press. All rights reserved.

Keywords

  • Bayesian inference
  • microbiome data
  • missing data
  • multiple imputation
  • multiway data

PubMed: MeSH publication types

  • Journal Article
