A hierarchical spike-and-slab model for pan-cancer survival using pan-omic data

Sarah Samorodnitsky, Katherine A. Hoadley, Eric F. Lock

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Pan-omics, pan-cancer analysis has advanced our understanding of the molecular heterogeneity of cancer. However, such analyses have been limited in their ability to use information from multiple sources of data (e.g., omics platforms) and multiple sample sets (e.g., cancer types) to predict clinical outcomes. We address the issue of prediction across multiple high-dimensional sources of data and sample sets by using molecular patterns identified by BIDIFAC+, a method for integrative dimension reduction of bidimensionally-linked matrices, in a Bayesian hierarchical model. Our model performs variable selection through spike-and-slab priors that borrow information across clustered data. We use this model to predict overall patient survival from the Cancer Genome Atlas with data from 29 cancer types and 4 omics sources and use simulations to characterize the performance of the hierarchical spike-and-slab prior.

Results: We found that molecular patterns shared across all or most cancers were largely not predictive of survival. However, our model selected patterns unique to subsets of cancers that differentiate clinical tumor subtypes with markedly different survival outcomes. Some of these subtypes were previously established, such as subtypes of uterine corpus endometrial carcinoma, while others may be novel, such as subtypes within a set of kidney carcinomas. Through simulations, we found that the hierarchical spike-and-slab prior performs best in terms of variable selection accuracy and predictive power when borrowing information is advantageous, but also offers competitive performance when it is not.

Conclusions: We address the issue of prediction across multiple sources of data by using results from BIDIFAC+ in a Bayesian hierarchical model for overall patient survival. By incorporating spike-and-slab priors that borrow information across cancers, we identified molecular patterns that distinguish clinical tumor subtypes within a single cancer and within a group of cancers. We also corroborate the flexibility and performance of using spike-and-slab priors as a Bayesian variable selection approach.
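
To make the idea of a spike-and-slab prior that borrows information across groups more concrete, the following is a minimal sketch of one generic formulation. It is not the model specified in the paper: the Beta hyperprior, the narrow-normal spike, the function name `sample_hierarchical_spike_slab`, and all parameter values are assumptions chosen for illustration. Each predictor gets a single inclusion probability shared by all groups (for example, cancer types), so evidence for a predictor in one group raises its chance of selection in the others.

```python
import numpy as np

def sample_hierarchical_spike_slab(n_groups, n_features, a=1.0, b=1.0,
                                   spike_sd=0.01, slab_sd=1.0, seed=0):
    """Draw coefficients from a generic hierarchical spike-and-slab prior.

    Illustrative assumptions (not taken from the paper):
      pi[j]       ~ Beta(a, b)          shared inclusion probability of feature j
      gamma[g, j] ~ Bernoulli(pi[j])    inclusion indicator for group g
      beta[g, j]  ~ Normal(0, slab_sd)  if gamma[g, j] is True (slab)
                    Normal(0, spike_sd) otherwise (spike, near zero)
    """
    rng = np.random.default_rng(seed)
    # One inclusion probability per feature, shared by every group.
    pi = rng.beta(a, b, size=n_features)
    # Group-specific indicators; pi broadcasts across the group axis.
    gamma = rng.random((n_groups, n_features)) < pi
    # Pick the slab or spike standard deviation element-wise.
    sd = np.where(gamma, slab_sd, spike_sd)
    beta = rng.normal(0.0, sd)
    return pi, gamma, beta

if __name__ == "__main__":
    # 29 groups mirrors the number of cancer types in the abstract;
    # the number of features (10) is arbitrary.
    pi, gamma, beta = sample_hierarchical_spike_slab(n_groups=29, n_features=10)
    print("share of groups selecting each feature:", gamma.mean(axis=0).round(2))
```

Because the indicators for a given predictor share one probability, the per-feature selection rates printed above cluster around that predictor's `pi` value, which is the borrowing behavior in its simplest form.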

Original language: English (US)
Article number: 235
Journal: BMC Bioinformatics
Volume: 23
Issue number: 1
State: Published - Dec 2022

Bibliographical note

Funding Information:
This work was supported by the National Institutes of Health (NIH) National Cancer Institute (NCI) grant R21CA231214, and National Institute of General Medical Sciences (NIGMS) grant R01GM130622.

Publisher Copyright:
© 2022, The Author(s).

Keywords

  • Bayesian hierarchical modeling
  • Bidimensionally-linked matrices
  • Pan-omics
  • Spike-and-slab priors
  • The Cancer Genome Atlas (TCGA)
  • Pan-cancer
  • Survival analysis
