Abstract
Volumetric imaging features are used in cancer research to determine the size and composition of a tumor and have been shown to be prognostic of overall survival. In this paper we focus on the analysis of tumor component proportions of brain cancer patients collected through The Cancer Genome Atlas (TCGA) project. Our main goal is to identify pathways and corresponding genes that can explain the heterogeneity of the composition of a brain tumor. In particular, we focus on glioblastoma multiforme (GBM), as it is the most common malignant brain neoplasm, accounting for 23% of all primary brain tumors, and it still has a very poor prognosis. We propose a Bayesian hierarchical model for variable selection with a group structure in the context of correlated multivariate compositional response variables. More specifically, we model the proportions of the tumor components within the tumor using a Dirichlet model, which allows for straightforward incorporation of available high-dimensional covariate information within a log-linear regression framework. We impose prior distributions that account for the overlapping structure between groups of covariates. Simulations and an application to GBM demonstrate the importance of our approach. We have identified associations between tumor component volume-based features and several important pathways and genes. Some of these genes have previously been shown to be prognostic indicators of overall survival time in GBM.
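To make the Dirichlet model with a log-linear regression framework more concrete, the following is a minimal sketch of the likelihood for one patient's tumor composition. It assumes the common parameterization in which each concentration parameter is the exponential of a linear predictor; the abstract does not state the exact parameterization, so this form, the function names `dirichlet_loglik` and `expected_proportions`, and the toy inputs are illustrative assumptions. The group-selection priors with overlapping groups are not shown.

```python
import math


def concentrations(x, beta):
    """Log-linear link (assumed): alpha_j = exp(x . beta_j) for each component j."""
    return [math.exp(sum(xp * bp for xp, bp in zip(x, b))) for b in beta]


def dirichlet_loglik(y, x, beta):
    """Dirichlet log-likelihood of one composition.

    y    : J proportions, each > 0, summing to 1
    x    : P covariate values (put 1.0 first for an intercept)
    beta : J lists of P coefficients, one list per tumor component
    """
    alpha = concentrations(x, beta)
    # log normalizing constant: log Gamma(sum alpha) - sum log Gamma(alpha_j)
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # log kernel: sum (alpha_j - 1) * log y_j
    log_kernel = sum((a - 1.0) * math.log(yj) for a, yj in zip(alpha, y))
    return log_norm + log_kernel


def expected_proportions(x, beta):
    """Mean composition implied by the covariates: alpha_j / sum(alpha)."""
    alpha = concentrations(x, beta)
    total = sum(alpha)
    return [a / total for a in alpha]


# Toy example: three components, an intercept plus one hypothetical covariate.
y = [0.2, 0.3, 0.5]
x = [1.0, 0.4]
beta = [[0.5, 1.0], [0.5, 0.0], [1.0, -1.0]]
print(dirichlet_loglik(y, x, beta))
print(expected_proportions(x, beta))
```

Under this assumed link, setting all coefficients for a covariate to zero removes its effect on every component, which is why variable selection in such a model is naturally posed over groups of coefficients.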
Original language | English (US) |
---|---|
Pages (from-to) | 3013-3034 |
Number of pages | 22 |
Journal | Annals of Applied Statistics |
Volume | 17 |
Issue number | 4 |
State | Published - Dec 2023 |
Bibliographical note
Publisher Copyright: © Institute of Mathematical Statistics, 2023.
Keywords
- Bayesian hierarchical model
- Dirichlet regression
- Glioblastoma
- group selection
- overlap