Abstract
We propose a constrained maximum partial likelihood estimator for dimension reduction in integrative (e.g., pan-cancer) survival analysis with high-dimensional predictors. We assume that for each population in the study, the hazard function follows a distinct Cox proportional hazards model. To borrow information across populations, we assume that each hazard function depends only on a small number of linear combinations of the predictors (i.e., “factors”). We estimate these linear combinations using an algorithm based on “distance-to-set” penalties. This allows us to impose both low-rankness and sparsity on the regression coefficient matrix estimator. We derive asymptotic results showing that our estimator is more efficient than fitting a separate proportional hazards model for each population. Numerical experiments suggest that our method outperforms competitors under various data generating models. We use our method to perform a pan-cancer survival analysis relating protein expression to survival across 18 distinct cancer types. Our approach identifies six linear combinations, depending on only 20 proteins, which explain survival across the cancer types. Finally, to validate our fitted model, we show that our estimated factors can lead to better prediction than competitors on four external datasets.
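To make the idea of a “distance-to-set” penalty concrete, here is a minimal sketch in Python with NumPy. It is not the authors' algorithm. It assumes a coefficient matrix `B` with one row per predictor and one column per population, and it takes the penalty to be half the squared Frobenius distance from `B` to a constraint set. The function names (`project_rank`, `project_row_sparse`, `distance_penalty`) and the choice of row-wise sparsity are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def project_rank(B, r):
    """Closest matrix of rank at most r in Frobenius norm (truncated SVD)."""
    U, sing, Vt = np.linalg.svd(B, full_matrices=False)
    sing = sing.copy()
    sing[r:] = 0.0  # drop all but the r largest singular values
    return (U * sing) @ Vt


def project_row_sparse(B, k):
    """Keep the k rows with the largest Euclidean norm; zero out the rest.

    Assumption for illustration: a zero row means the predictor is unused
    in every population.
    """
    norms = np.linalg.norm(B, axis=1)
    keep = np.argsort(norms)[B.shape[0] - k:]  # indices of the k largest rows
    P = np.zeros_like(B)
    P[keep] = B[keep]
    return P


def distance_penalty(B, project):
    """Half the squared Frobenius distance from B to the set behind `project`."""
    return 0.5 * np.linalg.norm(B - project(B)) ** 2
```

Under these assumptions, a call such as `distance_penalty(B, lambda M: project_rank(M, 6))` is zero exactly when `B` has rank at most six, and grows as `B` moves away from that set. In a penalty method, a term like this would be added to the negative log partial likelihood with a weight that is increased over iterations.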
Original language | English (US) |
---|---|
Pages (from-to) | 1610-1623 |
Number of pages | 14 |
Journal | Biometrics |
Volume | 79 |
Issue number | 3 |
State | Published - Sep 2023 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2022 The International Biometric Society.
Keywords
- Cox proportional hazards model
- dimension reduction
- integrative survival analysis
- majorize-minimize
- penalty method
- reduced-rank regression
- variable selection
PubMed: MeSH publication types
- Journal Article
- Research Support, U.S. Gov't, Non-P.H.S.