Abstract
Penalized regression approaches are attractive for high-dimensional data such as those arising in high-throughput genomic studies. New methods have been introduced that use the network structure of predictors, for example, gene networks, to improve parameter estimation and variable selection. All the existing network-based penalized methods rest on the assumption that the parameters, for example, regression coefficients, of neighboring nodes in a network are close in magnitude, which, however, may not hold. Here we propose a novel penalized regression method based on a weaker prior assumption: that the parameters of neighboring nodes in a network are likely to be zero (or non-zero) at the same time, regardless of their specific magnitudes. We propose a novel non-convex penalty function to incorporate this prior, along with an algorithm based on difference convex programming. We use simulated data and two breast cancer gene expression datasets to demonstrate the advantages of the proposed methods over some existing methods. Our proposed methods can also be applied to more general group variable selection problems.
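The abstract does not state the penalty itself, so the following is only an illustrative sketch of the idea, not the authors' formulation. It assumes the commonly used form of the truncated Lasso penalty, min(|x|/tau, 1), as a surrogate for the indicator "x is non-zero", and penalizes each network edge by the absolute difference of the two surrogate indicators. The function names, the tuning parameter `tau`, the toy coefficients, and the simple magnitude-gap penalty used for contrast are all assumptions made for this example.

```python
def tlp(x, tau):
    """Truncated Lasso penalty (assumed form): min(|x| / tau, 1).

    For small tau this approximates the indicator that x is non-zero.
    """
    return min(abs(x) / tau, 1.0)


def indicator_network_penalty(beta, edges, tau):
    """Sum over edges (j, k) of |tlp(beta_j) - tlp(beta_k)|.

    Close to zero when both endpoints are zero or both are non-zero,
    whatever their magnitudes (hypothetical sketch, not the paper's exact penalty).
    """
    return sum(abs(tlp(beta[j], tau) - tlp(beta[k], tau)) for j, k in edges)


def magnitude_network_penalty(beta, edges):
    """Illustrative stand-in for magnitude-based penalties:
    sum over edges of | |beta_j| - |beta_k| |."""
    return sum(abs(abs(beta[j]) - abs(beta[k])) for j, k in edges)


# Toy chain network 0 - 1 - 2: nodes 0 and 1 are both non-zero but differ
# greatly in magnitude; node 2 is exactly zero.
beta = [2.0, 0.3, 0.0]
edges = [(0, 1), (1, 2)]
tau = 0.1

indicator_value = indicator_network_penalty(beta, edges, tau)
magnitude_value = magnitude_network_penalty(beta, edges)
```

In this toy setting the edge (0, 1) contributes nothing to the indicator-based penalty, because both coefficients are non-zero, while the magnitude-based stand-in still charges that edge for the gap between 2.0 and 0.3. Only the edge (1, 2), where one endpoint is zero and the other is not, is charged by the indicator-based penalty.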
Original language | English (US) |
---|---|
Pages (from-to) | 582-593 |
Number of pages | 12 |
Journal | Biometrics |
Volume | 69 |
Issue number | 3 |
State | Published - Sep 2013 |
Keywords
- Gene expression
- Networks analysis
- Nonconvex minimization
- Penalty
- Truncated Lasso penalty