A three-stage framework for gene expression data analysis by L1-norm support vector regression.

Hyunsoo Kim, Jeff X. Zhou, Herbert C. Morse, Haesun Park

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

The identification of discriminative genes for categorical phenotypes in microarray gene expression data analysis has been extensively studied, especially for disease diagnosis. Recent biological experiments have also dealt with continuous phenotypes. For example, the extent of programmed cell death (apoptosis) can be measured by the level of caspase 3 enzyme. Thus, an effective gene selection method for continuous phenotypes is desirable. In this paper, we describe a three-stage framework for gene expression data analysis based on L1-norm support vector regression (L1-SVR). The first stage ranks genes by recursive multiple feature elimination based on L1-SVR. In the second stage, the minimal gene set is determined as the one for which a kernel regression yields the lowest ten-fold cross-validation error. In the last stage, the final non-linear regression model is built with the minimal gene set and the optimal parameters found by leave-one-out cross-validation. The experimental results show a significant improvement over the current state-of-the-art approach, i.e., a two-stage process consisting of gene selection based on L1-SVR followed by the third stage of the proposed method.
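The three stages described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the L1-SVR weights are replaced by absolute Pearson correlations, the kernel regression is a Gaussian Nadaraya-Watson estimator, and all function names, the `drop_fraction` parameter, the bandwidth candidates, and the toy data are hypothetical.

```python
import math
import random


def abs_correlation_weights(X, y, genes):
    """Stand-in scorer (assumption): |Pearson correlation| of each gene with y.
    The paper derives gene weights from an L1-SVR fit instead."""
    n = len(y)
    mean_y = sum(y) / n
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y)) or 1.0
    weights = {}
    for g in genes:
        col = [row[g] for row in X]
        mean_x = sum(col) / n
        sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in col)) or 1.0
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, y))
        weights[g] = abs(cov / (sd_x * sd_y))
    return weights


def rank_genes(X, y, weight_fn=abs_correlation_weights, drop_fraction=0.5):
    """Stage 1 sketch: recursive multiple feature elimination.
    Each round re-scores the remaining genes and drops several of the weakest.
    Returns gene indices ordered from most to least important."""
    active = list(range(len(X[0])))
    eliminated = []  # filled from least important onward
    while active:
        w = weight_fn(X, y, active)
        active.sort(key=lambda g: w[g])  # weakest first
        n_drop = max(1, int(len(active) * drop_fraction))
        eliminated.extend(active[:n_drop])
        active = active[n_drop:]
    return eliminated[::-1]


def kernel_predict(train_X, train_y, x, genes, bandwidth=1.0):
    """Gaussian Nadaraya-Watson estimate restricted to the chosen genes."""
    num = den = 0.0
    for row, target in zip(train_X, train_y):
        d2 = sum((row[g] - x[g]) ** 2 for g in genes)
        k = math.exp(-d2 / (2 * bandwidth ** 2))
        num += k * target
        den += k
    return num / den if den > 0 else sum(train_y) / len(train_y)


def cv_error(X, y, genes, folds=10, bandwidth=1.0, seed=0):
    """Mean squared error under k-fold cross-validation."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = min(folds, len(idx))
    sq_err = 0.0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        tr_X = [X[i] for i in idx if i not in held_out]
        tr_y = [y[i] for i in idx if i not in held_out]
        for i in test:
            pred = kernel_predict(tr_X, tr_y, X[i], genes, bandwidth)
            sq_err += (pred - y[i]) ** 2
    return sq_err / len(y)


def select_minimal_genes(X, y, ranking, folds=10):
    """Stage 2 sketch: top-k prefix of the ranking with the lowest CV error
    (ties broken in favour of fewer genes)."""
    ks = range(1, len(ranking) + 1)
    best_k = min(ks, key=lambda k: (cv_error(X, y, ranking[:k], folds), k))
    return ranking[:best_k]


def tune_bandwidth(X, y, genes, candidates=(0.25, 0.5, 1.0, 2.0)):
    """Stage 3 sketch: pick the kernel bandwidth by leave-one-out CV."""
    return min(candidates,
               key=lambda b: cv_error(X, y, genes, folds=len(y), bandwidth=b))


# Toy data (hypothetical): 30 samples, 5 genes, phenotype driven by gene 0.
rng = random.Random(1)
X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(30)]
y = [math.sin(2 * row[0]) for row in X]

ranking = rank_genes(X, y)
selected = select_minimal_genes(X, y, ranking)
bandwidth = tune_bandwidth(X, y, selected)
```

Passing a different `weight_fn` (for example, one that returns the absolute coefficients of a fitted L1-SVR) would bring the first stage closer to the method the abstract describes. The two-stage baseline mentioned there corresponds to skipping `select_minimal_genes` and going directly from gene selection to the final model.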

Original language: English (US)
Pages (from-to): 51-62
Number of pages: 12
Journal: International Journal of Bioinformatics Research and Applications
Volume: 1
Issue number: 1
State: Published - 2005
Externally published: Yes
