A novel combinatorial score for feature selection with P-Tree in DNA microarray data analysis

Yan Wang, Tingda Lu, William Perrizo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

DNA microarray experiments gather information from tissue and cell samples by generating thousands of gene expression measurements. Many researchers study gene expression differences, which are useful in disease diagnosis, outcome prediction, cancer type classification, and related tasks. In mining high-dimensional microarray data, feature selection is an important pre-processing stage. In the literature, nearly all existing supervised feature selection methods use class labels as supervision information. In this paper, we propose a novel score that uses the label correlation in combination with the correlation between features. We design a Combinatorial Score feature selection algorithm in the P-Tree structure and combine it with the K-Nearest-Neighbor algorithm for breast cancer clinical metastasis time prediction. Our experiments suggest that our Combinatorial Score feature selection algorithm can find a subset of genes with high computational efficiency and significant performance for breast cancer clinical metastasis prediction.
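
The abstract does not give the Combinatorial Score formula or the P-Tree implementation, so the following is only an illustrative sketch of the general idea it describes: rank genes by their correlation with the label while penalizing correlation with genes already chosen, then classify with K-Nearest-Neighbor. The scoring rule (absolute label correlation minus mean absolute correlation with selected features), the binary labels, and the names `pearson`, `select_features`, and `knn_predict` are all assumptions made for illustration. They are not the authors' method, and plain Python lists stand in for the P-Tree structure.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(columns, labels, k):
    """Greedily pick k feature indices.

    Hypothetical score (an assumption, not the paper's formula):
    |corr(feature, label)| - mean |corr(feature, already selected feature)|.
    """
    relevance = [abs(pearson(col, labels)) for col in columns]
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows nearest to query (Euclidean distance)."""
    order = sorted(
        range(len(train_rows)), key=lambda i: math.dist(train_rows[i], query)
    )
    votes = [train_labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


# Toy data: each inner list is one gene's expression across six samples.
genes = [
    [1, 2, 3, 4, 5, 6],     # informative
    [2, 4, 6, 8, 10, 12],   # exact multiple of the first gene, so redundant
    [1, 0, 1, 2, 1, 2],     # informative and less correlated with the first
]
labels = [0, 0, 0, 1, 1, 1]
chosen = select_features(genes, labels, 2)

# Build per-sample rows from the chosen genes and classify a new sample.
rows = [[genes[j][i] for j in chosen] for i in range(len(labels))]
prediction = knn_predict(rows, labels, [5, 2], k=3)
```

In this toy data the redundancy penalty keeps the second gene out of the selection even though its label correlation matches the first gene's, which is the behavior a combined label/feature correlation score is meant to produce.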

Original language: English (US)
Title of host publication: 19th International Conference on Software Engineering and Data Engineering 2010, SEDE 2010
Pages: 295-299
Number of pages: 5
State: Published - Dec 1 2010
Event: 19th International Conference on Software Engineering and Data Engineering 2010, SEDE 2010 - San Francisco, CA, United States
Duration: Jun 16 2010 → Jun 18 2010

Publication series

Name: 19th International Conference on Software Engineering and Data Engineering 2010, SEDE 2010

