The availability of genome-wide biological network data opens up new possibilities for discovering novel biomarkers and elucidating complex cancer-related mechanisms at the network level. In this paper, we propose a novel module-based feature selection framework that integrates biological network information with gene expression data to identify biomarkers not as individual genes but as functional modules. We also present a large-scale analysis of the ensemble feature selection concept. The method combines features selected over multiple runs, each on a different subsample of the data, to increase the reliability and classification accuracy of the final set of selected features. Results from four breast cancer studies demonstrate that the identified module biomarkers achieve: i) higher classification accuracy in independent validation datasets; ii) better reproducibility than individual gene biomarkers; iii) improved biological interpretability; and iv) enhanced enrichment in cancer-related "disease drivers".
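The ensemble idea described above (select features on many subsamples, then keep those chosen most often) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `score_features` and `ensemble_select`, the Welch-style t-score used for ranking, and the parameters `n_runs`, `frac`, and `top_k` are all assumptions made for illustration. The columns of `X` could hold either individual gene expression values or per-module activity scores; how module scores are computed is not specified in the abstract and is left out here.

```python
import numpy as np

def score_features(X, y):
    """Absolute Welch-style t-score per column, for binary labels y in {0, 1}.

    Assumption: a simple filter score stands in for whatever base selector
    is actually used; any per-feature ranking could be substituted.
    """
    a, b = X[y == 0], X[y == 1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(diff) / (se + 1e-12)

def ensemble_select(X, y, n_runs=50, frac=0.8, top_k=10, seed=0):
    """Rank features by how often they land in the top_k across subsampled runs.

    Returns (indices of the top_k most frequently selected features,
             selection frequency of every feature).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    counts = np.zeros(n_features, dtype=int)
    size = min(n_samples, max(4, int(frac * n_samples)))
    for _ in range(n_runs):
        # Draw a subsample of the samples without replacement.
        idx = rng.choice(n_samples, size=size, replace=False)
        ys = y[idx]
        # Skip degenerate draws where a class has fewer than two samples.
        if min((ys == 0).sum(), (ys == 1).sum()) < 2:
            continue
        scores = score_features(X[idx], ys)
        top = np.argsort(scores)[::-1][:top_k]
        counts[top] += 1
    freq = counts / n_runs
    ranked = np.argsort(-freq, kind="stable")[:top_k]
    return ranked, freq
```

Given an expression (or module activity) matrix `X` of shape `(n_samples, n_features)` and a binary label vector `y`, calling `ensemble_select(X, y, top_k=20)` returns the 20 most stably selected feature indices along with per-feature selection frequencies. The frequencies can also serve as a rough reproducibility measure for each feature.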
|Original language||English (US)|
|Title of host publication||Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010|
|Number of pages||5|
|State||Published - 2010|
|Event||2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010 - Hong Kong, China|
|Duration||Dec 18 2010 → Dec 21 2010|
Bibliographical note
Funding Information:
* Supported by Grant 6287 from the Austrian National Bank, by Grant P8947Med from the Austrian Science Foundation, and by the Province of Tyrol.
† Address for correspondence: Richard Greil, M.D., Laboratory of Molecular Cytology, Department of Internal Medicine, University of Innsbruck, Anichstrasse 35, A-6020 Innsbruck, Austria, Tel.: 0512 504 3343, Fax: 0512 504 3343.