This paper presents a method that uses gene ontologies, together with the paradigm of relational subgroup discovery, to help find descriptions of groups of genes differentially expressed in specific cancers. The descriptions are represented by means of relational features, extracted from publicly available gene ontology information, and are straightforwardly interpretable by medical experts. We applied the proposed method to two known data sets: (i) acute lymphoblastic leukemia (ALL) vs. acute myeloid leukemia (AML) and (ii) classification of fourteen types of cancer. A significant number of the discovered groups of genes had a description, confirmed by the medical expert, that highlighted the underlying biological process responsible for distinguishing one class from the other classes. We view our methodology not just as a prototypical example of applying more sophisticated machine learning algorithms to gene expression analysis, but also as a motivation for developing increasingly sophisticated functional annotations and ontologies that can be processed by such learning algorithms.
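To make the idea of describing a group of differentially expressed genes by ontology-derived features more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not name a quality measure, so weighted relative accuracy (WRAcc), a measure commonly used in subgroup discovery, is assumed here. The gene identifiers, annotation assignments, and labels are invented toy data, and the features are plain conjunctions of annotation terms rather than the richer relational features the paper refers to.

```python
import itertools

# Hypothetical toy data: gene -> set of ontology annotation terms.
# These assignments are invented for illustration only.
ANNOTATIONS = {
    "g1": {"immune response", "membrane"},
    "g2": {"immune response", "nucleus"},
    "g3": {"immune response", "membrane"},
    "g4": {"cell cycle", "nucleus"},
    "g5": {"cell cycle", "membrane"},
    "g6": {"metabolism", "nucleus"},
}

# Hypothetical labels: True means the gene is differentially expressed
# in the target class.
DIFFERENTIAL = {
    "g1": True, "g2": True, "g3": True,
    "g4": False, "g5": False, "g6": True,
}


def wracc(terms, annotations, labels):
    """Weighted relative accuracy of the subgroup of genes annotated
    with every term in `terms` (assumed quality measure)."""
    n = len(annotations)
    covered = [g for g, gene_terms in annotations.items() if terms <= gene_terms]
    if not covered:
        return 0.0
    overall_rate = sum(labels.values()) / n
    covered_rate = sum(labels[g] for g in covered) / len(covered)
    # Coverage times the gain in positive rate over the whole population.
    return (len(covered) / n) * (covered_rate - overall_rate)


def rank_subgroups(annotations, labels, max_terms=2):
    """Score every conjunction of up to `max_terms` annotation terms
    and return (score, terms) pairs, best first."""
    all_terms = sorted(set().union(*annotations.values()))
    scored = []
    for k in range(1, max_terms + 1):
        for combo in itertools.combinations(all_terms, k):
            scored.append((wracc(frozenset(combo), annotations, labels), combo))
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    for score, terms in rank_subgroups(ANNOTATIONS, DIFFERENTIAL)[:3]:
        print(f"{score:+.3f}  {' AND '.join(terms)}")
```

In this toy setting the highest-scoring description is a single annotation term shared by three differentially expressed genes, which is the kind of human-readable group description the abstract has in mind.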
|Original language||English (US)|
|Title of host publication||STAIRS 2006 - Proceedings of the 3rd Starting AI Researchers' Symposium|
|Editors||Loris Penserini, Pavlos Peppas, Anna Perini|
|Publisher||IOS Press BV|
|Number of pages||12|
|ISBN (Electronic)||1586036459, 9781586036454|
|State||Published - 2006|
|Event||3rd European Starting AI Researchers Symposium, STAIRS 2006 - Trento, Italy|
Duration: May 23 2006 → …
|Name||Frontiers in Artificial Intelligence and Applications|
|Conference||3rd European Starting AI Researchers Symposium, STAIRS 2006|
|Period||5/23/06 → …|
Bibliographical note
Funding Information:
The research of I.T. and N.L. is supported by the Slovenian Ministry of Higher Education, Science and Technology. F.Z. is supported by the Czech Academy of Sciences through the project KJB201210501 Logic Based Machine Learning for Analysis of Genomic Data.
©2006 The authors. All rights reserved.
Keywords
- Inductive logic programming
- Learning from structured data
- Learning in bioinformatics
- Relational learning
- Scientific discovery