TY - GEN
T1 - Summarizing gene-expression-based classifiers by meta-mining comprehensible relational patterns
AU - Železný, Filip
AU - Štěpánková, Olga
AU - Tolar, Jakub
AU - Lavrač, Nada
PY - 2006/12/1
Y1 - 2006/12/1
N2 - We propose a methodology for predictive classification from gene expression data, able to combine the robustness of high-dimensional statistical classification methods with the comprehensibility and interpretability of simple logic-based models. We first construct a robust classifier combining contributions of a large number of gene expression values, and then (meta)-mine the classifier for compact summarizations of subgroups among genes associated with a given class therein. The subgroups are described by means of relational logic features extracted from publicly available gene ontology information. The curse of dimensionality pertaining to the gene-expression-based classification problem due to the large number of attributes (genes) is turned into an advantage in the secondary, meta-mining task as here the original attributes become learning examples. We cross-validate the proposed method on two classification problems: (i) acute lymphoblastic leukemia (ALL) vs. acute myeloid leukemia (AML), (ii) seven subclasses of ALL.
KW - Gene expression microarrays
KW - Gene ontology
KW - Relational data mining
UR - http://www.scopus.com/inward/record.url?scp=33847176643&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33847176643&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33847176643
SN - 0889865787
SN - 9780889865785
T3 - Proceedings of the Fourth IASTED International Conference on Biomedical Engineering
SP - 19
EP - 24
BT - Proceedings of the Fourth IASTED International Conference on Biomedical Engineering
T2 - 4th IASTED International Conference on Biomedical Engineering
Y2 - 15 February 2006 through 17 February 2006
ER -