Summarizing gene-expression-based classifiers by meta-mining comprehensible relational patterns

Filip Železný, Olga Štěpánková, Jakub Tolar, Nada Lavrač

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a methodology for predictive classification from gene expression data, able to combine the robustness of high-dimensional statistical classification methods with the comprehensibility and interpretability of simple logic-based models. We first construct a robust classifier combining contributions of a large number of gene expression values, and then (meta)-mine the classifier for compact summarizations of subgroups among genes associated with a given class therein. The subgroups are described by means of relational logic features extracted from publicly available gene ontology information. The curse of dimensionality pertaining to the gene expression based classification problem due to the large number of attributes (genes) is turned into an advantage in the secondary, meta-mining task as here the original attributes become learning examples. We cross-validate the proposed method on two classification problems: (i) acute lymphoblastic leukemia (ALL) vs. acute myeloid leukemia (AML), (ii) seven subclasses of ALL.
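The two-stage idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the authors' implementation: the abstract does not name the classifier, so a simple signed weighted-voting scheme stands in for it; the relational logic features are reduced to plain annotation-term membership; and the gene names, term names, and expression values are made up. The point it shows is the change of roles: in stage 2 the genes, which were attributes in stage 1, become the learning examples.

```python
from statistics import mean, pstdev

def fit_weights(expr, labels, pos):
    """Stage 1 (assumed classifier): one signed weight and midpoint per gene."""
    model = {}
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == pos]
        b = [v for v, y in zip(values, labels) if y != pos]
        spread = pstdev(a) + pstdev(b) or 1.0  # avoid division by zero
        model[gene] = ((mean(a) - mean(b)) / spread, (mean(a) + mean(b)) / 2)
    return model

def classify(sample, model, pos, neg):
    """Every gene contributes a weighted vote; the sign of the sum decides."""
    vote = sum(w * (sample[g] - mid) for g, (w, mid) in model.items())
    return pos if vote > 0 else neg

def summarize(model, annotations, min_support=2):
    """Stage 2: genes are the examples; rank terms by precision for the positive class."""
    positive = {g for g, (w, _) in model.items() if w > 0}
    terms = {t for ts in annotations.values() for t in ts}
    rules = []
    for term in terms:
        covered = {g for g, ts in annotations.items() if term in ts and g in model}
        if len(covered) >= min_support:
            rules.append((term, len(covered), len(covered & positive) / len(covered)))
    # Best precision first, then larger support, then term name for a stable order.
    return sorted(rules, key=lambda r: (-r[2], -r[1], r[0]))

# Hypothetical toy data: 4 genes measured on 4 samples.
expr = {
    "g1": [5.0, 6.0, 1.0, 2.0],
    "g2": [4.0, 5.0, 1.0, 1.5],
    "g3": [1.0, 2.0, 6.0, 5.0],
    "g4": [1.5, 1.0, 5.0, 4.0],
}
labels = ["ALL", "ALL", "AML", "AML"]
# Hypothetical annotation terms standing in for gene ontology features.
annotations = {
    "g1": {"term_x"},
    "g2": {"term_x", "term_z"},
    "g3": {"term_y", "term_z"},
    "g4": {"term_y"},
}

model = fit_weights(expr, labels, pos="ALL")
rules = summarize(model, annotations)
prediction = classify({"g1": 5.5, "g2": 4.0, "g3": 1.0, "g4": 1.0}, model, "ALL", "AML")
```

With thousands of genes, stage 2 has thousands of examples to learn from, which is the sense in which the abstract says the large number of attributes turns into an advantage.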

Original language: English (US)
Title of host publication: Proceedings of the Fourth IASTED International Conference on Biomedical Engineering
Pages: 19-24
Number of pages: 6
Volume: 2006
State: Published - Dec 1 2006
Event: 4th IASTED International Conference on Biomedical Engineering - Innsbruck, Austria
Duration: Feb 15 2006 - Feb 17 2006

Other

Other: 4th IASTED International Conference on Biomedical Engineering
Country: Austria
City: Innsbruck
Period: 2/15/06 - 2/17/06

Fingerprint

Gene expression
Classifiers
Genes
Ontology

Keywords

  • Gene expression microarrays
  • Gene ontology
  • Relational data mining

Cite this

Železný, F., Štěpánková, O., Tolar, J., & Lavrač, N. (2006). Summarizing gene-expression-based classifiers by meta-mining comprehensible relational patterns. In Proceedings of the Fourth IASTED International Conference on Biomedical Engineering (Vol. 2006, pp. 19-24).

@inproceedings{af7f596173f4457c99395e626083e724,
title = "Summarizing gene-expression-based classifiers by meta-mining comprehensible relational patterns",
abstract = "We propose a methodology for predictive classification from gene expression data, able to combine the robustness of high-dimensional statistical classification methods with the comprehensibility and interpretability of simple logic-based models. We first construct a robust classifier combining contributions of a large number of gene expression values, and then (meta)-mine the classifier for compact summarizations of subgroups among genes associated with a given class therein. The subgroups are described by means of relational logic features extracted from publicly available gene ontology information. The curse of dimensionality pertaining to the gene expression based classification problem due to the large number of attributes (genes) is turned into an advantage in the secondary, meta-mining task as here the original attributes become learning examples. We cross-validate the proposed method on two classification problems: (i) acute lymphoblastic leukemia (ALL) vs. acute myeloid leukemia (AML), (ii) seven subclasses of ALL.",
keywords = "Gene expression microarrays, Gene ontology, Relational data mining",
author = "Filip Železn{\'y} and Olga Štěp{\'a}nkov{\'a} and Jakub Tolar and Nada Lavrač",
year = "2006",
month = "12",
day = "1",
language = "English (US)",
isbn = "0889865787",
volume = "2006",
pages = "19--24",
booktitle = "Proceedings of the Fourth IASTED International Conference on Biomedical Engineering",

}

TY - GEN

T1 - Summarizing gene-expression-based classifiers by meta-mining comprehensible relational patterns

AU - Železný, Filip

AU - Štěpánková, Olga

AU - Tolar, Jakub

AU - Lavrač, Nada

PY - 2006/12/1

Y1 - 2006/12/1

N2 - We propose a methodology for predictive classification from gene expression data, able to combine the robustness of high-dimensional statistical classification methods with the comprehensibility and interpretability of simple logic-based models. We first construct a robust classifier combining contributions of a large number of gene expression values, and then (meta)-mine the classifier for compact summarizations of subgroups among genes associated with a given class therein. The subgroups are described by means of relational logic features extracted from publicly available gene ontology information. The curse of dimensionality pertaining to the gene expression based classification problem due to the large number of attributes (genes) is turned into an advantage in the secondary, meta-mining task as here the original attributes become learning examples. We cross-validate the proposed method on two classification problems: (i) acute lymphoblastic leukemia (ALL) vs. acute myeloid leukemia (AML), (ii) seven subclasses of ALL.

KW - Gene expression microarrays

KW - Gene ontology

KW - Relational data mining

UR - http://www.scopus.com/inward/record.url?scp=33847176643&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33847176643&partnerID=8YFLogxK

M3 - Conference contribution

SN - 0889865787

SN - 9780889865785

VL - 2006

SP - 19

EP - 24

BT - Proceedings of the Fourth IASTED International Conference on Biomedical Engineering

ER -