Methods for multi-category cancer diagnosis from gene expression data: A comprehensive evaluation to inform decision support system development

Alexander Statnikov, Constantin F. Aliferis, Ioannis Tsamardinos

Research output: Chapter in Book/Report/Conference proceeding › Chapter

6 Scopus citations

Abstract

Cancer diagnosis is a major clinical application area of gene expression microarray technology. We are seeking to develop a system for creating cancer diagnostic models from microarray data. We performed a comprehensive evaluation of several major classification algorithms, gene selection methods, and cross-validation designs using 11 datasets spanning 74 diagnostic categories (41 cancer types and 12 normal tissue types). The multi-category Support Vector Machine techniques by Crammer and Singer, by Weston and Watkins, and one-versus-rest were found to be the best methods; they outperform other learning algorithms such as K-Nearest Neighbors and Neural Networks, often to a remarkable degree. Gene selection techniques are shown to significantly improve classification performance. These results guided the development of a software system that fully automates cancer diagnostic model construction, with quality on par with, or better than, previously published results derived by expert human analysts.

Original language: English (US)
Title of host publication: Studies in Health Technology and Informatics
Pages: 813-817
Number of pages: 5
Volume: 107
Edition: Pt 2
State: Published - 2004

Keywords

  • Artificial Intelligence
  • Computer-Assisted
  • Diagnosis
  • Oligonucleotide Array Sequence Analysis
  • Support Vector Machines

PubMed: MeSH publication types

  • Journal Article
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.
