GEMS: A system for automated cancer diagnosis and biomarker discovery from microarray gene expression data

Alexander Statnikov, Ioannis Tsamardinos, Yerbolat Dosbayev, Constantin F. Aliferis

Research output: Contribution to journal › Article › peer-review

149 Scopus citations


The success of treatment of patients with cancer depends on establishing an accurate diagnosis. To this end, we have built a system called GEMS (gene expression model selector) for the automated development and evaluation of high-quality cancer diagnostic models and biomarker discovery from microarray gene expression data. To determine the best-performing diagnostic methodologies in this domain and equip the system with them, we first conducted a comprehensive evaluation of classification algorithms using 11 cancer microarray datasets. In this paper we present a preliminary evaluation of the system with five new datasets. The performance of the models produced automatically by GEMS is comparable to or better than the results obtained by human analysts. Additionally, we performed a cross-dataset evaluation of the system. This involved using one dataset to build a diagnostic model and to estimate its future performance, then applying this model to a different dataset and evaluating its performance there. We found that models produced by GEMS indeed perform well in independent samples and, furthermore, that the cross-validation performance estimates output by the system closely approximate the error obtained in independent validation. GEMS is freely available for download for non-commercial use from
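The cross-dataset evaluation described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier is a simple stand-in rather than any algorithm GEMS is stated to use, the folds are not stratified, and all function names are hypothetical. It builds a model on dataset A, estimates its performance by cross-validation on A, and then compares that estimate with the accuracy on an independent dataset B.

```python
import random
from statistics import mean


def make_dataset(n_per_class, n_genes, shift, rng):
    """Synthetic 'expression' data (assumption): class 1 is shifted in the first 5 genes."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(5):
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Stand-in classifier: store one mean expression profile per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def accuracy(centroids, X, y):
    return mean(1.0 if predict(centroids, x) == t else 0.0 for x, t in zip(X, y))


def cross_val_accuracy(X, y, k, rng):
    """Plain k-fold cross-validation: the mean of the per-fold accuracies."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(model, [X[i] for i in fold], [y[i] for i in fold]))
    return mean(scores)


rng = random.Random(0)
X_a, y_a = make_dataset(30, 50, 1.5, rng)  # dataset A: model building and estimation
X_b, y_b = make_dataset(30, 50, 1.5, rng)  # dataset B: independent validation

cv_estimate = cross_val_accuracy(X_a, y_a, k=5, rng=rng)  # estimated future performance
final_model = fit_centroids(X_a, y_a)                     # model built on all of A
independent = accuracy(final_model, X_b, y_b)             # performance observed on B

print(f"cross-validation estimate on A: {cv_estimate:.3f}")
print(f"accuracy on independent B:      {independent:.3f}")
```

If the two printed numbers are close, the cross-validation estimate is a reasonable proxy for performance on independent samples, which is the property the abstract reports for GEMS. Real microarray datasets from different studies usually also differ in platform and preprocessing, which this sketch does not model.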

Original language: English (US)
Pages (from-to): 491-503
Number of pages: 13
Journal: International Journal of Medical Informatics
Issue number: 7-8
State: Published - Aug 2005

Bibliographical note

Funding Information:
This research was supported by NIH grants RO1 LM007948-01 and P20 LM 007613-01. We also acknowledge all developers of the systems listed in Table 1 for access to their software. In particular, we would like to acknowledge Partek Inc. for arranging a web-conference with demonstration of their software.


  • Artificial intelligence
  • Computer-assisted
  • Decision support systems
  • Diagnosis
  • Gene expression microarray analysis
  • Neoplasms

