Data clustering in life sciences

Ying Zhao, George Karypis

Research output: Contribution to journal › Review article › peer-review

92 Scopus citations

Abstract

Clustering has a wide range of applications in life sciences and over the years has been used in many areas, ranging from the analysis of clinical information to phylogeny, genomics, and proteomics. The primary goal of this article is to provide an overview of the various issues involved in clustering large biological datasets, describe the merits and underlying assumptions of some of the commonly used clustering approaches, and provide insights on how to cluster datasets arising in various areas within life sciences. We also provide a brief introduction to CLUTO, a general-purpose toolkit for clustering various datasets, with an emphasis on its applications to problems and analysis requirements within life sciences.

Original language: English (US)
Pages (from-to): 55-80
Number of pages: 26
Journal: Molecular Biotechnology
Volume: 31
Issue number: 1
State: Published - Sep 2005

Keywords

  • CLUTO
  • Clustering algorithms
  • Microarray data
  • Similarity between objects
