TY - JOUR
T1 - Data clustering in life sciences
AU - Zhao, Ying
AU - Karypis, George
PY - 2005/9
Y1 - 2005/9
N2 - Clustering has a wide range of applications in life sciences and over the years has been used in many areas, ranging from the analysis of clinical information to phylogeny, genomics, and proteomics. The primary goal of this article is to provide an overview of the various issues involved in clustering large biological datasets, describe the merits and underlying assumptions of some of the commonly used clustering approaches, and provide insights on how to cluster datasets arising in various areas within life sciences. We also provide a brief introduction to CLUTO, a general-purpose toolkit for clustering various datasets, with an emphasis on its applications to problems and analysis requirements within life sciences.
AB - Clustering has a wide range of applications in life sciences and over the years has been used in many areas, ranging from the analysis of clinical information to phylogeny, genomics, and proteomics. The primary goal of this article is to provide an overview of the various issues involved in clustering large biological datasets, describe the merits and underlying assumptions of some of the commonly used clustering approaches, and provide insights on how to cluster datasets arising in various areas within life sciences. We also provide a brief introduction to CLUTO, a general-purpose toolkit for clustering various datasets, with an emphasis on its applications to problems and analysis requirements within life sciences.
KW - CLUTO
KW - Clustering algorithms
KW - Microarray data
KW - Similarity between objects
UR - https://www.scopus.com/pages/publications/23944524534
UR - https://www.scopus.com/inward/citedby.url?scp=23944524534&partnerID=8YFLogxK
U2 - 10.1385/mb:31:1:055
DO - 10.1385/mb:31:1:055
M3 - Review article
C2 - 16118415
AN - SCOPUS:23944524534
SN - 1073-6085
VL - 31
SP - 55
EP - 80
JO - Molecular Biotechnology
JF - Molecular Biotechnology
IS - 1
ER -