Abstract
Model-based clustering is a popular technique that relies on finite mixture models, which have proved effective for modeling heterogeneity in data. The underlying idea is to model each data group by a particular mixture component. This correspondence between mixture components and clusters gives groups an attractive interpretation: each cluster is assumed to be a sample from the corresponding distribution. In practice, however, the researcher has to account for many issues. The area of model-based clustering is dynamic and developing rapidly, with many questions yet to be answered. In this paper, we review and discuss the latest developments in model-based clustering, including semi-supervised clustering, non-parametric mixture modeling, choice of initialization strategies, merging mixture components for clustering, handling spurious solutions, and assessing variability of obtained partitions. We also demonstrate the utility of model-based clustering by considering several challenging applications to real-life problems.
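As a brief illustration of the idea that each group corresponds to one mixture component, the standard finite mixture formulation can be sketched as follows. The notation ($K$, $\pi_k$, $f_k$, $\theta_k$, $\tau_{ik}$, $\hat{z}_i$) is the conventional textbook one and is an assumption here; it is not taken from the abstract itself.

```latex
% Finite mixture density with K components (assumed notation)
f(x \mid \Theta) = \sum_{k=1}^{K} \pi_k \, f_k(x \mid \theta_k),
\qquad \pi_k > 0, \qquad \sum_{k=1}^{K} \pi_k = 1 .

% Posterior probability that observation x_i arose from component k
\tau_{ik} = \frac{\pi_k \, f_k(x_i \mid \theta_k)}
                 {\sum_{j=1}^{K} \pi_j \, f_j(x_i \mid \theta_j)} .

% Cluster assignment by the maximum a posteriori rule
\hat{z}_i = \arg\max_{k \in \{1, \dots, K\}} \tau_{ik} .
```

Under this reading, observation $x_i$ is placed in the cluster whose component has the highest posterior probability, which is one common way the component-to-cluster correspondence is made operational.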
Original language | English (US) |
---|---|
Title of host publication | Partitional Clustering Algorithms |
Publisher | Springer International Publishing |
Pages | 1-39 |
Number of pages | 39 |
ISBN (Electronic) | 9783319092591 |
ISBN (Print) | 9783319092584 |
State | Published - Jan 1 2015 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © Springer International Publishing Switzerland 2015.
Keywords
- Finite mixture model
- Initialization strategy
- Merging mixture components
- Model-based clustering
- Semi-supervised clustering
- Spurious solutions
- Variable selection