Abstract
In many applications of clustering, solutions that are balanced, i.e., where the clusters obtained are of comparable sizes, are preferred. This chapter describes several approaches to obtaining balanced clustering results that also scale well to large data sets. First, we describe a general scalable framework for obtaining balanced clustering that first clusters only a small subset of the data and then efficiently allocates the rest of the data to these initial clusters while simultaneously refining the clustering. Next, we discuss how frequency sensitive competitive learning can be used for balanced clustering in both batch and on-line scenarios, and illustrate the mechanism with a case study of clustering directional data such as text documents. Finally, we briefly outline balanced clustering based on other methods such as graph partitioning and mixture modeling.
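The abstract does not spell out the chapter's algorithms, so the following is only a minimal sketch of the generic on-line frequency sensitive competitive learning idea: each center's distance to a point is scaled by how often that center has already won, which discourages any one cluster from absorbing most of the data. It assumes squared Euclidean distance rather than the directional (cosine-based) setting of the chapter's case study, and the names `fscl`, `lr`, and `epochs` are hypothetical choices made for illustration.

```python
import random


def fscl(points, k, epochs=5, lr=0.05, seed=0):
    """Illustrative on-line frequency sensitive competitive learning.

    points: list of equal-length numeric tuples.
    Returns (centers, counts), where counts[i] is 1 plus the number of
    times center i won a point.
    """
    rng = random.Random(seed)
    # Assumption: initialize centers from k randomly sampled points.
    centers = [list(p) for p in rng.sample(points, k)]
    # Start counts at 1 so the scaled distance is never forced to zero.
    counts = [1] * k

    def scaled_distance(x, i):
        # Squared Euclidean distance multiplied by the center's win count.
        return counts[i] * sum((a - c) ** 2 for a, c in zip(x, centers[i]))

    for _ in range(epochs):
        order = points[:]
        rng.shuffle(order)
        for x in order:
            # Winner is the center with the smallest count-scaled distance.
            j = min(range(k), key=lambda i: scaled_distance(x, i))
            # Move the winning center a small step toward the point.
            centers[j] = [c + lr * (a - c) for a, c in zip(x, centers[j])]
            counts[j] += 1
    return centers, counts
```

Frequently winning centers are penalized through `counts`, so points near a crowded center tend to be picked up by less used centers instead. This only encourages comparable cluster sizes and does not guarantee them.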
Original language | English (US) |
---|---|
Title of host publication | Constrained Clustering |
Subtitle of host publication | Advances in Algorithms, Theory, and Applications |
Publisher | CRC Press |
Pages | 171-200 |
Number of pages | 30 |
ISBN (Electronic) | 9781584889977 |
ISBN (Print) | 9781584889960 |
State | Published - Jan 1 2008 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2008, CRC Press. All rights reserved.