Clustering with balancing constraints

Arindam Banerjee, Joydeep Ghosh

Research output: Chapter in Book/Report/Conference proceedingChapter

3 Scopus citations

Abstract

Abstract In many applications of clustering, solutions that are balanced, i.e., where the clusters obtained are of comparable sizes, are preferred. This chapter describes several approaches to obtaining balanced clustering results that also scale well to large data sets. First, we describe a general scalable framework for obtaining balanced clustering that first clusters only a small subset of the data and then efficiently allocates the rest of the data to these initial clusters while simultaneously refining the clustering. Next, we discuss how frequency sensitive competitive learning can be used for balanced clustering in both batch and on-line scenarios, and illustrate the mechanism with a case study of clustering directional data such as text documents. Finally, we briefly outline balanced clustering based on other methods such as graph partitioning and mixture modeling.

Original languageEnglish (US)
Title of host publicationConstrained Clustering
Subtitle of host publicationAdvances in Algorithms, Theory, and Applications
PublisherCRC Press
Pages171-200
Number of pages30
ISBN (Electronic)9781584889977
ISBN (Print)9781584889960
StatePublished - Jan 1 2008

Fingerprint Dive into the research topics of 'Clustering with balancing constraints'. Together they form a unique fingerprint.

Cite this