Topic-driven clustering for document datasets

Ying Zhao, George Karypis

Research output: Contribution to conference › Paper › peer-review

32 Scopus citations

Abstract

In this paper, we define the problem of topic-driven clustering, which organizes a document collection according to a given set of topics. We propose three topic-driven schemes that consider the similarity between documents and topics and the relationship among documents themselves simultaneously. We present a comprehensive experimental evaluation of the proposed topic-driven schemes on five datasets. Our experimental results show that the proposed topic-driven schemes are efficient and effective with topic prototypes of different levels of specificity.

Original language: English (US)
Pages: 358-369
Number of pages: 12
State: Published - Dec 1 2005
Event: 5th SIAM International Conference on Data Mining, SDM 2005 - Newport Beach, CA, United States
Duration: Apr 21 2005 to Apr 23 2005

