Emerging topic detection using dictionary learning

Shiva Prasad Kasiviswanathan, Prem Melville, Arindam Banerjee, Vikas Sindhwani

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

97 Scopus citations


Streaming user-generated content in the form of blogs, microblogs, forums, and multimedia sharing sites provides a rich source of data from which invaluable information and insights may be gleaned. Given the vast volume of such social media data being continually generated, one of the challenges is to automatically tease apart the emerging topics of discussion from the constant background chatter. Such emerging topics can be identified by the appearance of multiple posts on a unique subject matter that is distinct from previous online discourse. We address the problem of identifying emerging topics through the use of dictionary learning. We propose a two-stage approach: detection of novel user-generated content, followed by clustering of that content. We derive a scalable approach by using the alternating directions method to solve the resulting optimization problems. Empirical results show that our proposed approach is more effective than several baselines in detecting emerging topics in traditional news story and newsgroup data. We also demonstrate the practical application to social media analysis through a study of streaming data from Twitter.
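The detection stage described in the abstract can be illustrated with a small sketch. This is not the authors' formulation, which this record does not state; it is a minimal example assuming a fixed dictionary `D`, a standard lasso objective for the sparse code, and the textbook ADMM updates for that objective. The function names, the toy data, and the parameters `lam`, `rho`, and `n_iter` are all hypothetical. A document that the dictionary reconstructs poorly gets a high l1 residual, which is the intuition behind treating it as novel.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_code_admm(D, y, lam=0.01, rho=1.0, n_iter=200):
    """Approximately solve min_x 0.5*||D x - y||_2^2 + lam*||x||_1 with ADMM.

    Assumption: a lasso objective is used here for simplicity; the paper's
    own objective may differ.
    """
    k = D.shape[1]
    z = np.zeros(k)
    u = np.zeros(k)
    # The x-update solves a linear system with the same matrix each iteration.
    A = D.T @ D + rho * np.eye(k)
    Dty = D.T @ y
    for _ in range(n_iter):
        x = np.linalg.solve(A, Dty + rho * (z - u))
        z = soft_threshold(x + u, lam / rho)
        u = u + x - z
    return z


def novelty_score(D, y, **kwargs):
    """l1 norm of the residual after sparse coding y against D."""
    x = sparse_code_admm(D, y, **kwargs)
    return float(np.abs(y - D @ x).sum())


# Toy dictionary: two "topic" atoms over a four-term vocabulary (made-up data).
D = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0],
              [0.0, 0.0]])
familiar = np.array([1.0, 0.5, 0.0, 0.0])  # uses only terms the atoms cover
unseen = np.array([0.0, 0.0, 1.0, 1.0])    # uses only terms no atom covers

score_familiar = novelty_score(D, familiar)
score_unseen = novelty_score(D, unseen)
```

In a pipeline along these lines, documents whose score exceeds some chosen threshold would be flagged as novel and handed to a clustering step, so that a topic is reported only when several flagged documents group together.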

Original language: English (US)
Title of host publication: CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
Number of pages: 10
State: Published - 2011
Event: 20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom
Duration: Oct 24 2011 to Oct 28 2011

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings


Other: 20th ACM Conference on Information and Knowledge Management, CIKM'11
Country/Territory: United Kingdom


  • clustering
  • dictionary learning
  • l1 reconstruction

