Automatic cluster stopping with criterion functions and the gap statistic

Ted Pedersen, Anagha Kulkarni

Research output: Contribution to conference > Paper > peer-review

28 Scopus citations

Abstract

SenseClusters is a freely available system that clusters similar contexts. It can be applied to a wide range of problems, although here we focus on word sense and name discrimination. It supports several different measures for automatically determining the number of clusters into which a collection of contexts should be grouped. These can be used to discover the number of senses in which a word is used in a large corpus of text, or the number of entities that share the same name. Three of the measures are based on clustering criterion functions, and a fourth is based on the Gap Statistic.

Original language: English (US)
Pages: 276-279
Number of pages: 4
State: Published - 2006
Event: 2006 Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, HLT-NAACL 2006 - New York City, United States
Duration: Jun 4 2006 to Jun 9 2006

Conference

Conference: 2006 Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, HLT-NAACL 2006
Country/Territory: United States
City: New York City
Period: 6/4/06 to 6/9/06

Bibliographical note

Publisher Copyright:
© 2006 Association for Computational Linguistics.
