Word sense discrimination by clustering contexts in vector and similarity spaces

Amruta Purandare, Ted Pedersen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

100 Scopus citations

Abstract

This paper systematically compares unsupervised word sense discrimination techniques that cluster instances of a target word that occur in raw text using both vector and similarity spaces. The context of each instance is represented as a vector in a high-dimensional feature space. Discrimination is achieved by clustering these context vectors directly in vector space and also by finding pairwise similarities among the vectors and then clustering in similarity space. We employ two different representations of the context in which a target word occurs. First-order context vectors represent the context of each instance of a target word as a vector of features that occur in that context. Second-order context vectors are an indirect representation of the context based on the average of vectors that represent the words that occur in the context. We evaluate the discriminated clusters by carrying out experiments using sense-tagged instances of 24 SENSEVAL-2 words and the well-known Line, Hard and Serve sense-tagged corpora.
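The two context representations and the move from vector space to similarity space can be illustrated with a minimal sketch. This is not the authors' implementation: the toy contexts, the feature list, the two-dimensional word vectors, the binary feature weighting, and the choice of cosine as the pairwise similarity are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

def first_order_vectors(contexts, features):
    """One row per context: 1.0 if the feature occurs in that context (binary weighting is an assumption)."""
    index = {f: j for j, f in enumerate(features)}
    X = np.zeros((len(contexts), len(features)))
    for i, tokens in enumerate(contexts):
        for tok in tokens:
            if tok in index:
                X[i, index[tok]] = 1.0
    return X

def second_order_vectors(contexts, word_vectors, dim):
    """One row per context: the average of the vectors of the context words that have one."""
    X = np.zeros((len(contexts), dim))
    for i, tokens in enumerate(contexts):
        known = [word_vectors[t] for t in tokens if t in word_vectors]
        if known:
            X[i] = np.mean(known, axis=0)
    return X

def cosine_similarity_matrix(X):
    """Pairwise cosine similarities between rows; all-zero rows are left as zero vectors."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    U = X / norms
    return U @ U.T

# Toy contexts of the target word "line" (invented data, not taken from the paper's corpora).
contexts = [
    ["phone", "call", "busy"],
    ["telephone", "call", "operator"],
    ["product", "sell", "company"],
]
features = ["phone", "telephone", "call", "busy",
            "operator", "product", "sell", "company"]

# Hypothetical word vectors standing in for rows of a word co-occurrence matrix.
word_vectors = {
    "phone": np.array([1.0, 0.0]), "telephone": np.array([1.0, 0.0]),
    "call": np.array([0.8, 0.2]), "busy": np.array([0.9, 0.1]),
    "operator": np.array([0.9, 0.1]), "product": np.array([0.0, 1.0]),
    "sell": np.array([0.1, 0.9]), "company": np.array([0.2, 0.8]),
}

X1 = first_order_vectors(contexts, features)          # vector space, first order
X2 = second_order_vectors(contexts, word_vectors, 2)  # vector space, second order
S1 = cosine_similarity_matrix(X1)                     # similarity space, first order
S2 = cosine_similarity_matrix(X2)                     # similarity space, second order
```

A clustering algorithm could then be run either on the rows of `X1` or `X2` (vector space) or on the matrices `S1` or `S2` (similarity space). In this toy data the first two contexts share only the feature "call" in the first-order representation, while their second-order vectors coincide because the distinct words have similar word vectors.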

Original language: English (US)
Title of host publication: Proceedings of the 8th Conference on Computational Natural Language Learning, CoNLL 2004 - Held in cooperation with HLT-NAACL 2004
Editors: Hwee Tou Ng, Ellen Riloff
Publisher: Association for Computational Linguistics (ACL)
ISBN (Electronic): 1932432302, 9781932432305
State: Published - 2004
Event: 8th Conference on Computational Natural Language Learning, CoNLL 2004 - Boston, United States
Duration: May 6 2004 to May 7 2004

Conference

Conference: 8th Conference on Computational Natural Language Learning, CoNLL 2004
Country/Territory: United States
City: Boston
Period: 5/6/04 to 5/7/04

Bibliographical note

Funding Information:
This research is supported by a National Science Foundation Faculty Early CAREER Development Award (#0092784).

Publisher Copyright:
© Proceedings of the 8th Conference on Computational Natural Language Learning, CoNLL 2004 - Held in cooperation with HLT-NAACL 2004. All rights reserved.
