Abstract
This paper systematically compares unsupervised word sense discrimination techniques that cluster instances of a target word occurring in raw text, using both vector and similarity spaces. The context of each instance is represented as a vector in a high-dimensional feature space. Discrimination is achieved by clustering these context vectors directly in vector space, and also by finding pairwise similarities among the vectors and then clustering in similarity space. We employ two different representations of the context in which a target word occurs. First-order context vectors represent the context of each instance of a target word as a vector of features that occur in that context. Second-order context vectors are an indirect representation of the context, based on the average of the vectors that represent the words occurring in the context. We evaluate the discriminated clusters by carrying out experiments using sense-tagged instances of 24 SENSEVAL-2 words and the well-known Line, Hard and Serve sense-tagged corpora.
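The two context representations and the move from vector space to similarity space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy contexts, the binary features, the co-occurrence-based word vectors, the use of cosine as the similarity measure, and all function names. The clustering step itself is omitted.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy contexts of one target word (not data from the paper).
contexts = [
    ["phone", "call", "busy"],
    ["telephone", "call", "dial"],
    ["fishing", "hook", "rod"],
    ["rod", "bait", "fishing"],
]

# Feature space: every word seen in any context (an assumed, simplistic choice).
vocab = sorted({w for ctx in contexts for w in ctx})


def first_order(ctx):
    """Binary vector marking which features occur directly in the context."""
    present = set(ctx)
    return [1.0 if w in present else 0.0 for w in vocab]


def word_vectors(all_contexts):
    """One co-occurrence vector per word: counts of words sharing a context with it."""
    counts = {w: Counter() for w in vocab}
    for ctx in all_contexts:
        for w in ctx:
            for other in ctx:
                if other != w:
                    counts[w][other] += 1
    return {w: [float(counts[w][v]) for v in vocab] for w in vocab}


def second_order(ctx, wv):
    """Average of the word vectors of the words that occur in the context."""
    vecs = [wv[w] for w in ctx]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(vectors):
    """Pairwise similarities: the input for clustering in similarity space."""
    return [[cosine(a, b) for b in vectors] for a in vectors]


wv = word_vectors(contexts)
fo = [first_order(c) for c in contexts]
so = [second_order(c, wv) for c in contexts]
sim_fo = similarity_matrix(fo)
sim_so = similarity_matrix(so)
```

In this toy data the first two contexts share only the feature "call", so their first-order cosine is 1/3. Their second-order cosine is higher (12/14, about 0.86), because "phone" and "telephone" both co-occur with "call" and so have overlapping word vectors. Contexts from the two different groups have similarity 0 under both representations.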
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the 8th Conference on Computational Natural Language Learning, CoNLL 2004 - Held in cooperation with HLT-NAACL 2004 |
Editors | Hwee Tou Ng, Ellen Riloff |
Publisher | Association for Computational Linguistics (ACL) |
ISBN (Electronic) | 1932432302, 9781932432305 |
State | Published - 2004 |
Event | 8th Conference on Computational Natural Language Learning, CoNLL 2004 - Boston, United States. Duration: May 6 2004 → May 7 2004 |
Publication series
Name | Proceedings of the 8th Conference on Computational Natural Language Learning, CoNLL 2004 - Held in cooperation with HLT-NAACL 2004 |
---|
Conference
Conference | 8th Conference on Computational Natural Language Learning, CoNLL 2004 |
---|---|
Country/Territory | United States |
City | Boston |
Period | 5/6/04 → 5/7/04 |
Bibliographical note
Funding Information: This research is supported by a National Science Foundation Faculty Early CAREER Development Award (#0092784).
Publisher Copyright:
© Proceedings of the 8th Conference on Computational Natural Language Learning, CoNLL 2004 - Held in cooperation with HLT-NAACL 2004. All rights reserved.