Distinguishing word senses in untagged text

Ted Pedersen, Rebecca Bruce

Research output: Contribution to conference › Paper › peer-review

89 Scopus citations

Abstract

This paper describes an experimental comparison of three unsupervised learning algorithms that distinguish the sense of an ambiguous word in untagged text. The methods described in this paper, McQuitty's similarity analysis, Ward's minimum-variance method, and the EM algorithm, assign each instance of an ambiguous word to a known sense definition based solely on the values of automatically identifiable features in text. These methods and feature sets are found to be more successful in disambiguating nouns rather than adjectives or verbs. Overall, the most accurate of these procedures is McQuitty's similarity analysis in combination with a high dimensional feature set.

Original language: English (US)
Pages: 197-207
Number of pages: 11
State: Published - 1997
Externally published: Yes
Event: 2nd Conference on Empirical Methods in Natural Language Processing, EMNLP 1997 - Providence, United States
Duration: Aug 1 1997 to Aug 2 1997

Conference

Conference: 2nd Conference on Empirical Methods in Natural Language Processing, EMNLP 1997
Country/Territory: United States
City: Providence
Period: 8/1/97 to 8/2/97

Bibliographical note

Publisher Copyright:
© Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing, EMNLP 1997. All rights reserved.
