This paper describes a methodology for supervised word sense disambiguation that relies on standard machine learning algorithms to induce classifiers from sense-tagged training examples, where the context in which ambiguous words occur is represented by simple lexical features. This constitutes a baseline approach, since it produces classifiers based on easy-to-identify features that result in accurate disambiguation across a variety of languages. This paper reviews several systems based on this methodology that participated in the Spanish and English lexical sample tasks of the Senseval-2 comparative exercise among word sense disambiguation systems. These systems fared much better than standard baselines and were within seven to ten percentage points of accuracy of the most highly ranked systems.
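The general idea of inducing a classifier from sense-tagged examples with simple lexical features can be illustrated with a minimal sketch. The abstract does not say which learning algorithms or feature sets the reviewed systems used, so everything here is an assumption made for illustration: the choice of Naive Bayes, the unigram bag-of-words features, the smoothing constant, the names `context_features` and `NaiveBayesWSD`, and the toy training sentences.

```python
import math
from collections import Counter, defaultdict


def context_features(tokens, target):
    """Simple lexical features: lowercased context words, minus the target word."""
    return [t.lower() for t in tokens if t.lower() != target]


class NaiveBayesWSD:
    """Hypothetical multinomial Naive Bayes sense classifier for one target word."""

    def __init__(self, target, alpha=1.0):
        self.target = target
        self.alpha = alpha  # add-alpha smoothing constant (assumed value)
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (tokens, sense_label) pairs
        for tokens, sense in examples:
            self.sense_counts[sense] += 1
            for w in context_features(tokens, self.target):
                self.word_counts[sense][w] += 1
                self.vocab.add(w)

    def predict(self, tokens):
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, float("-inf")
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)  # log prior of the sense
            denom = sum(self.word_counts[sense].values()) + self.alpha * vocab_size
            for w in context_features(tokens, self.target):
                if w in self.vocab:  # words unseen in training are ignored
                    count = self.word_counts[sense][w]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Toy sense-tagged data, invented for this sketch (not from Senseval-2).
TRAIN = [
    ("he sat on the bank of the river".split(), "shore"),
    ("the river bank was muddy".split(), "shore"),
    ("she deposited money at the bank".split(), "finance"),
    ("the bank approved the loan".split(), "finance"),
]

clf = NaiveBayesWSD(target="bank")
clf.train(TRAIN)
```

Calling `clf.predict("a loan from the bank".split())` then selects whichever sense makes the observed context words most probable. A real lexical sample system would train one such classifier per ambiguous word on far more data.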
|Title of host publication
|Computational Linguistics and Intelligent Text Processing - 3rd International Conference, CICLing 2002, Proceedings
|Published - 2002
|3rd Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2002 - Mexico City, Mexico
Duration: Feb 17 2002 → Feb 23 2002
|Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Bibliographical note
Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2002.