Improving Name Discrimination: A Language Salad Approach

Ted Pedersen, Anagha Kulkarni, Roxana Angheluta, Zornitsa Kozareva, Thamar Solorio

Research output: Contribution to conference, Paper, peer-review

1 Scopus citation

Abstract

This paper describes a method of discriminating ambiguous names that relies upon features found in corpora of a more abundant language. In particular, we discriminate ambiguous names in Bulgarian, Romanian, and Spanish corpora using information derived from much larger quantities of English data. We also mix together occurrences of the ambiguous name found in English with the occurrences of the name in the language in which we are trying to discriminate. We refer to this as a language salad, and find that it often results in even better performance than when only using English or the language itself as the source of information for discrimination.
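
The abstract does not spell out the feature or clustering details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes bag-of-words features chosen by document frequency, cosine similarity between contexts, and that English-derived features can match target-language contexts through shared tokens such as place names. The function names (`build_vocabulary`, `vectorize`, `cosine`) and the toy contexts are hypothetical and invented purely for illustration; `NAME` stands in for the ambiguous name.

```python
from collections import Counter
from math import sqrt

# Toy contexts, invented for illustration only (not data from the paper).
ENGLISH = [
    "NAME scored a goal in madrid",
    "NAME won the match in madrid",
    "NAME signed a law in washington",
    "NAME vetoed a law in washington",
]
TARGET = [  # Spanish contexts containing the same ambiguous name
    "NAME marcó un gol en madrid",
    "NAME ganó el partido en madrid",
    "NAME firmó una ley en washington",
    "NAME vetó la ley en washington",
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def build_vocabulary(contexts, min_count=2, exclude=()):
    """Keep tokens that occur in at least `min_count` contexts."""
    doc_freq = Counter(tok for c in contexts for tok in set(tokenize(c)))
    return sorted(
        tok for tok, n in doc_freq.items()
        if n >= min_count and tok not in exclude
    )


def vectorize(context, vocabulary):
    """Represent one context as token counts over the chosen vocabulary."""
    counts = Counter(tokenize(context))
    return [counts[tok] for tok in vocabulary]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# "Language salad": select features from English and target contexts mixed together.
salad_vocab = build_vocabulary(ENGLISH + TARGET, exclude={"name"})

# Represent the target-language contexts with the salad features.
target_vectors = [vectorize(c, salad_vocab) for c in TARGET]
```

Under these assumptions, contexts that refer to the same underlying entity should score as more similar than contexts that do not, and any clustering method could then be run over `target_vectors`. Passing only `ENGLISH` to `build_vocabulary` gives the English-only feature source mentioned in the abstract.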

Original language: English (US)
Pages: 25-32
Number of pages: 8
State: Published - 2006
Externally published: Yes
Event: 2006 International Workshop on Cross-Language Knowledge Induction - Trento, Italy
Duration: Apr 3 2006 → …

Conference

Conference: 2006 International Workshop on Cross-Language Knowledge Induction
Country/Territory: Italy
City: Trento
Period: 4/3/06 → …

Bibliographical note

Publisher Copyright:
© 2006 Cross-Language Knowledge Induction Workshop - International Workshop held as part of EACL 2006: 11th Conference of the European Chapter of the Association for Computational Linguistics. All rights reserved.
