How many different "john smiths", and who are they?

Anagha Kulkarni, Ted Pedersen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

In this work we propose three unsupervised measures to automatically identify the number of distinct entities a given ambiguous name refers to in a corpus. We experiment with 22 artificially created name conflations and observe that the measure (PK2) formulated as the ratio of two successive clustering criterion function values outperforms the other two measures. We also describe a method to assign a unique label to each discovered cluster so as to identify the underlying entity that it refers to.
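As an illustration of the PK2 idea described in the abstract, the following is a minimal sketch that computes the ratio of successive clustering criterion function values. The abstract does not give the stopping rule, so the rule used here (stop at the first ratio within a fixed tolerance of 1 and report the previous k) is an assumption made only for this example. The function names, the `tolerance` parameter, and the sample values are likewise hypothetical.

```python
def pk2_scores(crfun_values):
    """Return {k: crfun(k) / crfun(k - 1)} for k >= 2.

    crfun_values[i] is assumed to hold the clustering criterion
    function value obtained with k = i + 1 clusters.
    """
    return {
        k: crfun_values[k - 1] / crfun_values[k - 2]
        for k in range(2, len(crfun_values) + 1)
    }


def pick_k(crfun_values, tolerance=0.05):
    """Pick a cluster count from successive-ratio scores.

    Assumed stopping rule (not taken from the abstract): when the
    ratio for k is within `tolerance` of 1, the k-th cluster added
    little, so k - 1 clusters are reported.
    """
    scores = pk2_scores(crfun_values)
    for k in sorted(scores):
        if abs(scores[k] - 1.0) <= tolerance:
            return k - 1
    # No ratio flattened out: fall back to the largest k tried.
    return len(crfun_values)


if __name__ == "__main__":
    # Made-up criterion values for k = 1..5, for illustration only.
    values = [10.0, 15.0, 18.0, 18.5, 18.6]
    print(pk2_scores(values))
    print(pick_k(values))
```

With these made-up values the ratio flattens between k = 3 and k = 4, so the sketch reports three clusters; real criterion values would come from whatever clustering method is applied to the contexts of the ambiguous name.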

Original language: English (US)
Title of host publication: Proceedings of the 21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06
Pages: 1885-1886
Number of pages: 2
State: Published - Nov 13 2006
Event: 21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06 - Boston, MA, United States
Duration: Jul 16 2006 - Jul 20 2006

Publication series

Name: Proceedings of the National Conference on Artificial Intelligence
Volume: 2

Other

Other: 21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06
Country: United States
City: Boston, MA
Period: 7/16/06 - 7/20/06
