TY - GEN
T1 - Name discrimination and email clustering using unsupervised clustering and labeling of similar contexts
AU - Kulkarni, Anagha
AU - Pedersen, Ted
PY - 2005
Y1 - 2005
N2 - In this paper, we apply an unsupervised word sense discrimination technique based on clustering similar contexts (Purandare and Pedersen, 2004) to the problems of name discrimination and email clustering. Names of people, places, and organizations are not always unique. This can create a problem when we refer to or seek out information about such entities. When this occurs in written text, we show that we can cluster ambiguous names into unique groups by identifying which contexts are similar to each other. It has been previously shown by (Pedersen, Purandare, and Kulkarni, 2005) that this approach can be successfully used for discrimination of names with two-way ambiguity. Here we show that it can be extended to multiway distinctions as well. We adapt the cluster labeling technique introduced by (Kulkarni, 2005) for the multiway distinctions of name discrimination. On the similar lines of contextual similarity, we also observe that email messages can be treated as contexts, and that in clustering them together we are able to group them based on their underlying content rather than the occurrence of specific strings.
AB - In this paper, we apply an unsupervised word sense discrimination technique based on clustering similar contexts (Purandare and Pedersen, 2004) to the problems of name discrimination and email clustering. Names of people, places, and organizations are not always unique. This can create a problem when we refer to or seek out information about such entities. When this occurs in written text, we show that we can cluster ambiguous names into unique groups by identifying which contexts are similar to each other. It has been previously shown by (Pedersen, Purandare, and Kulkarni, 2005) that this approach can be successfully used for discrimination of names with two-way ambiguity. Here we show that it can be extended to multiway distinctions as well. We adapt the cluster labeling technique introduced by (Kulkarni, 2005) for the multiway distinctions of name discrimination. On the similar lines of contextual similarity, we also observe that email messages can be treated as contexts, and that in clustering them together we are able to group them based on their underlying content rather than the occurrence of specific strings.
UR - http://www.scopus.com/inward/record.url?scp=33751035358&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33751035358&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33751035358
SN - 0972741216
SN - 9780972741217
T3 - Proceedings of the 2nd Indian International Conference on Artificial Intelligence, IICAI 2005
SP - 703
EP - 722
BT - Proceedings of the 2nd Indian International Conference on Artificial Intelligence, IICAI 2005
T2 - 2nd Indian International Conference on Artificial Intelligence, IICAI 2005
Y2 - 20 December 2005 through 22 December 2005
ER -