Abstract
Word2Vec’s Skip-gram model is the current state-of-the-art approach for estimating the distributed representations of words. However, it assumes a single vector per word, which is not well suited for representing words that have multiple senses. This work presents LDMI, a new model for estimating distributional representations of words. LDMI relies on the idea that, if a word carries multiple senses, then having a different representation for each of its senses should lead to a lower loss associated with predicting its co-occurring words than when a single vector representation is used for all of its senses. After identifying the multi-sense words, LDMI clusters their occurrences to assign a sense to each occurrence. Experiments on the contextual word similarity task show that LDMI performs better than competing approaches.
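As a rough illustration of the occurrence-clustering step described in the abstract, the sketch below represents each occurrence of a flagged multi-sense word by the average embedding of its context window and then clusters those context vectors with k-means to assign one sense per occurrence. This is only an approximation under simplifying assumptions, not the paper's exact procedure; the inputs (a tokenized corpus, pre-trained skip-gram vectors in `word_vecs`) and the parameters `window` and `n_senses` are hypothetical placeholders.

```python
# Illustrative sketch only; not the paper's exact procedure.
import numpy as np
from sklearn.cluster import KMeans

def context_vector(tokens, i, word_vecs, window=5):
    """Average the embeddings of the words within +/- `window` of position i."""
    ctx = [tokens[j]
           for j in range(max(0, i - window), min(len(tokens), i + window + 1))
           if j != i and tokens[j] in word_vecs]
    if not ctx:
        return None
    return np.mean([word_vecs[w] for w in ctx], axis=0)

def assign_senses(corpus, target, word_vecs, n_senses=2, window=5):
    """Cluster the occurrences of `target` by their context vectors and
    return {(sentence_index, position): sense_label}."""
    occurrences, features = [], []
    for s_idx, tokens in enumerate(corpus):
        for i, tok in enumerate(tokens):
            if tok == target:
                vec = context_vector(tokens, i, word_vecs, window)
                if vec is not None:
                    occurrences.append((s_idx, i))
                    features.append(vec)
    if len(features) < n_senses:
        # Too few occurrences to split into senses; keep a single sense.
        return {pos: 0 for pos in occurrences}
    labels = KMeans(n_clusters=n_senses, n_init=10).fit_predict(np.asarray(features))
    return dict(zip(occurrences, labels.tolist()))
```

In a pipeline along these lines, each occurrence would then be relabeled with its assigned sense (e.g., a distinct token per cluster) and the word representations re-estimated over the relabeled corpus.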
| Original language | English (US) |
| --- | --- |
| Title of host publication | Advances in Knowledge Discovery and Data Mining - 22nd Pacific-Asia Conference, PAKDD 2018, Proceedings |
| Editors | Bao Ho, Dinh Phung, Geoffrey I. Webb, Vincent S. Tseng, Mohadeseh Ganji, Lida Rashidi |
| Publisher | Springer Verlag |
| Pages | 337-349 |
| Number of pages | 13 |
| ISBN (Print) | 9783319930367 |
| DOIs | |
| State | Published - 2018 |
| Event | 22nd Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2018 - Melbourne, Australia. Duration: Jun 3 2018 → Jun 6 2018 |
Publication series

| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| --- | --- |
| Volume | 10938 LNAI |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Other

| Other | 22nd Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2018 |
| --- | --- |
| Country/Territory | Australia |
| City | Melbourne |
| Period | 6/3/18 → 6/6/18 |
Bibliographical note

Publisher Copyright: © 2018, Springer International Publishing AG, part of Springer Nature.