Abstract
Acronyms are increasingly prevalent in biomedical text, and the task of acronym disambiguation is fundamentally important for biomedical natural language processing systems. Several groups have generated sense inventories of acronym long form expansions from the biomedical literature. Long form sense inventories, however, may contain conceptually redundant expansions that negatively affect their quality. Our approach to improving sense inventories consists of mapping long form expansions to concepts in the Unified Medical Language System (UMLS) with subsequent application of a semantic similarity algorithm based upon conceptual overlap. We evaluated this approach on a reference standard developed for ten acronyms. A total of 119 of 155 (78%) long forms mapped to concepts in the UMLS. Our approach identified synonymous long forms with a sensitivity of 70.2% and a positive predictive value of 96.3%. Although further refinements are needed, this study demonstrates the potential value of using automated techniques to merge synonymous biomedical acronym long forms to improve the quality of biomedical acronym sense inventories.
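The abstract does not spell out the similarity algorithm or how the evaluation was scored, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each long form has already been mapped to a set of UMLS concept identifiers, uses Jaccard overlap as a stand-in for "conceptual overlap", and scores merged pairs against a reference standard. The function names, the `0.5` threshold, the pair-level scoring, and the placeholder identifiers (`ID_1`, `ID_2`) are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two concept sets: |a & b| / |a | b| (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def merge_synonymous(long_form_concepts, threshold=0.5):
    """Group long forms whose concept sets overlap at or above `threshold`.

    `long_form_concepts` maps each long form to a set of concept identifiers.
    Long forms with no mapped concepts (empty set) stay in their own group.
    """
    parent = {lf: lf for lf in long_form_concepts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(long_form_concepts, 2):
        if jaccard(long_form_concepts[a], long_form_concepts[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for lf in long_form_concepts:
        groups.setdefault(find(lf), set()).add(lf)
    return list(groups.values())


def sensitivity_and_ppv(predicted_pairs, reference_pairs):
    """Sensitivity = TP / (TP + FN); positive predictive value = TP / (TP + FP)."""
    tp = len(predicted_pairs & reference_pairs)
    sensitivity = tp / len(reference_pairs) if reference_pairs else 0.0
    ppv = tp / len(predicted_pairs) if predicted_pairs else 0.0
    return sensitivity, ppv


# Illustrative input only: the identifiers are placeholders, not real UMLS CUIs.
example = {
    "computed tomography": {"ID_1"},
    "computerized tomography": {"ID_1"},
    "clotting time": {"ID_2"},
    "unmapped long form": set(),
}
groups = merge_synonymous(example)
```

In this sketch, long forms that fail to map to any concept can never be merged, which is one way unmapped long forms could limit sensitivity while leaving positive predictive value high.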
Original language | English (US) |
---|---|
Pages | 46-52 |
Number of pages | 7 |
State | Published - 2010 |
Event | 2nd Louhi Workshop on Text and Data Mining of Health Documents, Louhi 2010 - Los Angeles, United States. Duration: Jun 5 2010 → … |
Conference
Conference | 2nd Louhi Workshop on Text and Data Mining of Health Documents, Louhi 2010 |
---|---|
Country/Territory | United States |
City | Los Angeles |
Period | 6/5/10 → … |
Bibliographical note
Publisher Copyright: © 2010 Association for Computational Linguistics