Acronyms are increasingly prevalent in biomedical text, and the task of acronym disambiguation is fundamentally important for biomedical natural language processing systems. Several groups have generated sense inventories of acronym long form expansions from the biomedical literature. Long form sense inventories, however, may contain conceptually redundant expansions that negatively affect their quality. Our approach to improving sense inventories consists of mapping long form expansions to concepts in the Unified Medical Language System (UMLS) with subsequent application of a semantic similarity algorithm based upon conceptual overlap. We evaluated this approach on a reference standard developed for ten acronyms. A total of 119 of 155 (78%) long forms mapped to concepts in the UMLS. Our approach identified synonymous long forms with a sensitivity of 70.2% and a positive predictive value of 96.3%. Although further refinements are needed, this study demonstrates the potential value of using automated techniques to merge synonymous biomedical acronym long forms to improve the quality of biomedical acronym sense inventories.
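The merging step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the abstract does not specify the similarity algorithm, so Jaccard overlap between concept sets stands in for it, the 0.5 threshold is arbitrary, the concept identifiers (`X1`, `X2`, `X3`) are hypothetical placeholders, not real UMLS identifiers, and the pairwise scoring of sensitivity and positive predictive value is only one plausible way to compare against a reference standard.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two concept sets, from 0.0 to 1.0 (assumed measure)."""
    if not a or not b:
        return 0.0  # an unmapped long form overlaps with nothing
    return len(a & b) / len(a | b)


def merge_long_forms(concepts_by_long_form, threshold=0.5):
    """Group long forms whose concept sets overlap at or above the threshold."""
    parent = {lf: lf for lf in concepts_by_long_form}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(concepts_by_long_form, 2):
        if jaccard(concepts_by_long_form[a], concepts_by_long_form[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for lf in concepts_by_long_form:
        groups.setdefault(find(lf), set()).add(lf)
    return sorted(sorted(g) for g in groups.values())


def synonym_pairs(groups):
    """All unordered pairs of long forms that share a group."""
    return {frozenset(p) for g in groups for p in combinations(g, 2)}


def sensitivity_and_ppv(predicted_groups, reference_groups):
    """Pairwise sensitivity and positive predictive value (assumed scoring)."""
    pred = synonym_pairs(predicted_groups)
    ref = synonym_pairs(reference_groups)
    true_pos = len(pred & ref)
    sensitivity = true_pos / len(ref) if ref else 0.0
    ppv = true_pos / len(pred) if pred else 0.0
    return sensitivity, ppv


# Hypothetical long forms for one acronym, mapped to placeholder concept IDs.
mapped = {
    "rheumatoid arthritis": {"X1"},
    "arthritis, rheumatoid": {"X1"},
    "right atrium": {"X2"},
    "right atrial": {"X2", "X3"},
    "room air": set(),  # no concept found, so it can never be merged
}
merged = merge_long_forms(mapped)
```

In this sketch a long form with no mapped concepts always stays in its own group, which is one way an approach of this kind could miss true synonyms while still merging few non-synonyms.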
|Original language||English (US)|
|Number of pages||7|
|State||Published - 2010|
|Event||2nd Louhi Workshop on Text and Data Mining of Health Documents, Louhi 2010 - Los Angeles, United States (Duration: Jun 5 2010 → …)|
|Conference||2nd Louhi Workshop on Text and Data Mining of Health Documents, Louhi 2010|
|Period||6/5/10 → …|
Bibliographical note
Funding Information:
This work was supported by the University of Minnesota Institute for Health Informatics and Department of Surgery and by the National Library of Medicine (#R01 LM009623-01). We would like to thank Fairview Health Services for ongoing support of this research.
© 2010 Association for Computational Linguistics