Semi-supervised maximum entropy based approach to acronym and abbreviation normalization in medical texts

Research output: Contribution to journal › Conference article › peer-review

Abstract

Text normalization is an important aspect of successful information retrieval from medical documents such as clinical notes, radiology reports and discharge summaries. In the medical domain, a significant part of the general problem of text normalization is abbreviation and acronym disambiguation. Numerous abbreviations are used routinely throughout such texts and knowing their meaning is critical to data retrieval from the document. In this paper I will demonstrate a method of automatically generating training data for Maximum Entropy (ME) modeling of abbreviations and acronyms and will show that using ME modeling is a promising technique for abbreviation and acronym normalization. I report on the results of an experiment involving training a number of ME models used to normalize abbreviations and acronyms on a sample of 10,000 rheumatology notes with ~89% accuracy.

Original language: English (US)
Pages (from-to): 160-167
Number of pages: 8
Journal: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 2002-July
State: Published - 2002
Externally published: Yes
Event: 40th Annual Meeting of the Association for Computational Linguistics, ACL 2002 - Philadelphia, United States
Duration: Jul 7 2002 to Jul 12 2002

Bibliographical note

Publisher Copyright:
© 2002 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All rights reserved.
