Semantic relatedness study using second order co-occurrence vectors computed from biomedical corpora, UMLS and WordNet

Ying Liu, Bridget T. McInnes, Ted Pedersen, Genevieve Melton-Meaux, Serguei Pakhomov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

41 Scopus citations

Abstract

Automated measures of semantic relatedness are important for effectively processing medical data in tasks such as information retrieval and natural language processing. In this paper, we present a context vector approach that can compute the semantic relatedness between any pair of concepts in the Unified Medical Language System (UMLS). Our approach was developed on a corpus of inpatient clinical reports. We use 430 pairs of clinical concepts manually rated for semantic relatedness as the reference standard. The experiments demonstrate that incorporating a combination of UMLS and WordNet definitions can improve the measurement of semantic relatedness. The paper also shows that the second-order co-occurrence vector measure is a more effective approach than path-based methods for semantic relatedness.
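As a rough illustration of the general idea behind a second-order co-occurrence vector measure, the sketch below builds first-order co-occurrence vectors from a corpus, sums them over the words of each concept's definition, and compares two concepts by cosine similarity. It is not the authors' implementation: the abstract does not specify window size, weighting, or filtering, so the window of 2, the raw counts, the toy corpus, the toy definitions, and all function names here are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def first_order_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def second_order_vector(definition_tokens, vectors):
    """Sum the first-order vectors of the words in a concept's definition."""
    total = Counter()
    for word in definition_tokens:
        # Words never seen in the corpus contribute nothing.
        total.update(vectors.get(word, Counter()))
    return total


def cosine(u, v):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus and definitions (invented; a real setup would use clinical
# reports and UMLS/WordNet definitions).
corpus = [
    "patient has chest pain and shortness of breath".split(),
    "chest pain radiates to the left arm".split(),
    "patient denies fever and cough".split(),
]
definitions = {
    "angina": ["chest", "pain"],
    "dyspnea": ["shortness", "breath"],
    "pyrexia": ["fever"],
}

vectors = first_order_vectors(corpus)


def relatedness(concept_a, concept_b):
    """Relatedness score between two concepts in the toy `definitions` table."""
    a = second_order_vector(definitions[concept_a], vectors)
    b = second_order_vector(definitions[concept_b], vectors)
    return cosine(a, b)


print(relatedness("angina", "dyspnea"))
print(relatedness("angina", "pyrexia"))
```

Because the concept vector is built from definition words rather than from the concept name itself, two concepts can score as related even when their names never co-occur in the corpus, which is the usual motivation for second-order measures.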

Original language: English (US)
Title of host publication: IHI'12 - Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
Pages: 363-371
Number of pages: 9
State: Published - 2012
Event: 2nd ACM SIGHIT International Health Informatics Symposium, IHI'12 - Miami, FL, United States
Duration: Jan 28 2012 to Jan 30 2012

Publication series

Name: IHI'12 - Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium

Other

Other: 2nd ACM SIGHIT International Health Informatics Symposium, IHI'12
Country/Territory: United States
City: Miami, FL
Period: 1/28/12 to 1/30/12

Keywords

  • Computational linguistics
  • Semantic relatedness
  • UMLS
  • WordNet
