Abstract
Our objective is to develop a framework for creating reference standards for functional testing of computerized measures of semantic relatedness. Currently, research on computerized approaches to semantic relatedness between biomedical concepts relies on reference standards created for specific purposes using a variety of methods for their analysis. In most cases, these reference standards are not publicly available and the published information provided in manuscripts that evaluate computerized semantic relatedness measurement approaches is not sufficient to reproduce the results. Our proposed framework is based on the experiences of medical informatics and computational linguistics communities and addresses practical and theoretical issues with creating reference standards for semantic relatedness. We demonstrate the use of the framework on a pilot set of 101 medical term pairs rated for semantic relatedness by 13 medical coding experts. While the reliability of this particular reference standard is in the "moderate" range, we show that using clustering and factor analyses offers a data-driven approach to finding systematic differences among raters and identifying groups of potential outliers. We test two ontology-based measures of relatedness and provide both the reference standard containing individual ratings and the R program used to analyze the ratings as open-source. Currently, these resources are intended to be used to reproduce and compare results of studies involving computerized measures of semantic relatedness. Our framework may be extended to the development of reference standards in other research areas in medical informatics including automatic classification, information retrieval from medical records and vocabulary/ontology development.
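The workflow summarized in the abstract (checking how well raters agree, looking for raters who diverge from the rest, then comparing a computerized measure against the pooled ratings) can be illustrated with a minimal sketch. This is not the authors' R program: the rater names, ratings, and measure scores below are made up, and the use of Spearman rank correlation for both steps is an assumption made only for illustration. The clustering and factor analyses mentioned in the abstract are not reproduced here.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_agreement_per_rater(ratings):
    """Map each rater to their mean Spearman correlation with every other rater."""
    per_rater = {rater: [] for rater in ratings}
    for a, b in combinations(ratings, 2):
        rho = spearman(ratings[a], ratings[b])
        per_rater[a].append(rho)
        per_rater[b].append(rho)
    return {rater: mean(rhos) for rater, rhos in per_rater.items()}


# Hypothetical ratings: three raters scoring the same five term pairs.
ratings = {
    "rater_a": [4, 3, 1, 2, 4],
    "rater_b": [4, 4, 1, 2, 3],
    "rater_c": [1, 2, 4, 3, 1],  # deliberately at odds with the others
}

# Step 1: a low mean agreement flags a rater as a potential outlier.
agreement = mean_agreement_per_rater(ratings)
lowest = min(agreement, key=agreement.get)

# Step 2: compare a hypothetical computerized measure with the mean rating per pair.
consensus = [mean(scores) for scores in zip(*ratings.values())]
measure_scores = [0.9, 0.7, 0.1, 0.3, 0.8]
rho_measure = spearman(consensus, measure_scores)

print(agreement, lowest, rho_measure)
```

Rank correlation is used in this sketch because it depends only on the ordering of the scores, so expert ratings and a measure's output can be compared even when they are on different scales.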
Original language | English (US) |
---|---|
Pages (from-to) | 251-265 |
Number of pages | 15 |
Journal | Journal of Biomedical Informatics |
Volume | 44 |
Issue number | 2 |
State | Published - Apr 2011 |
Bibliographical note
Funding Information: We would like to thank the medical coding experts at the Mayo Clinic for participating in developing the reference standards referred to in this study. This work was supported in part by the National Library of Medicine Grants T15 LM07041-19 and R01 LM009623-01A2.
Keywords
- Inter-annotator agreement
- Reference standards
- Reliability
- Semantic relatedness
Fingerprint
Dive into the research topics of 'Towards a framework for developing semantic relatedness reference standards'. Together they form a unique fingerprint.
Datasets
- Semantic Relatedness and Similarity Reference Standards for Medical Terms
  Pakhomov, S., Data Repository for the University of Minnesota, 2018
  DOI: 10.13020/D6CX04, http://hdl.handle.net/11299/196265
  Dataset