TY - JOUR
T1 - A data-driven approach for extracting "the most specific term" for ontology development
AU - Savova, Guergana K.
AU - Harris, Marcelline
AU - Johnson, Thomas
AU - Pakhomov, Serguei V.
AU - Chute, Christopher G.
PY - 2003
Y1 - 2003
N2 - We present a data-driven approach to extract the "most specific" terms relevant to an ontology of functioning, disability and health. The algorithm is a combination of statistical and linguistic approaches. The statistical filter is based on the frequency of the content words in a given text string; the linguistic heuristic is an extension of existing algorithms but goes beyond noun phrases and is formulated as a "complete syntactic node". Thus, it can be applied to any syntactic node of interest in the particular domain. Two test sets were marked by three experts. Test set 1 is a well-constructed text from pain abstracts; test set 2 is actual medical reports. Results are reported as recall, precision, F-score and rate of valid terms in false positives. A limitation of the current research is the relatively small test set.
UR - http://www.scopus.com/inward/record.url?scp=16544388629&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=16544388629&partnerID=8YFLogxK
M3 - Article
C2 - 14728239
AN - SCOPUS:16544388629
SN - 1559-4076
SP - 579
EP - 583
JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
ER -