TY - JOUR
T1 - Evaluating semantic relatedness and similarity measures with Standardized MedDRA Queries.
AU - Bill, Robert W.
AU - Liu, Ying
AU - McInnes, Bridget T.
AU - Melton-Meaux, Genevieve B.
AU - Pedersen, Ted
AU - Pakhomov, Serguei V.
PY - 2012
Y1 - 2012
AB - A potential use of automated concept similarity and relatedness measures is to improve automatic detection of clinical text that relates to a condition indicative of an adverse drug reaction. This is also one of the purposes of the Standardized MedDRA Queries (SMQs) of the Medical Dictionary for Regulatory Activities (MedDRA). An expert panel evaluates SMQs for their ability to detect a condition of interest, which qualifies them as a reference standard for evaluating automated approaches. We compare similarity and relatedness measures on how well they distinguish intra-category from inter-category concept pairs drawn from SMQ data, and plot ROC curves of each method's sensitivity and specificity. Among single measures, an information content measure, specifically the Resnik method, achieved the highest area under the curve; using two measures together as predictors, Resnik and Lin, obtained the highest score overall. Overall, SMQ data proved a productive basis for evaluating automated semantic relatedness and similarity scores.
UR - http://www.scopus.com/inward/record.url?scp=84880828010&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880828010&partnerID=8YFLogxK
M3 - Article
C2 - 23304271
AN - SCOPUS:84880828010
SN - 1942-597X
VL - 2012
SP - 43
EP - 50
JO - AMIA Annual Symposium Proceedings
JF - AMIA Annual Symposium Proceedings
ER -