TY - JOUR
T1 - A comparison of evaluation metrics for biomedical journals, articles, and websites in terms of sensitivity to topic
AU - Fu, Lawrence D.
AU - Aphinyanaphongs, Yindalon
AU - Wang, Lily
AU - Aliferis, Constantin F.
PY - 2011/8
Y1 - 2011/8
AB - Evaluating the biomedical literature and health-related websites for quality are challenging information retrieval tasks. Current commonly used methods include impact factor for journals, PubMed's clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic-adjusted metrics and other topic robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations.
KW - Bibliometrics
KW - Information retrieval
KW - Journal impact factor
KW - Machine learning
KW - PageRank
KW - Topic-sensitivity
UR - http://www.scopus.com/inward/record.url?scp=79960563989&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79960563989&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2011.03.006
DO - 10.1016/j.jbi.2011.03.006
M3 - Article
C2 - 21419864
AN - SCOPUS:79960563989
SN - 1532-0464
VL - 44
SP - 587
EP - 594
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
IS - 4
ER -