TY - JOUR
T1 - Using natural language processing methods to classify use status of dietary supplements in clinical notes
AU - Fan, Yadan
AU - Zhang, Rui
N1 - Publisher Copyright: © 2018 The Author(s).
PY - 2018/7/23
Y1 - 2018/7/23
AB - Background: Despite the widespread use of dietary supplements, their safety remains uncertain because they can interact with prescribed medications, leading to dangerous clinical outcomes. Electronic health records (EHRs) offer a potential route to active pharmacovigilance on dietary supplements, since a fair amount of dietary supplement information, especially on use status, can be found in clinical notes. Extracting such information is important for subsequent supplement safety research. Methods: In this study, we collected 2500 sentences for 25 commonly used dietary supplements and annotated them into four classes: Continuing (C), Discontinued (D), Started (S) and Unclassified (U). Both rule-based and machine learning-based classifiers were developed on the same training set and evaluated on the hold-out test set. The performance of the two classifiers was also compared. Results: The rule-based classifier achieved F-measures of 0.90, 0.85, 0.90, and 0.86 for the C, D, S, and U statuses, respectively. The optimal machine learning-based classifier (Maximum Entropy) achieved F-measures of 0.90, 0.92, 0.91 and 0.88 for the C, D, S, and U statuses, respectively. The comparison shows that the machine learning-based classifier performs better and is more efficient and scalable, especially when the sample size doubles. Conclusions: The machine learning-based classifier outperforms the rule-based classifier in categorizing the use status of dietary supplements in clinical notes. Future work includes applying deep learning methods and developing a hybrid system for the use status classification task.
KW - Clinical notes
KW - Dietary supplements
KW - Machine learning-based classification
KW - Natural language processing
KW - Rule-based method
KW - Use status
UR - http://www.scopus.com/inward/record.url?scp=85050810293&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050810293&partnerID=8YFLogxK
U2 - 10.1186/s12911-018-0626-6
DO - 10.1186/s12911-018-0626-6
M3 - Article
C2 - 30066648
AN - SCOPUS:85050810293
SN - 1472-6947
VL - 18
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
M1 - 51
ER -