Classification of use status for dietary supplements in clinical notes

Yadan Fan, Lu He, Rui Zhang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution



Clinical notes contain rich information about dietary supplements, which is critical for detecting signals of supplement side effects and drug-supplement interactions. One important element of supplement documentation is usage status, such as started or discontinued. This information is usually stored in unstructured clinical notes. We developed a rule-based classifier to identify supplement usage status in clinical notes. The patient's status of supplement use was classified into four classes: Continuing (C), Discontinued (D), Started (S), and Unclassified (U). Clinical notes mentioning 10 of the most commonly consumed supplements (i.e., alfalfa, echinacea, fish oil, garlic, ginger, ginkgo, ginseng, melatonin, St. John's Wort, and Vitamin E) were retrieved from the University of Minnesota Clinical Data Repository. The gold standard was created by manually annotating 1000 randomly selected sentences or statements mentioning at least one of these 10 supplements. The classifier's rules were initially developed on two-thirds of the sentences for a set of 7 supplements (i.e., alfalfa, garlic, ginger, ginkgo, ginseng, St. John's Wort, and Vitamin E); performance was evaluated on the remaining one-third of this set. To evaluate the generalizability of the rules, we further validated them on a second test set covering the other 3 supplements (i.e., echinacea, fish oil, and melatonin). The classifier achieved F-measures of 0.95, 0.97, 0.96, and 0.96 for statuses C, D, S, and U, respectively, on the 7 supplements. It also generalized well to the other 3 supplements, with F-measures of 0.96 for C, 0.96 for D, 0.95 for S, and 0.89 for U. This study demonstrated that the classifier can accurately classify supplement usage status and can be integrated as a module into an existing natural language processing pipeline to support dietary supplement knowledge discovery.
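The abstract does not list the classifier's actual rules, so the following is only a minimal sketch of how a regular-expression classifier for the four statuses and a per-class F-measure might look. The trigger patterns, the rule order, and the names RULES, classify_status, and f_measure are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical trigger patterns (assumptions for illustration only;
# the paper's real rules are not given in the abstract).
# The first matching rule wins, so Discontinued and Started cues
# are checked before Continuing cues.
RULES = [
    ("D", re.compile(r"\b(?:discontinu\w*|stop(?:s|ped|ping)?|quit|no longer (?:taking|using))\b", re.I)),
    ("S", re.compile(r"\b(?:start(?:s|ed|ing)?|began|begin(?:s|ning)?|initiat(?:e|ed|ing))\b", re.I)),
    ("C", re.compile(r"\b(?:continu(?:e|es|ed|ing)|still (?:taking|using)|currently (?:taking|using))\b", re.I)),
]


def classify_status(sentence: str) -> str:
    """Return C, D, or S for the first rule that matches, else U (Unclassified)."""
    for label, pattern in RULES:
        if pattern.search(sentence):
            return label
    return "U"


def f_measure(gold, predicted, label):
    """Balanced F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example sentences, not drawn from the study's data.
sentences = [
    "Patient will continue fish oil 1000 mg daily.",
    "She stopped taking ginkgo last month.",
    "Advised to start melatonin 3 mg at bedtime.",
    "Discussed risks of St. John's Wort.",
]
predicted = [classify_status(s) for s in sentences]
```

In a real system the rules would be tuned on the development portion of the annotated sentences, and f_measure would then be computed per class on the held-out portion, mirroring the two-thirds and one-third split described above.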

Original language: English (US)
Title of host publication: Proceedings - 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016
Editors: Kevin Burrage, Qian Zhu, Yunlong Liu, Tianhai Tian, Yadong Wang, Xiaohua Tony Hu, Qinghua Jiang, Jiangning Song, Shinichi Morishita, Guohua Wang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 8
ISBN (Electronic): 9781509016105
State: Published - Jan 17 2017
Event: 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016 - Shenzhen, China
Duration: Dec 15 2016 to Dec 18 2016

Publication series

Name: Proceedings - 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016


Other: 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016

Bibliographical note

Funding Information:
This study was sponsored by the University of Minnesota Grant-in-Aid award (RZ) and partly supported by the National Center for Advancing Translational Sciences (UL1TR000114, Blazar).

Publisher Copyright:
© 2016 IEEE.


  • Clinical Notes
  • Electronic Health Records
  • Natural Language Processing
  • Regular Expression
  • Supplements Use Status

