Natural Language Processing (NLP) techniques have been used extensively to extract concepts from unstructured clinical trial eligibility criteria. Recruiting patients whose Electronic Health Record information matches clinical trial eligibility criteria can potentially facilitate and accelerate the recruitment process. However, a significant obstacle is identifying an efficient Named Entity Recognition (NER) system to parse the eligibility criteria. In this study, we used NLP-ADAPT (Artifact Discovery and Preparation Toolkit) to compare existing biomedical NLP systems (BioMedICUS, CLAMP, cTAKES and MetaMap) and their Boolean ensemble in identifying entities in the eligibility criteria of 150 randomly selected Dietary Supplement (DS) clinical trials. We created a custom mapping of the gold standard annotated entities to UMLS semantic types to align with the annotations from each system. All systems in NLP-ADAPT used their default pipelines to extract entities based on our custom mappings. The systems performed reasonably well in extracting UMLS concepts belonging to the semantic groups Disorders and Chemicals and Drugs. Among the individual systems, cTAKES performed best for the Chemicals and Drugs and Disorders semantic groups, and BioMedICUS performed best for Procedures, Living Beings, Concepts and Ideas, and Devices. The Boolean ensemble, however, outperformed the individual systems. This study sets a baseline that can potentially be improved with modifications to the NLP-ADAPT pipeline.
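The abstract does not spell out how the Boolean ensemble is built or scored. As a rough illustration only, the sketch below assumes each system's output is reduced to a set of (start, end, semantic group) tuples, that a Boolean ensemble is a set union (OR) or intersection (AND) of those sets, and that scoring uses exact-match precision, recall and F1 against the gold standard. The names `system_a`, `system_b` and `prf`, and all of the spans, are hypothetical and are not taken from NLP-ADAPT or from the study's data.

```python
from typing import Set, Tuple

# One annotation: (start offset, end offset, semantic group).
# Exact-match comparison is an assumption of this sketch.
Span = Tuple[int, int, str]


def prf(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Return exact-match precision, recall and F1 for a set of predicted spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up gold standard and system outputs, for illustration only.
gold = {
    (0, 8, "Disorders"),
    (20, 27, "Chemicals and Drugs"),
    (40, 47, "Procedures"),
}
system_a = {
    (0, 8, "Disorders"),
    (20, 27, "Chemicals and Drugs"),
    (60, 65, "Devices"),
}
system_b = {
    (0, 8, "Disorders"),
    (40, 47, "Procedures"),
}

# Boolean OR: keep a span if either system produced it.
union_ensemble = system_a | system_b
# Boolean AND: keep a span only if both systems produced it.
intersection_ensemble = system_a & system_b

for name, spans in [("A OR B", union_ensemble), ("A AND B", intersection_ensemble)]:
    p, r, f = prf(spans, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

In this toy setting the OR combination trades precision for recall and the AND combination does the opposite, which is the usual motivation for evaluating Boolean combinations of systems alongside the individual ones.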
|Original language||English (US)|
|Title of host publication||Artificial Intelligence in Medicine - 18th International Conference on Artificial Intelligence in Medicine, AIME 2020, Proceedings|
|Editors||Martin Michalowski, Robert Moskovitch|
|Publisher||Springer Science and Business Media Deutschland GmbH|
|Number of pages||11|
|State||Published - 2020|
|Event||18th International Conference on Artificial Intelligence in Medicine, AIME 2020 - Minneapolis, United States|
Duration: Aug 25 2020 → Aug 28 2020
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
Bibliographical note
Funding Information:
This work was partially supported by the NIH’s National Center for Complementary and Integrative Health and the Office of Dietary Supplements under grant number R01AT009457 (Zhang); and supported by the National Center for Advancing Translational Sciences under grant number UL1TR002494 and U01TR002062.
© 2020, Springer Nature Switzerland AG.
- Clinical trial eligibility
- Named Entity Recognition
- Natural Language Processing