Abstract
Large observational data networks that leverage routine clinical practice data in electronic health records (EHRs) are critical resources for research on coronavirus disease 2019 (COVID-19). Data normalization is a key challenge for the secondary use of EHRs for COVID-19 research across institutions. In this study, we addressed the challenge of automating the normalization of COVID-19 diagnostic tests, which are critical data elements, but for which controlled terminology terms were published after clinical implementation. We developed a simple but effective rule-based tool called COVID-19 TestNorm to automatically normalize local COVID-19 testing names to standard LOINC (Logical Observation Identifiers Names and Codes) codes. COVID-19 TestNorm was developed and evaluated using 568 test names collected from 8 healthcare systems. Our results show that it could achieve an accuracy of 97.4% on an independent test set. COVID-19 TestNorm is available as an open-source package for developers and as an online Web application for end users (https://clamp.uth.edu/covid/loinc.php). We believe that it will be a useful tool to support secondary use of EHRs for research on COVID-19.
Original language | English (US) |
---|---|
Pages (from-to) | 1437-1442 |
Number of pages | 6 |
Journal | Journal of the American Medical Informatics Association |
Volume | 27 |
Issue number | 9 |
State | Published - Sep 1 2020 |
Bibliographical note
Funding Information: This project is partially supported by National Center for Advancing Translational Sciences grant nos. UL1TR003167 (UTHealth CTSA, HX, and RM) and 5U01TR002062 (HL, XJ, SP, HX, and KN); National Cancer Institute grant no. U24 CA194215 (HX and HL); National Institutes of Health grant nos. R01AG066749 (XJ), R01GM114612 (XJ), and U01TR002062 (XJ); Veterans Affairs Health Services Research RES 13-457 (MM); Cancer Prevention and Research Institute of Texas grant nos. RP170668 (HX), RR180012 (XJ), and RP160015 (XD); the National Science Foundation RAPID grant no. 2027790 (XJ); and Gordon and Betty Moore Foundation grant no. 9639 (LO-M and HX). The content is solely the responsibility of the authors and does not necessarily represent the official views of the Cancer Prevention and Research Institute of Texas, the Department of Veterans Affairs, or the U.S. government.
Publisher Copyright:
© The Author(s) 2020.
Keywords
- COVID-19
- COVID-19 TestNorm
- LOINC
- Natural language processing
- Testing name normalization