The continuously increasing cost of the US healthcare system has received significant attention. Central to the ideas aimed at curbing this trend is the use of technology in the form of the mandate to implement electronic health records (EHRs). EHRs consist of patient information such as demographics, medications, laboratory test results, diagnosis codes, and procedures. Mining EHRs could lead to improvement in patient health management as EHRs contain detailed information related to disease prognosis for large patient populations. In this article, we provide a structured and comprehensive overview of data mining techniques for modeling EHRs. We first provide a detailed understanding of the major application areas to which EHR mining has been applied and then discuss the nature of EHR data and its accompanying challenges. Next, we describe major approaches used for EHR mining, the metrics associated with EHRs, and the various study designs. With this foundation, we then provide a systematic and methodological organization of existing data mining techniques used to model EHRs and discuss ideas for future research.
Bibliographical note
Funding Information:
The work described in this manuscript was supported by NIH grant LM011972 and NSF grants IIS-1344135 and IIS-1602394. The views expressed in this manuscript are those of the authors and do not necessarily reflect the views of NIH and NSF.
Authors’ addresses: P. Yadav, M. Steinbach, V. Kumar, and G. Simon, 200 Union St SE, Minneapolis, MN 55455, USA. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from firstname.lastname@example.org. © 2018 ACM 0360-0300/2018/01-ART85 $15.00 https://doi.org/10.1145/3127881
- Data mining
- Healthcare analytics
- Healthcare informatics
- Machine learning