Evaluating the impact of data representation on EHR-based analytic tasks

Wonsuk Oh, Michael S. Steinbach, M. Regina Castro, Kevin A. Peterson, Vipin Kumar, Pedro J. Caraballo, Gyorgy J. Simon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

Different analytic techniques operate optimally with different types of data. As the use of EHR-based analytics expands to newer tasks, data will have to be transformed into different representations so that the tasks can be optimally solved. We classified representations into broad categories based on their characteristics and proposed a new knowledge-driven representation for clinical data mining as well as trajectory mining, called Severity Encoding Variables (SEVs). Additionally, we studied which characteristics make representations most suitable for particular clinical analytics tasks, including trajectory mining. Our evaluation shows that, for regression, most data representations performed similarly, with SEV achieving a slight (albeit statistically significant) advantage. For patients at high risk of diabetes, SEV outperformed the competing representation by a relative 20%. For association mining, SEV achieved the highest performance. Its ability to constrain the search space of patterns through clinical knowledge was key to its success.

Original language: English (US)
Title of host publication: MEDINFO 2019
Subtitle of host publication: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics
Editors: Brigitte Seroussi, Lucila Ohno-Machado
Publisher: IOS Press
Pages: 288-292
Number of pages: 5
Volume: 264
ISBN (Electronic): 9781643680026
DOIs
State: Published - Aug 21 2019
Event: 17th World Congress on Medical and Health Informatics, MEDINFO 2019 - Lyon, France
Duration: Aug 25 2019 → Aug 30 2019

Publication series

Name: Studies in health technology and informatics
ISSN (Print): 0926-9630

Conference

Conference: 17th World Congress on Medical and Health Informatics, MEDINFO 2019
Country/Territory: France
City: Lyon
Period: 8/25/19 → 8/30/19

Bibliographical note

Publisher Copyright:
© 2019 International Medical Informatics Association (IMIA) and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).

Keywords

  • Data Mining
  • Data Science
  • Electronic Health Records

PubMed: MeSH publication types

  • Journal Article
