Impact of density of lab data in EHR for prediction of potentially preventable events

Chandrima Sarkar, Jaideep Srivastava

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

This paper presents an analysis of sparse and incomplete Electronic Health Record (EHR) data for predicting patients at risk of Potentially Preventable Events (PPEs). PPEs are admissions, readmissions, complications, and emergency department visits that could have been avoided had the patient been given appropriate interventions. Machine learning techniques have made PPE detection less difficult, but the task remains challenging because EHR data are sparse and incomplete. It is therefore important to investigate the factors that affect PPE prediction in EHR data. In this paper we define the term density for evaluating the sparse and incomplete nature of an EHR data set. We analyze three important factors that affect PPE prediction in sparse and incomplete EHR data: 1) the effect of varying domain information in the patient records, 2) the impact of a popular data mining pre-processing technique known as rank aggregation based feature selection, and 3) the effect of ensemble classification. The results indicate that rank aggregation based feature selection and ensemble classification improve classification accuracy by approximately 3-4% despite the sparse and incomplete nature of the data. However, eliminating patient records with less domain information, in order to reduce incompleteness in the data, does not improve classification accuracy. We conclude that feature selection and ensemble classification are important factors that affect classification accuracy even in sparse and incomplete data sets. We also conclude that randomly decreasing domain information by varying lab values does not help increase accuracy for PPE prediction.
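The three ideas named in the abstract (density, rank aggregation based feature selection, and ensemble classification) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: density is treated as the fraction of lab fields present in a record, rank aggregation as a Borda-style mean position across several feature rankings, and the ensemble as a plain majority vote. The function names, lab fields, and labels are hypothetical.

```python
from collections import Counter
from statistics import mean


def record_density(record):
    """Assumed definition: share of lab fields that are present (not None)."""
    if not record:
        return 0.0
    return sum(value is not None for value in record.values()) / len(record)


def aggregate_ranks(rankings):
    """Borda-style aggregation: order features by mean position over all rankings.

    Each ranking lists the same features, best first. Ties break alphabetically.
    """
    features = rankings[0]
    mean_position = {f: mean(r.index(f) for r in rankings) for f in features}
    return sorted(features, key=lambda f: (mean_position[f], f))


def majority_vote(predictions):
    """Simple ensemble: return the label predicted by most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical patient record with two of four lab values missing.
patient = {"glucose": 5.4, "hba1c": None, "creatinine": 1.1, "ldl": None}

# Hypothetical rankings produced by three different feature scorers.
rankings = [
    ["hba1c", "glucose", "ldl", "creatinine"],
    ["glucose", "hba1c", "creatinine", "ldl"],
    ["hba1c", "ldl", "glucose", "creatinine"],
]

density = record_density(patient)
top_two = aggregate_ranks(rankings)[:2]
label = majority_vote(["PPE", "no PPE", "PPE"])
```

Under these assumptions, filtering records by a density threshold corresponds to the abstract's "eliminating patient records with less domain information", and keeping the top-ranked features corresponds to the feature selection step.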

Original language: English (US)
Title of host publication: Proceedings - 2013 IEEE International Conference on Healthcare Informatics, ICHI 2013
Pages: 529-534
Number of pages: 6
DOIs: https://doi.org/10.1109/ICHI.2013.82
State: Published - Dec 1 2013
Event: 2013 1st IEEE International Conference on Healthcare Informatics, ICHI 2013 - Philadelphia, PA, United States
Duration: Sep 9 2013 to Sep 11 2013

Publication series

Name: Proceedings - 2013 IEEE International Conference on Healthcare Informatics, ICHI 2013

Other

Other: 2013 1st IEEE International Conference on Healthcare Informatics, ICHI 2013
Country: United States
City: Philadelphia, PA
Period: 9/9/13 to 9/11/13

Keywords

  • Domain information
  • Ensemble classification
  • Feature selection
  • Potentially preventable events
  • Sparse and incomplete data

Cite this

Sarkar, C., & Srivastava, J. (2013). Impact of density of lab data in EHR for prediction of potentially preventable events. In Proceedings - 2013 IEEE International Conference on Healthcare Informatics, ICHI 2013 (pp. 529-534). [6680530] (Proceedings - 2013 IEEE International Conference on Healthcare Informatics, ICHI 2013). https://doi.org/10.1109/ICHI.2013.82