TY - JOUR
T1 - Strategies for handling missing clinical data for automated surgical site infection detection from the electronic health record
AU - Hu, Zhen
AU - Melton, Genevieve B.
AU - Arsoniadis, Elliot G.
AU - Wang, Yan
AU - Kwaan, Mary R.
AU - Simon, Gyorgy J.
N1 - Publisher Copyright: © 2017
PY - 2017/4/1
Y1 - 2017/4/1
N2 - Proper handling of missing data is important for many secondary uses of electronic health record (EHR) data. Data imputation methods can be used to handle missing data, but their use for analyzing EHR data is limited, and their specific efficacy for postoperative complication detection is unclear. Several data imputation methods were used to develop data models for automated detection of three types (i.e., superficial, deep, and organ space) of surgical site infection (SSI) and overall SSI, using American College of Surgeons National Surgical Quality Improvement Program (NSQIP) Registry 30-day SSI occurrence data as a reference standard. Overall, models with missing data imputation almost always outperformed reference models without imputation, which included only cases with complete data, for detection of SSI overall, achieving very good average area under the curve values. Missing data imputation appears to be an effective means for improving postoperative SSI detection using EHR clinical data.
AB - Proper handling of missing data is important for many secondary uses of electronic health record (EHR) data. Data imputation methods can be used to handle missing data, but their use for analyzing EHR data is limited, and their specific efficacy for postoperative complication detection is unclear. Several data imputation methods were used to develop data models for automated detection of three types (i.e., superficial, deep, and organ space) of surgical site infection (SSI) and overall SSI, using American College of Surgeons National Surgical Quality Improvement Program (NSQIP) Registry 30-day SSI occurrence data as a reference standard. Overall, models with missing data imputation almost always outperformed reference models without imputation, which included only cases with complete data, for detection of SSI overall, achieving very good average area under the curve values. Missing data imputation appears to be an effective means for improving postoperative SSI detection using EHR clinical data.
KW - Electronic health records
KW - Missing data
KW - Surgical site infections
UR - http://www.scopus.com/inward/record.url?scp=85016029838&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85016029838&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2017.03.009
DO - 10.1016/j.jbi.2017.03.009
M3 - Article
C2 - 28323112
AN - SCOPUS:85016029838
SN - 1532-0464
VL - 68
SP - 112
EP - 120
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
ER -