Proper handling of missing data is important for many secondary uses of electronic health record (EHR) data. Data imputation methods can be used to handle missing data, but their use in analyzing EHR data is limited, and their specific efficacy for postoperative complication detection is unclear. Several data imputation methods were used to develop data models for automated detection of three types of surgical site infection (SSI), namely superficial, deep, and organ space, and of overall SSI, using American College of Surgeons National Surgical Quality Improvement Program (NSQIP) Registry 30-day SSI occurrence data as a reference standard. For detection of overall SSI, models with missing data imputation almost always outperformed reference models without imputation, which included only cases with complete data, and achieved very good average area under the curve values. Missing data imputation appears to be an effective means of improving postoperative SSI detection using EHR clinical data.
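The contrast drawn above between complete-case reference models and models built on imputed data can be illustrated with a minimal sketch. The abstract does not name the imputation methods that were used, so column-mean imputation stands in here purely as an assumption; the function names, the toy values, and the two hypothetical columns (temperature, white blood cell count) are illustrative and are not taken from the study.

```python
from statistics import mean


def complete_cases(rows):
    """Reference approach: keep only rows with no missing (None) values."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Assumed stand-in method: replace each None with its column's observed mean.

    Raises statistics.StatisticsError if a column has no observed values.
    """
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def auc(scores, labels):
    """Area under the ROC curve: the probability that a positive case
    receives a higher score than a negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data with hypothetical columns: (temperature, white blood cell count).
rows = [
    [38.5, 14.0],
    [37.0, None],
    [None, 9.0],
    [36.8, 7.0],
]
kept = complete_cases(rows)   # only the fully observed rows survive
filled = mean_impute(rows)    # every row is retained
```

In this sketch the complete-case approach discards half of the toy rows, while imputation keeps all of them for model fitting; the `auc` helper shows the kind of metric on which the two resulting models would then be compared.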
Bibliographical note
Funding Information:
This research was supported by the University of Minnesota Academic Health Center Faculty Development Award (GS, GM), the American Surgical Association Foundation (GM), the Agency for Healthcare Research and Quality (R01HS24532-01A1), and the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program (8UL1TR000114-02). The authors also thank Fairview Health Services for support of this research.
- Electronic health records
- Missing data
- Surgical site infections