Abstract
Causal inference aims to estimate the causal relationships and effect sizes among treatments and outcomes. Electronic health record (EHR) data is a valuable healthcare data source that can support causal inference. However, a large percentage of EHR data is missing, and the values are missing not at random (MNAR). Ignoring MNAR can lead to severe biases, to the extent that the causal structure underlying the data becomes distorted. We proposed a new causal inference methodology that addresses the MNAR problem and thus helps preserve the causal structure. We compared the performance of our proposed method with the traditional causal inference method, structural equation modeling (SEM). We evaluated these methods for their accuracy in estimating the causal effect sizes and for their ability to converge at all. We employed both simulation studies and real-world EHR data sets. We demonstrated that imputation under an improper missingness mechanism distorted the causal structure to a degree where SEM found it incompatible with the data and failed to converge. Even when 20 to 30% of the values were missing, SEM failed to converge in as many as 50% of the runs. The proposed causal inference method achieved a higher convergence rate and more accurate estimation of latent treatment effects, both on the synthetic data and on a real EHR data set. By incorporating knowledge of the missing data mechanism, the proposed methodology significantly mitigated the biases associated with MNAR in the EHR data set and substantially outperformed SEM applied under an improper missing data mechanism.
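The claim that ignoring MNAR biases effect estimates can be illustrated with a small simulation. The sketch below is not the method proposed in the paper and does not use SEM; it is a minimal illustration under assumed settings: a single treatment `x`, a linear outcome `y` with a hypothetical true effect of 2.0, and a logistic missingness rule in which larger outcome values are more likely to be missing. The function name `simulate_mnar_bias` and all parameter values are assumptions made for this example.

```python
import numpy as np


def simulate_mnar_bias(n=20000, true_effect=2.0, seed=0):
    """Compare slope estimates on full, complete-case, and mean-imputed data.

    All settings here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Treatment and outcome with a known linear effect.
    x = rng.normal(size=n)
    y = true_effect * x + rng.normal(size=n)

    # MNAR: the probability that y is missing depends on y itself.
    p_missing = 1.0 / (1.0 + np.exp(-(y - 1.0)))
    missing = rng.random(n) < p_missing

    # Naive mean imputation, which ignores the missingness mechanism.
    y_imputed = np.where(missing, y[~missing].mean(), y)

    def slope(a, b):
        # Ordinary least squares slope of b regressed on a.
        return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

    return {
        "full": slope(x, y),
        "complete_case": slope(x[~missing], y[~missing]),
        "mean_imputed": slope(x, y_imputed),
        "missing_rate": missing.mean(),
    }


if __name__ == "__main__":
    for name, value in simulate_mnar_bias().items():
        print(f"{name}: {value:.3f}")
```

In this setup, both the complete-case and the mean-imputed estimates should fall below the estimate from the full data, because the values most likely to be missing are the large outcomes that carry much of the treatment signal.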
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2022 IEEE 10th International Conference on Healthcare Informatics, ICHI 2022 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 21-26 |
Number of pages | 6 |
ISBN (Electronic) | 9781665468459 |
State | Published - 2022 |
Event | 10th IEEE International Conference on Healthcare Informatics, ICHI 2022 - Rochester, United States Duration: Jun 11 2022 → Jun 14 2022 |
Publication series
Name | Proceedings - 2022 IEEE 10th International Conference on Healthcare Informatics, ICHI 2022 |
---|---|
Conference
Conference | 10th IEEE International Conference on Healthcare Informatics, ICHI 2022 |
---|---|
Country/Territory | United States |
City | Rochester |
Period | 6/11/22 → 6/14/22 |
Bibliographical note
Funding Information: This work is supported by the National Institutes of Health (NIH) grants AG056366, TR002494, and LM011972.
Publisher Copyright:
© 2022 IEEE.
Keywords
- causal inference
- missing imputation
- MNAR