TY - JOUR
T1 - Discovering and identifying New York heart association classification from electronic health records
AU - Zhang, Rui
AU - Ma, Sisi
AU - Shanahan, Liesa
AU - Munroe, Jessica
AU - Horn, Sarah
AU - Speedie, Stuart
N1 - Funding Information:
This research and the publication of this article were supported by Medtronic, Inc.
Publisher Copyright:
© 2018 The Author(s).
PY - 2018/7/23
Y1 - 2018/7/23
AB - Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide a better understanding of the progression of heart failure and a better assessment of CRT response and effectiveness. Though NYHA class is rarely stored in EHR structured data, it is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA class, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performance of rule-based versus machine learning-based natural language processing (NLP) methods for identifying NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6,174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients, and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
KW - Clinical notes
KW - Electronic health records
KW - Natural language processing
KW - New York Heart Association (NYHA)
UR - http://www.scopus.com/inward/record.url?scp=85050823536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050823536&partnerID=8YFLogxK
U2 - 10.1186/s12911-018-0625-7
DO - 10.1186/s12911-018-0625-7
M3 - Article
C2 - 30066653
AN - SCOPUS:85050823536
SN - 1472-6947
VL - 18
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
M1 - 48
ER -