Discovering and identifying New York heart association classification from electronic health records

Rui Zhang, Sisi Ma, Liesa Shanahan, Jessica Munroe, Sarah Horn, Stuart M Speedie

Research output: Contribution to journal > Article

2 Citations (Scopus)

Abstract

Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performances of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
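
The abstract contrasts a rule-based method with machine learning-based methods but does not give the rules themselves. As a rough illustration of what a rule-based extractor of this kind might look like, the following Python sketch matches "NYHA" followed by a class written as I-IV or 1-4. The pattern, the function name extract_nyha_classes, and the sample note are assumptions made for illustration; they are not the authors' rule set.

```python
import re
from typing import List

# Hypothetical mapping from Roman numerals to NYHA class numbers.
_ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4}

# Hypothetical pattern (not from the paper): "NYHA", an optional
# "class" or "functional class", an optional ":" or "-", then the class.
_NYHA_RE = re.compile(
    r"\bNYHA\b(?:\s+(?:functional\s+)?class)?\s*[:\-]?\s*(IV|III|II|I|[1-4])\b",
    re.IGNORECASE,
)


def extract_nyha_classes(note: str) -> List[int]:
    """Return every NYHA class (1-4) mentioned in a note, in order of appearance."""
    found = []
    for match in _NYHA_RE.finditer(note):
        token = match.group(1).upper()
        # Roman numerals go through the lookup table; digits convert directly.
        found.append(_ROMAN[token] if token in _ROMAN else int(token))
    return found


if __name__ == "__main__":
    sample = "HF follow-up. NYHA class III symptoms; previously NYHA II."
    print(extract_nyha_classes(sample))
```

Run on the sample note, this returns [3, 2]. A pattern this simple does not handle ranges such as "class I-II" (only the first class is captured), negation, or historical mentions, which is one plausible reason a learned classifier over n-gram features could do better on real notes.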

Original language: English (US)
Article number: 48
Journal: BMC Medical Informatics and Decision Making
Volume: 18
DOIs: 10.1186/s12911-018-0625-7
State: Published - Jul 23 2018

Fingerprint

Electronic Health Records
Cardiac Resynchronization Therapy
Heart Failure
Natural Language Processing
Documentation

Keywords

  • Clinical notes
  • Electronic health records
  • Natural language processing
  • New York heart association (NYHA)

Cite this

Discovering and identifying New York heart association classification from electronic health records. / Zhang, Rui; Ma, Sisi; Shanahan, Liesa; Munroe, Jessica; Horn, Sarah; Speedie, Stuart M.

In: BMC Medical Informatics and Decision Making, Vol. 18, 48, 23.07.2018.

@article{9a7c19d42859492a93100d0d1d0715cb,
title = "Discovering and identifying New York heart association classification from electronic health records",
abstract = "Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performances of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2{\%} had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31{\%}) or procedure codes (2{\%}), the richest source of NYHA class was clinical notes (95{\%}). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78{\%}). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.",
keywords = "Clinical notes, Electronic health records, Natural language processing, New York heart association (NYHA)",
author = "Zhang, Rui and Ma, Sisi and Shanahan, Liesa and Munroe, Jessica and Horn, Sarah and Speedie, {Stuart M}",
year = "2018",
month = "7",
day = "23",
doi = "10.1186/s12911-018-0625-7",
language = "English (US)",
volume = "18",
journal = "BMC Medical Informatics and Decision Making",
issn = "1472-6947",
publisher = "BioMed Central",
}

TY  - JOUR
T1  - Discovering and identifying New York heart association classification from electronic health records
AU  - Zhang, Rui
AU  - Ma, Sisi
AU  - Shanahan, Liesa
AU  - Munroe, Jessica
AU  - Horn, Sarah
AU  - Speedie, Stuart M
PY  - 2018/7/23
Y1  - 2018/7/23
N2  - Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performances of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
AB  - Background: Cardiac Resynchronization Therapy (CRT) is an established pacing therapy for heart failure patients. The New York Heart Association (NYHA) class is often used as a measure of a patient's response to CRT. Identifying NYHA class for heart failure (HF) patients in an electronic health record (EHR) consistently, over time, can provide better understanding of the progression of heart failure and assessment of CRT response and effectiveness. Though NYHA is rarely stored in EHR structured data, such information is often documented in unstructured clinical notes. Methods: We accessed HF patients' data in a local EHR system and identified potential sources of NYHA, including local diagnosis codes, procedures, and clinical notes. We further investigated and compared the performances of rule-based versus machine learning-based natural language processing (NLP) methods to identify NYHA class from clinical notes. Results: Of the 36,276 patients with a diagnosis of HF or a CRT implant, 19.2% had NYHA class mentioned at least once in their EHR. While NYHA class existed in descriptive fields associated with diagnosis codes (31%) or procedure codes (2%), the richest source of NYHA class was clinical notes (95%). A total of 6174 clinical notes were matched with hospital-specific custom NYHA class diagnosis codes. Machine learning-based methods outperformed a rule-based method. The best machine-learning method was a random forest with n-gram features (F-measure: 93.78%). Conclusions: NYHA class is documented in different parts of the EHR for HF patients and the documentation rate is lower than expected. NLP methods are a feasible way to extract NYHA class information from clinical notes.
KW  - Clinical notes
KW  - Electronic health records
KW  - Natural language processing
KW  - New York heart association (NYHA)
UR  - http://www.scopus.com/inward/record.url?scp=85050823536&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85050823536&partnerID=8YFLogxK
U2  - 10.1186/s12911-018-0625-7
DO  - 10.1186/s12911-018-0625-7
M3  - Article
VL  - 18
JO  - BMC Medical Informatics and Decision Making
JF  - BMC Medical Informatics and Decision Making
SN  - 1472-6947
M1  - 48
ER  -