Using language models to identify relevant new information in inpatient clinical notes

Research output: Contribution to journalArticlepeer-review

11 Scopus citations


Redundant information in clinical notes within electronic health record (EHR) systems is ubiquitous and may negatively impact the use of these notes by clinicians, and, potentially, the efficiency of patient care delivery. Automated methods to identify redundant versus relevant new information may provide a valuable tool for clinicians to better synthesize patient information and navigate to clinically important details. In this study, we investigated the use of language models for identification of new information in inpatient notes, and evaluated our methods using expert-derived reference standards. The best method achieved precision of 0.743, recall of 0.832 and F1-measure of 0.784. The average proportion of redundant information was similar between inpatient and outpatient progress notes (76.6% (SD=17.3%) and 76.7% (SD=14.0%), respectively). Advanced practice providers tended to have higher rates of redundancy in their notes compared to physicians. Future investigation includes the addition of semantic components and visualization of new information.

Original languageEnglish (US)
Pages (from-to)1268-1276
Number of pages9
JournalAMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
StatePublished - 2014


Dive into the research topics of 'Using language models to identify relevant new information in inpatient clinical notes'. Together they form a unique fingerprint.

Cite this