Machine Learning Methods for Identifying Critical Data Elements in Nursing Documentation

Eliezer Bose, Sasank Maganti, Kathryn H. Bowles, Bonnie L. Brueshoff, Karen A. Monsen

Research output: Contribution to journal › Article

Abstract

Background: Public health nurses (PHNs) engage in home visiting services and documentation of care services for at-risk clients. To increase efficiency and decrease documentation burden, it would be useful for PHNs to identify critical data elements most associated with patient care priorities and outcomes. Machine learning techniques can aid in retrospective identification of critical data elements.

Objective: We used two different machine learning feature selection techniques of minimum redundancy-maximum relevance (mRMR) and LASSO (least absolute shrinkage and selection operator) and elastic net regularized generalized linear model (glmnet in R).

Methods: We demonstrated application of these techniques on the Omaha System database of 205 data elements (features) with a cohort of 756 family home visiting clients who received at least one visit from PHNs in a local Midwest public health agency. A dichotomous maternal risk index served as the outcome for feature selection.

Application: Using mRMR as a feature selection technique, out of 206 features, 50 features were selected with scores greater than zero, and generalized linear model applied on the 50 features achieved highest accuracy of 86.2% on a held-out test set. Using glmnet as a feature selection technique and obtaining feature importance, 63 features had importance scores greater than zero, and generalized linear model applied on them achieved the highest accuracy of 95.5% on a held-out test set.

Discussion: Feature selection techniques show promise toward reducing public health nursing documentation burden by identifying the most critical data elements needed to predict risk status. Further studies to refine the process of feature selection can aid in informing PHNs' focus on client-specific and targeted interventions in the delivery of care.
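
The two-stage workflow the abstract describes (select features, then fit a generalized linear model on the selected features and score it on a held-out test set) can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' R code: the data are synthetic (only the sample and feature counts echo the abstract), the mRMR variant shown is a common greedy simplification that uses mutual information for relevance and mean absolute correlation for redundancy, and the helper names `select_elastic_net`, `select_mrmr`, and `holdout_accuracy` as well as all hyperparameters are assumptions made for this example. The accuracies it prints say nothing about the study's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def select_elastic_net(X, y, l1_ratio=0.5, C=0.1):
    """Return indices of features with nonzero elastic-net logistic coefficients."""
    model = LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=l1_ratio, C=C, max_iter=5000)
    model.fit(X, y)
    return np.flatnonzero(model.coef_[0])


def select_mrmr(X, y, k, random_state=0):
    """Greedy mRMR-style selection (simplified, hypothetical variant).

    Relevance: mutual information between each feature and the outcome.
    Redundancy: mean absolute Pearson correlation with already chosen features.
    """
    relevance = mutual_info_classif(X, y, random_state=random_state)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    remaining = set(range(X.shape[1])) - set(selected)
    while len(selected) < k and remaining:
        candidates = sorted(remaining)
        scores = [relevance[j] - corr[j, selected].mean() for j in candidates]
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return np.array(selected)


def holdout_accuracy(X_train, y_train, X_test, y_test, idx):
    """Fit a logistic GLM on the selected columns and score it on the test set."""
    # A very large C makes the default L2 penalty negligible, approximating a plain GLM.
    glm = LogisticRegression(C=1e6, max_iter=5000)
    glm.fit(X_train[:, idx], y_train)
    return glm.score(X_test[:, idx], y_test)


# Synthetic stand-in for a clients-by-features matrix with a dichotomous outcome.
X, y = make_classification(n_samples=756, n_features=205, n_informative=20,
                           n_redundant=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Standardize using training statistics only, so the test set stays held out.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

enet_idx = select_elastic_net(X_train, y_train)
mrmr_idx = select_mrmr(X_train, y_train, k=50)

for name, idx in [("elastic net", enet_idx), ("mRMR", mrmr_idx)]:
    acc = holdout_accuracy(X_train, y_train, X_test, y_test, idx)
    print(f"{name}: {len(idx)} features, held-out accuracy {acc:.3f}")
```

Feature selection is run on the training split only, so the held-out accuracy is not inflated by information leaking from the test set.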

Original language: English (US)
Article number: 00315
Pages (from-to): 65-72
Number of pages: 8
Journal: Nursing Research
Volume: 68
Issue number: 1
DOIs: 10.1097/NNR.0000000000000315
State: Published - Jan 1 2019

Fingerprint

Public Health Nurses
Documentation
Nursing
Linear Models
Public Health Nursing
Patient Care
Public Health
Mothers
Databases
Machine Learning

Keywords

  • Omaha System
  • machine learning
  • nursing informatics
  • public health nursing

PubMed: MeSH publication types

  • Journal Article
  • Research Support, Non-U.S. Gov't

Cite this

Machine Learning Methods for Identifying Critical Data Elements in Nursing Documentation. / Bose, Eliezer; Maganti, Sasank; Bowles, Kathryn H.; Brueshoff, Bonnie L.; Monsen, Karen A.

In: Nursing Research, Vol. 68, No. 1, 00315, 01.01.2019, p. 65-72.

@article{4bf7747538da40e189034091efa8337f,
title = "Machine Learning Methods for Identifying Critical Data Elements in Nursing Documentation",
abstract = "Background Public health nurses (PHNs) engage in home visiting services and documentation of care services for at-risk clients. To increase efficiency and decrease documentation burden, it would be useful for PHNs to identify critical data elements most associated with patient care priorities and outcomes. Machine learning techniques can aid in retrospective identification of critical data elements. Objective We used two different machine learning feature selection techniques of minimum redundancy-maximum relevance (mRMR) and LASSO (least absolute shrinkage and selection operator) and elastic net regularized generalized linear model (glmnet in R). Methods We demonstrated application of these techniques on the Omaha System database of 205 data elements (features) with a cohort of 756 family home visiting clients who received at least one visit from PHNs in a local Midwest public health agency. A dichotomous maternal risk index served as the outcome for feature selection. Application Using mRMR as a feature selection technique, out of 206 features, 50 features were selected with scores greater than zero, and generalized linear model applied on the 50 features achieved highest accuracy of 86.2{\%} on a held-out test set. Using glmnet as a feature selection technique and obtaining feature importance, 63 features had importance scores greater than zero, and generalized linear model applied on them achieved the highest accuracy of 95.5{\%} on a held-out test set. Discussion Feature selection techniques show promise toward reducing public health nursing documentation burden by identifying the most critical data elements needed to predict risk status. Further studies to refine the process of feature selection can aid in informing PHNs' focus on client-specific and targeted interventions in the delivery of care.",
keywords = "Omaha System, machine learning, nursing informatics, public health nursing",
author = "Eliezer Bose and Sasank Maganti and Bowles, {Kathryn H.} and Brueshoff, {Bonnie L.} and Monsen, {Karen A.}",
year = "2019",
month = "1",
day = "1",
doi = "10.1097/NNR.0000000000000315",
language = "English (US)",
volume = "68",
pages = "65--72",
journal = "Nursing Research",
issn = "0029-6562",
publisher = "Lippincott Williams and Wilkins",
number = "1",

}

TY - JOUR

T1 - Machine Learning Methods for Identifying Critical Data Elements in Nursing Documentation

AU - Bose, Eliezer

AU - Maganti, Sasank

AU - Bowles, Kathryn H.

AU - Brueshoff, Bonnie L.

AU - Monsen, Karen A.

PY - 2019/1/1

Y1 - 2019/1/1

AB - Background Public health nurses (PHNs) engage in home visiting services and documentation of care services for at-risk clients. To increase efficiency and decrease documentation burden, it would be useful for PHNs to identify critical data elements most associated with patient care priorities and outcomes. Machine learning techniques can aid in retrospective identification of critical data elements. Objective We used two different machine learning feature selection techniques of minimum redundancy-maximum relevance (mRMR) and LASSO (least absolute shrinkage and selection operator) and elastic net regularized generalized linear model (glmnet in R). Methods We demonstrated application of these techniques on the Omaha System database of 205 data elements (features) with a cohort of 756 family home visiting clients who received at least one visit from PHNs in a local Midwest public health agency. A dichotomous maternal risk index served as the outcome for feature selection. Application Using mRMR as a feature selection technique, out of 206 features, 50 features were selected with scores greater than zero, and generalized linear model applied on the 50 features achieved highest accuracy of 86.2% on a held-out test set. Using glmnet as a feature selection technique and obtaining feature importance, 63 features had importance scores greater than zero, and generalized linear model applied on them achieved the highest accuracy of 95.5% on a held-out test set. Discussion Feature selection techniques show promise toward reducing public health nursing documentation burden by identifying the most critical data elements needed to predict risk status. Further studies to refine the process of feature selection can aid in informing PHNs' focus on client-specific and targeted interventions in the delivery of care.

KW - Omaha System

KW - machine learning

KW - nursing informatics

KW - public health nursing

UR - http://www.scopus.com/inward/record.url?scp=85058611616&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85058611616&partnerID=8YFLogxK

U2 - 10.1097/NNR.0000000000000315

DO - 10.1097/NNR.0000000000000315

M3 - Article

C2 - 30153212

AN - SCOPUS:85058611616

VL - 68

SP - 65

EP - 72

JO - Nursing Research

JF - Nursing Research

SN - 0029-6562

IS - 1

M1 - 00315

ER -