The use of count data models in biomedical informatics evaluation research

Jing Du, Young Taek Park, Nawanan Theera-Ampornpunt, Jeffrey S McCullough, Stuart M Speedie

Research output: Contribution to journal › Review article

6 Citations (Scopus)

Abstract

Objectives: Studies on the impact and value of health information technology (HIT) have often focused on outcome measures that are counts of such things as hospital admissions or the number of laboratory tests per patient. These measures with their highly skewed distributions (high frequency of 0s and 1s) are more appropriately analyzed with count data models than the much more frequently used variations of ordinary least squares (OLS). Use of a statistical procedure that does not properly fit the distribution of the data can result in significant findings being overlooked. The objective of this paper is to encourage greater use of count data models by demonstrating their utility with an example based on the authors' current work.

Target audience: Researchers conducting impact and outcome studies related to HIT.

Scope: We review and discuss count data models and illustrate their value in comparison to OLS using an example from a study of the impact of an electronic health record (EHR) on laboratory test orders. The best count data model reveals significant relationships that OLS does not detect. We conclude that comprehensive model checking is highly recommended to identify the most appropriate analytic model when the dependent variable being examined contains count data. This strategy can lead to more valid and precise findings in HIT evaluation studies.
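As a rough illustration of the kind of model checking the abstract recommends, the sketch below computes two simple diagnostics for a count outcome: the variance-to-mean ratio (which equals 1 under a Poisson model) and the observed share of zeros compared with the share a Poisson model with the same mean would predict. This is not the authors' procedure. The function name `dispersion_summary` and the `lab_tests` values are hypothetical and made up for illustration; they are not data from the study.

```python
import math
from statistics import mean, variance


def dispersion_summary(counts):
    """Return simple diagnostics for a list of non-negative integer counts."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    expected_zero = math.exp(-m)  # P(Y = 0) under a Poisson model with mean m
    return {
        "mean": m,
        "variance": v,
        # A ratio well above 1 suggests overdispersion relative to Poisson.
        "dispersion_ratio": v / m if m > 0 else float("nan"),
        "observed_zero_share": observed_zero,
        "poisson_zero_share": expected_zero,
    }


# Hypothetical counts of laboratory tests per patient (illustrative only).
lab_tests = [0, 0, 0, 1, 0, 1, 0, 2, 0, 0, 1, 9, 0, 1, 0, 0, 3, 0, 1, 12]

summary = dispersion_summary(lab_tests)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A dispersion ratio far above 1, or many more zeros than the Poisson share, would be one hint that a plain Poisson model fits poorly and that alternatives such as negative binomial or zero-inflated models may be worth comparing. These diagnostics are only a first screen and do not replace formal model comparison.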

Original language: English (US)
Pages (from-to): 39-44
Number of pages: 6
Journal: Journal of the American Medical Informatics Association
Volume: 19
Issue number: 1
DOI: 10.1136/amiajnl-2011-000256
State: Published - Jan 1 2012

Fingerprint

Medical Informatics
Informatics
Least-Squares Analysis
Outcome Assessment (Health Care)
Electronic Health Records
Research Personnel

Cite this

The use of count data models in biomedical informatics evaluation research. / Du, Jing; Park, Young Taek; Theera-Ampornpunt, Nawanan; McCullough, Jeffrey S; Speedie, Stuart M.

In: Journal of the American Medical Informatics Association, Vol. 19, No. 1, 01.01.2012, p. 39-44.

Du, Jing ; Park, Young Taek ; Theera-Ampornpunt, Nawanan ; McCullough, Jeffrey S ; Speedie, Stuart M. / The use of count data models in biomedical informatics evaluation research. In: Journal of the American Medical Informatics Association. 2012 ; Vol. 19, No. 1. pp. 39-44.
@article{5440933fbc2a42e9b4ca2cf2bcc17bbc,
title = "The use of count data models in biomedical informatics evaluation research",
abstract = "Objectives: Studies on the impact and value of health information technology (HIT) have often focused on outcome measures that are counts of such things as hospital admissions or the number of laboratory tests per patient. These measures with their highly skewed distributions (high frequency of 0s and 1s) are more appropriately analyzed with count data models than the much more frequently used variations of ordinary least squares (OLS). Use of a statistical procedure that does not properly fit the distribution of the data can result in significant findings being overlooked. The objective of this paper is to encourage greater use of count data models by demonstrating their utility with an example based on the authors' current work. Target audience: Researchers conducting impact and outcome studies related to HIT. Scope: We review and discuss count data models and illustrate their value in comparison to OLS using an example from a study of the impact of an electronic health record (EHR) on laboratory test orders. The best count data model reveals significant relationships that OLS does not detect. We conclude that comprehensive model checking is highly recommended to identify the most appropriate analytic model when the dependent variable being examined contains count data. This strategy can lead to more valid and precise findings in HIT evaluation studies.",
author = "Jing Du and Park, {Young Taek} and Nawanan Theera-Ampornpunt and McCullough, {Jeffrey S} and Speedie, {Stuart M}",
year = "2012",
month = "1",
day = "1",
doi = "10.1136/amiajnl-2011-000256",
language = "English (US)",
volume = "19",
pages = "39--44",
journal = "Journal of the American Medical Informatics Association : JAMIA",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "1",

}

TY - JOUR

T1 - The use of count data models in biomedical informatics evaluation research

AU - Du, Jing

AU - Park, Young Taek

AU - Theera-Ampornpunt, Nawanan

AU - McCullough, Jeffrey S

AU - Speedie, Stuart M

PY - 2012/1/1

Y1 - 2012/1/1

AB - Objectives: Studies on the impact and value of health information technology (HIT) have often focused on outcome measures that are counts of such things as hospital admissions or the number of laboratory tests per patient. These measures with their highly skewed distributions (high frequency of 0s and 1s) are more appropriately analyzed with count data models than the much more frequently used variations of ordinary least squares (OLS). Use of a statistical procedure that does not properly fit the distribution of the data can result in significant findings being overlooked. The objective of this paper is to encourage greater use of count data models by demonstrating their utility with an example based on the authors' current work. Target audience: Researchers conducting impact and outcome studies related to HIT. Scope: We review and discuss count data models and illustrate their value in comparison to OLS using an example from a study of the impact of an electronic health record (EHR) on laboratory test orders. The best count data model reveals significant relationships that OLS does not detect. We conclude that comprehensive model checking is highly recommended to identify the most appropriate analytic model when the dependent variable being examined contains count data. This strategy can lead to more valid and precise findings in HIT evaluation studies.

UR - http://www.scopus.com/inward/record.url?scp=84863078674&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863078674&partnerID=8YFLogxK

U2 - 10.1136/amiajnl-2011-000256

DO - 10.1136/amiajnl-2011-000256

M3 - Review article

C2 - 21715429

AN - SCOPUS:84863078674

VL - 19

SP - 39

EP - 44

JO - Journal of the American Medical Informatics Association : JAMIA

JF - Journal of the American Medical Informatics Association : JAMIA

SN - 1067-5027

IS - 1

ER -