Modeling Zero-Inflated and Overdispersed Count Data: An Empirical Study of School Suspensions

Christopher David Desjardins

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

The purpose of this article is to develop a statistical model that best explains variability in the number of school days suspended. Number of school days suspended is a count variable that may be zero-inflated and overdispersed relative to a Poisson model. Four models were examined: Poisson, negative binomial, Poisson hurdle, and negative binomial hurdle. Additionally, the probability of a student being suspended for at least 1 day was modeled using a binomial logistic regression model. Of the count models considered, the negative binomial hurdle model had the best fit. The binomial logistic regression model with interactions, used to model the probability of a student being suspended for at least 1 day, fit both the training and test data adequately. Findings here suggest that both the negative binomial hurdle and the binomial logistic regression models should be considered when modeling school suspensions.
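To make the terms in the abstract concrete, here is a minimal Python sketch of what "zero-inflated and overdispersed relative to a Poisson model" means and how a Poisson hurdle model assigns probabilities. It is not the article's code or data: the counts in `days_suspended` are made up, and the function names `poisson_hurdle_pmf` and `dispersion_summary` are hypothetical. It relies only on the standard definition of a hurdle model, in which a binary part decides zero versus positive and a zero-truncated count distribution governs the positive counts.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing count k when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_hurdle_pmf(k, p_zero, lam):
    """Poisson hurdle probability of count k (illustrative helper).

    p_zero is the probability of a zero count (the binary part).
    Positive counts follow a zero-truncated Poisson with rate lam.
    """
    if k == 0:
        return p_zero
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p_zero) * truncated


def dispersion_summary(counts):
    """Compare sample counts with what a plain Poisson model implies."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return {
        "mean": mean,
        # A Poisson model implies variance == mean, so a ratio well above 1
        # points to overdispersion.
        "variance_to_mean": variance / mean,
        "observed_zero": sum(1 for c in counts if c == 0) / n,
        # Share of zeros a Poisson model with this mean would predict.
        "poisson_zero": math.exp(-mean),
    }


# Hypothetical data: days suspended for 20 students (not from the article).
days_suspended = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 3, 0, 0, 10, 0, 0, 2, 0, 0, 0]
summary = dispersion_summary(days_suspended)
print(summary)

# The hurdle model reproduces the observed share of zeros by construction.
print(poisson_hurdle_pmf(0, summary["observed_zero"], 2.0))
```

In this toy sample the variance is several times the mean and the observed share of zeros exceeds what a Poisson model with the same mean would predict. That combination is the pattern that motivates comparing negative binomial and hurdle models against the plain Poisson model.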

Original language: English (US)
Pages (from-to): 449-472
Number of pages: 24
Journal: Journal of Experimental Education
Volume: 84
Issue number: 3
State: Published - Jul 2 2016

Keywords

  • count data
  • hurdle
  • overdispersed
  • school suspensions
  • zero-inflated
