Abstract
With the growing popularity of big data analysis of insurance claim count data, diverse regression models for count response variables have been developed. However, count response regression models suffer from multicollinearity when the input variables are multivariate. Recently, deep learning and neural network models for count responses have been proposed, including a Keras and TensorFlow-based deep learning model. To apply deep learning and neural network models to non-normal insurance claim count data, we compare the root mean square error (RMSE) accuracy of gradient boosting machines (a popular machine learning regression tree algorithm), principal component analysis (PCA)-based Poisson regression, PCA-based negative binomial regression, and PCA-based zero-inflated Poisson regression, where PCA is used to avoid multicollinearity among the multivariate input variables. The comparison uses simulated normally distributed data, simulated non-normal data that combine normally distributed data, binary data, and copula-based asymmetrical data, and two real data sets: speeding ticket data and Singapore insurance claim count data.
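As a rough illustration of one of the compared approaches, the sketch below fits a PCA-based Poisson regression and reports its root mean square error. It is not the authors' implementation: the simulated data, the choice of two principal components, the Newton-type fitting routine, and every function name (`pca_scores`, `fit_poisson`, `predict_poisson`, `rmse`) are assumptions made for the example, and only NumPy is used.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project standardized X onto its leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions of Z.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def fit_poisson(X, y, n_iter=50, tol=1e-8):
    """Poisson regression with a log link, fitted by Newton's method."""
    A = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(A.shape[1])
    beta[0] = np.log(y.mean())  # start from the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(A @ beta)
        grad = A.T @ (y - mu)            # score vector
        hess = A.T @ (A * mu[:, None])   # Fisher information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def predict_poisson(beta, X):
    """Predicted mean counts for the rows of X."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    return np.exp(A @ beta)

def rmse(y, y_hat):
    """Root mean square error between observed and predicted counts."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Hypothetical simulated data: six inputs built from two latent factors,
# so the inputs are strongly collinear by construction.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 2))
X = np.column_stack(
    [latent[:, 0] + 0.05 * rng.normal(size=n) for _ in range(3)]
    + [latent[:, 1] + 0.05 * rng.normal(size=n) for _ in range(3)]
)
y = rng.poisson(np.exp(0.3 + 0.5 * latent[:, 0] - 0.4 * latent[:, 1]))

# Regress the counts on uncorrelated component scores instead of raw inputs.
scores = pca_scores(X, n_components=2)
beta = fit_poisson(scores, y)
rmse_pca = rmse(y, predict_poisson(beta, scores))
rmse_null = rmse(y, np.full(n, y.mean()))  # intercept-only baseline
print(f"PCA-Poisson RMSE: {rmse_pca:.3f}, baseline RMSE: {rmse_null:.3f}")
```

Because the component scores are mutually uncorrelated, the information matrix in `fit_poisson` stays well conditioned even though the raw inputs are nearly collinear. The same scores could be passed to a negative binomial or zero-inflated Poisson fit for a comparison of the kind described above.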
| Original language | English (US) |
|---|---|
| Article number | 280 |
| Journal | Axioms |
| Volume | 11 |
| Issue number | 6 |
| State | Published - Jun 2022 |
Bibliographical note
Funding Information: National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. NRF-2020R1F1A1A01056987).
Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
Keywords
- deep learning
- negative binomial regression
- Poisson
- zero-inflated Poisson