Combining Predictions of Auto Insurance Claims

Chenglong Ye, Lin Zhang, Mingxuan Han, Yanjia Yu, Bingxin Zhao, Yuhong Yang

Research output: Contribution to journal › Article › peer-review


This paper aims to better predict highly skewed auto insurance claims by combining candidate predictions. We analyze a version of the Kangaroo Auto Insurance company data and study the effects of combining different methods using five measures of prediction accuracy. The results show the following. First, when there is an outstanding prediction among the candidates (in terms of the Gini index), the “forecast combination puzzle” phenomenon disappears. The simple average method performs much worse than the more sophisticated model combination methods, indicating that combining different methods could help us avoid performance degradation. Second, the choice of the prediction accuracy measure is crucial in defining the best candidate prediction for “low frequency and high severity” (LFHS) data. For example, mean squared error (MSE) does not distinguish well between model combination methods, as the values are close. Third, the performances of different model combination methods can differ drastically. We propose using a new model combination method, named ARM-Tweedie, for such LFHS data; it benefits from an optimal rate of convergence and exhibits desirable performance on several measures for the Kangaroo data. Fourth, overall, model combination methods improve the prediction accuracy for auto insurance claim costs. In particular, Adaptive Regression by Mixing (ARM), ARM-Tweedie, and constrained linear regression can improve forecast performance when there are only weak learners or when no dominant learner exists.
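As a rough illustration of two ideas named in the abstract, the sketch below computes a normalized Gini index and a simple average of candidate predictions. It is not the authors' code: the function names `gini`, `normalized_gini`, and `simple_average` are hypothetical, and the Gini definition used here (ranking policies by predicted cost and measuring the cumulative share of actual losses) is one common convention that may differ from the exact definition used in the paper.

```python
import numpy as np


def gini(actual, pred):
    """Unnormalized Gini-style score of `pred` as a ranking of `actual` losses.

    Assumption: observations are ranked by predicted cost (descending) and the
    score is the average gap between the cumulative loss share and the diagonal.
    """
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)
    n = actual.size
    # Stable sort so tied predictions keep their original order.
    order = np.argsort(-pred, kind="stable")
    sorted_actual = actual[order]
    # Cumulative share of total actual loss captured by the top-ranked policies.
    cum_share = np.cumsum(sorted_actual) / sorted_actual.sum()
    return (cum_share.sum() - (n + 1) / 2.0) / n


def normalized_gini(actual, pred):
    """Gini score of `pred` divided by the score of a perfect ranking."""
    return gini(actual, pred) / gini(actual, actual)


def simple_average(candidate_preds):
    """Equal-weight combination; rows are observations, columns are candidates."""
    return np.asarray(candidate_preds, dtype=float).mean(axis=1)
```

Under these assumptions, a combined forecast can be scored with `normalized_gini(y_test, simple_average(P_test))`, where `y_test` holds observed claim costs and `P_test` holds one column per candidate model. A value near 1 means the combined prediction ranks claims almost as well as the true losses would.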

Original language: English (US)
Article number: 19
Issue number: 2
State: Published - Jun 2022

Bibliographical note

Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.


  • Tweedie distribution
  • auto insurance
  • claim cost prediction
  • model averaging
  • normalized Gini index

