Abstract
With the advent of big data and the popularity of black-box deep learning methods, it is imperative to address the robustness of neural networks to noise and outliers. We propose the use of Winsorization to recover model performance when the data may contain outliers and other aberrant observations. We provide a comparative analysis of several probabilistic artificial intelligence and machine learning techniques on supervised learning case studies. Broadly, Winsorization is a versatile technique for accounting for outliers in data. However, different probabilistic machine learning techniques have different levels of efficiency when used on outlier-prone data, with or without Winsorization. We find that Gaussian processes are extremely vulnerable to outliers, while deep learning techniques in general are more robust.
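As a concrete illustration of the preprocessing step described in the abstract, the following is a minimal sketch of Winsorization on a single feature: values beyond chosen lower and upper quantiles are clipped to those quantile values. The 5%/95% limits and the `winsorize` helper name are illustrative assumptions, not the paper's exact configuration; `scipy.stats.mstats.winsorize` provides an equivalent built-in.

```python
import numpy as np

def winsorize(x, lower=0.05, upper=0.95):
    """Clip values below the lower quantile and above the upper quantile
    to those quantile values (illustrative 5%/95% limits)."""
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

# Example: a feature vector with one gross outlier
x = np.array([1.2, 0.9, 1.1, 1.0, 50.0])
print(winsorize(x))  # the 50.0 is pulled back toward the 95th-percentile value
```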
Original language | English (US)
---|---
Article number | 1546
Journal | Entropy
Volume | 23
Issue number | 11
DOIs |
State | Published - Nov 2021
Bibliographical note
Funding Information: This research is partially supported by the US National Science Foundation (NSF) under grants 1737918, 1939916, 1939956, and a grant from Cisco Systems Inc.
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
Keywords
- Bayesian neural network
- Concrete dropout
- Flipout
- Mixture density networks
- Uncertainty quantification
- Variational Gaussian process
- Winsorization