Abstract
Understanding the trustworthiness of a prediction yielded by a classifier is critical for the safe and effective use of AI models. Prior efforts have proven reliable on small-scale datasets. In this work, we study the problem of predicting trustworthiness on real-world large-scale datasets, where the task is more challenging due to high-dimensional features, diverse visual concepts, and a large number of samples. In such a setting, we observe that trustworthiness predictors trained with prior-art loss functions, i.e., the cross entropy loss, focal loss, and true class probability confidence loss, are prone to regard both correct and incorrect predictions as trustworthy. The reasons are two-fold. Firstly, correct predictions generally outnumber incorrect predictions. Secondly, due to the data complexity, it is challenging to differentiate incorrect predictions from correct ones on real-world large-scale datasets. To improve the generalizability of trustworthiness predictors, we propose a novel steep slope loss that separates the features of correct predictions from those of incorrect predictions by two slide-like curves that oppose each other. The proposed loss is evaluated with two representative deep learning models, i.e., Vision Transformer and ResNet, as trustworthiness predictors. We conduct comprehensive experiments and analyses on ImageNet, which show that the proposed loss effectively improves the generalizability of trustworthiness predictors. The code and pre-trained trustworthiness predictors for reproducibility are available at https://github.com/luoyan407/predict_trustworthiness.
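The first reason given in the abstract, that correct predictions dominate, can be illustrated with a minimal sketch. It assumes a trustworthiness label of 1 when the classifier's prediction matches the ground truth and 0 otherwise. The toy data and the function names `trust_labels` and `rates` are hypothetical and are not taken from the paper or its repository.

```python
def trust_labels(predicted, actual):
    """Label a prediction 1 (trustworthy) if it matches the ground truth, else 0."""
    return [int(p == a) for p, a in zip(predicted, actual)]


def rates(trust_pred, trust_true):
    """Accuracy, true positive rate, and true negative rate of a trustworthiness predictor."""
    pairs = list(zip(trust_pred, trust_true))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    pos = sum(trust_true)
    neg = len(trust_true) - pos
    return {
        "accuracy": (tp + tn) / len(trust_true),
        "tpr": tp / pos if pos else 0.0,
        "tnr": tn / neg if neg else 0.0,
    }


# Hypothetical toy data: the classifier is right on 8 of 10 samples.
actual = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
predicted = [0, 1, 2, 3, 4, 5, 6, 7, 0, 0]
labels = trust_labels(predicted, actual)

# A degenerate predictor that declares every prediction trustworthy.
always_trust = [1] * len(labels)
result = rates(always_trust, labels)
print(result)
```

On this toy data the degenerate predictor reaches high accuracy while never flagging an incorrect prediction, which is why accuracy alone can hide the failure mode the abstract describes.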
| Original language | English (US) |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021 |
| Editors | Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan |
| Publisher | Neural Information Processing Systems Foundation |
| Pages | 21533-21544 |
| Number of pages | 12 |
| ISBN (Electronic) | 9781713845393 |
| State | Published - 2021 |
| Event | 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online; Duration: Dec 6 2021 → Dec 14 2021 |
Publication series
| Name | Advances in Neural Information Processing Systems |
|---|---|
| Volume | 26 |
| ISSN (Print) | 1049-5258 |
Conference
| Conference | 35th Conference on Neural Information Processing Systems, NeurIPS 2021 |
|---|---|
| City | Virtual, Online |
| Period | 12/6/21 → 12/14/21 |
Bibliographical note
Funding Information: This research was funded in part by the NSF under Grants 1908711 and 1849107, and in part supported by the National Research Foundation, Singapore under its Strategic Capability Research Centres Funding Initiative. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore.
Publisher Copyright:
© 2021 Neural Information Processing Systems Foundation. All rights reserved.
Fingerprint
Dive into the research topics of 'Learning to Predict Trustworthiness with Steep Slope Loss'. Together they form a unique fingerprint.