Demystifying the Optimal Performance of Multi-Class Classification

Minoh Jeong, Martina Cardone, Alex Dytso

    Research output: Contribution to journal › Conference article › peer-review


    Abstract

    Classification is a fundamental task in science and engineering on which machine learning methods have shown outstanding performance. However, it is challenging to determine whether such methods have achieved the Bayes error rate, that is, the lowest error rate attainable by any classifier. This is mainly because the Bayes error rate is not known in general, and hence effectively estimating it is paramount. Inspired by the work of Ishida et al. (2023), we propose an estimator for the Bayes error rate of supervised multi-class classification problems. We analyze several theoretical aspects of this estimator, including its consistency, unbiasedness, convergence rate, variance, and robustness. We also propose a denoising method that reduces noise that may corrupt the data labels, and we improve the robustness of the proposed estimator to outliers by incorporating the median-of-means estimator. Our analysis demonstrates the consistency, asymptotic unbiasedness, convergence rate, and robustness of the proposed estimators. Finally, we validate the effectiveness of our theoretical results via experiments both on synthetic data under various noise settings and on real data.
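    The two ingredients named in the abstract can be illustrated with a short sketch. Following the style of estimator in Ishida et al. (2023), a plug-in Bayes error estimate averages the pointwise Bayes risk 1 − max_y p(y | x) over samples with known (soft) class posteriors; a median-of-means variant then tempers the influence of outliers. The function names, block-splitting scheme, and data below are illustrative assumptions, not the paper's exact construction.

    ```python
    import numpy as np

    def bayes_error_plugin(soft_labels):
        """Plug-in Bayes error estimate from per-sample class posteriors.

        soft_labels: (n, c) array whose i-th row approximates p(y | x_i).
        The pointwise Bayes risk at x is 1 - max_y p(y | x); averaging
        these values over the n samples gives the estimate.
        """
        soft_labels = np.asarray(soft_labels, dtype=float)
        return float(np.mean(1.0 - soft_labels.max(axis=1)))

    def bayes_error_mom(soft_labels, n_blocks=5, seed=0):
        """Median-of-means variant (illustrative): shuffle the pointwise
        risks, split them into blocks, average within each block, and
        return the median of the block means. The median step reduces
        sensitivity to outlier samples."""
        rng = np.random.default_rng(seed)
        losses = 1.0 - np.asarray(soft_labels, dtype=float).max(axis=1)
        losses = rng.permutation(losses)
        blocks = np.array_split(losses, n_blocks)
        return float(np.median([b.mean() for b in blocks]))

    # Toy example: four samples, three classes.
    p = np.array([[0.9, 0.05, 0.05],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.2, 0.6],
                  [0.7, 0.2, 0.1]])
    print(bayes_error_plugin(p))  # mean of [0.1, 0.2, 0.4, 0.3] = 0.25
    ```

    In practice the posteriors p(y | x) are unknown; the paper's setting replaces them with soft labels attached to the data, which is why label denoising and robustness to outliers matter.
    
    
    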

    Original language: English (US)
    Journal: Advances in Neural Information Processing Systems
    Volume: 36
    State: Published - 2023
    Event: 37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States
    Duration: Dec 10, 2023 - Dec 16, 2023

    Bibliographical note

    Publisher Copyright:
    © 2023 Neural information processing systems foundation. All rights reserved.
