Abstract
In this paper we analyse, for a model of linear regression with Gaussian covariates, the high-dimensional limit of the performance of a Bayesian estimator given by the mean of a log-concave posterior distribution with a Gaussian prior. Although the high-dimensional analysis of Bayesian estimators has been carried out for Bayesian-optimal linear regression, where the correct posterior is used for inference, much less is known when there is a mismatch. Here we consider a model in which the responses are known to be generated as linear combinations of the covariates, but the distribution of the ground-truth regression coefficients and the variance of the Gaussian noise are unknown. This regression task can be rephrased as a statistical mechanics model known as the Gardner spin glass, an analogy that we exploit. Using a leave-one-out approach, we characterize the mean square error for the regression coefficients. We also derive the log-normalizing constant of the posterior. Similar models have been studied by Shcherbina and Tirozzi and by Talagrand, but our arguments are much more straightforward. An interesting consequence of our analysis is that in the quadratic loss case, the performance of the Bayesian estimator is independent of a global 'temperature' hyperparameter and matches that of the ridge estimator: sampling and optimizing are equally good. By contrast, for the absolute value loss there is an optimal finite temperature to select, which allows the Bayesian estimator to beat the corresponding M-estimator.
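The quadratic-loss statement above can be checked directly in the fully Gaussian case: with a Gaussian prior and Gaussian likelihood, the posterior is Gaussian, and raising it to an inverse temperature rescales only its covariance, not its mean, which coincides with the ridge estimate. The following is a minimal numerical sketch of that identity; the dimensions, noise variance `sigma2`, and prior variance `tau2` are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 10
X = rng.standard_normal((n, d))           # Gaussian covariates
beta_true = rng.standard_normal(d)
sigma2 = 0.5                              # noise variance (assumed known here)
y = X @ beta_true + np.sqrt(sigma2) * rng.standard_normal(n)

tau2 = 1.0                                # Gaussian prior variance
lam = sigma2 / tau2                       # equivalent ridge penalty

# Ridge estimator: argmin_b ||y - Xb||^2 + lam * ||b||^2
ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Posterior at inverse temperature beta is N(mu, Sigma / beta) with
#   Sigma = (X^T X / sigma2 + I / tau2)^{-1},  mu = Sigma X^T y / sigma2.
# The mean mu is independent of beta and equals the ridge estimate.
Sigma = np.linalg.inv(X.T @ X / sigma2 + np.eye(d) / tau2)
post_mean = Sigma @ X.T @ y / sigma2

print(np.allclose(ridge, post_mean))      # the two estimators coincide
```

For a non-quadratic loss such as the absolute value, the posterior is log-concave but not Gaussian, and its mean generally does depend on the temperature, which is why a finite optimal temperature can emerge there.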
| Original language | English (US) |
|---|---|
| Article number | iaaf019 |
| Journal | Information and Inference |
| Volume | 14 |
| Issue number | 3 |
| DOIs | |
| State | Published - Sep 1 2025 |
Bibliographical note
Publisher Copyright: © 2025 Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications.
Keywords
- disordered systems
- high-dimensional Bayesian inference
- statistical mechanics