Cross-Fitted Residual Regression for High-Dimensional Heteroscedasticity Pursuit

Research output: Contribution to journal › Article › peer-review


There is a vast amount of work on high-dimensional regression. The common starting point for the existing theoretical work is to assume that the data-generating model is a homoscedastic linear regression model with some sparsity structure. In reality, the homoscedasticity assumption is often violated, and hence understanding the heteroscedasticity of the data is of critical importance. In this article, we systematically study the estimation of a high-dimensional heteroscedastic regression model. In particular, the emphasis is on how to detect and estimate the heteroscedasticity effects reliably and efficiently. To this end, we propose a cross-fitted residual regression approach, prove that the resulting estimator is selection consistent for heteroscedasticity effects, and establish its rates of convergence. Our estimator has tuning parameters that must be determined from the data in practice. We propose a novel high-dimensional BIC for tuning parameter selection and establish its consistency. This is the first high-dimensional BIC result under heteroscedasticity. The theoretical analysis is more involved than in the homoscedastic case because it must handle heteroscedasticity, and we develop several new concentration inequalities that are of independent interest.
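The abstract does not spell out the procedure, so the following is only a minimal sketch of what a cross-fitted residual regression could look like, not the authors' estimator. It assumes a log-linear variance model, a lasso fit for the mean, and a second lasso of log squared residuals on the same covariates. The function names (`lasso_cd`, `cross_fitted_residuals`, `variance_effects`), the penalty levels, and the two-fold split are all hypothetical choices made for illustration, and the lasso is written with the standard library only to keep the sketch dependency-free.

```python
import math
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Lasso with intercept via cyclic coordinate descent.

    Minimizes (1/2n) * sum_i (y_i - b0 - x_i'b)^2 + lam * ||b||_1
    and returns (b0, b).
    """
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    b = [0.0] * p
    r = [yi - b0 for yi in y]  # residuals of the current fit
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        shift = sum(r) / n  # unpenalized intercept update
        b0 += shift
        r = [ri - shift for ri in r]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # (1/n) x_j' (residual with coordinate j added back)
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= delta * X[i][j]
                b[j] = new
    return b0, b


def predict(X, b0, b):
    """Linear predictions b0 + x'b for each row x of X."""
    return [b0 + sum(xj * bj for xj, bj in zip(x, b)) for x in X]


def cross_fitted_residuals(X, y, lam_mean, n_folds=2, seed=0):
    """Out-of-fold residuals: each point's mean fit excludes its own fold."""
    n = len(X)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    resid = [0.0] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b0, b = lasso_cd([X[i] for i in train], [y[i] for i in train], lam_mean)
        fitted = predict([X[i] for i in held_out], b0, b)
        for i, f in zip(held_out, fitted):
            resid[i] = y[i] - f
    return resid


def variance_effects(X, resid, lam_var, eps=1e-8):
    """Assumed second stage: lasso of log squared residuals on X."""
    z = [math.log(e * e + eps) for e in resid]  # eps guards against log(0)
    return lasso_cd(X, z, lam_var)


def simulate(n=200, p=10, seed=1):
    """Simulated data (not from the paper): mean uses x_0, variance uses x_1."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * x[0] + math.exp(0.75 * x[1]) * rng.gauss(0, 1) for x in X]
    return X, y
```

A possible usage under these assumptions: call `X, y = simulate()`, then `resid = cross_fitted_residuals(X, y, lam_mean=0.1)`, then `g0, g = variance_effects(X, resid, lam_var=0.2)`. The nonzero entries of `g` are the covariates flagged as driving the variance. In this sketch the penalties are fixed by hand, whereas the article selects its tuning parameters with the proposed high-dimensional BIC.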

Original language: English (US)
Journal: Journal of the American Statistical Association
State: Accepted/In press - 2021

Bibliographical note

Publisher Copyright:
© 2021 American Statistical Association.


Keywords

  • HBIC
  • Heteroscedasticity
  • High dimension
  • Model selection criterion
  • Sparsity

