Sparse group lasso: Consistency and climate applications

Soumyadeep Chatterjee, Karsten Steinhaeuser, Arindam Banerjee, Snigdhansu B Chatterjee, Auroop Ganguly

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

36 Scopus citations

Abstract

The design of statistical predictive models for climate data gives rise to some unique challenges due to the high dimensionality and spatio-temporal nature of the datasets, which dictate that models should exhibit parsimony in variable selection. Recently, a class of methods that promote structured sparsity in the model has been developed, and these methods are well suited to this task. In this paper, we prove the statistical consistency of estimators with tree-structured norm regularizers. We consider one particular model, the Sparse Group Lasso (SGL), to construct predictors of land climate using ocean climate variables. Our experimental results demonstrate that the SGL model provides better predictive performance than the current state of the art, remains climatologically interpretable, and is robust in its variable selection.

Original language: English (US)
Title of host publication: Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
Pages: 47-58
Number of pages: 12
State: Published - Dec 1 2012
Event: 12th SIAM International Conference on Data Mining, SDM 2012 - Anaheim, CA, United States
Duration: Apr 26 2012 to Apr 28 2012

Publication series

Name: Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012

Other

Other: 12th SIAM International Conference on Data Mining, SDM 2012
Country: United States
City: Anaheim, CA
Period: 4/26/12 to 4/28/12

Keywords

  • Climate prediction
  • Sparse group lasso
  • Statistical consistency

Cite this

Chatterjee, S., Steinhaeuser, K., Banerjee, A., Chatterjee, S. B., & Ganguly, A. (2012). Sparse group lasso: Consistency and climate applications. In Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012 (pp. 47-58). (Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012).