TY - JOUR
T1 - Fractional ridge regression
T2 - a fast, interpretable reparameterization of ridge regression
AU - Rokem, Ariel
AU - Kay, Kendrick
N1 - Publisher Copyright:
© The Author(s) 2020. Published by Oxford University Press GigaScience.
Copyright:
This record is sourced from MEDLINE/PubMed, a database of the U.S. National Library of Medicine
PY - 2020/11/30
Y1 - 2020/11/30
AB - BACKGROUND: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates. However, efficient and appropriate selection of α can be challenging. This becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and correlations across predictors, it is also not straightforwardly interpretable. RESULTS: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets. CONCLUSION: Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations; and automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for analysis of large complex datasets.
KW - brain imaging
KW - general linear model
KW - hyperparameters
KW - open-source software
UR - http://www.scopus.com/inward/record.url?scp=85097004368&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097004368&partnerID=8YFLogxK
U2 - 10.1093/gigascience/giaa133
DO - 10.1093/gigascience/giaa133
M3 - Article
C2 - 33252656
AN - SCOPUS:85097004368
VL - 9
JO - GigaScience
JF - GigaScience
SN - 2047-217X
IS - 12
ER -