Abstract
Relying on prior knowledge accumulated from related tasks, meta-learning offers a powerful approach to learning a novel task from limited training data. Recent approaches parameterize the prior with a family of probability density functions or recurrent neural networks, whose parameters can be optimized by utilizing validation data from the observed tasks. While these approaches have appealing empirical performance, the expressiveness of their prior is relatively low, which limits the generalization and interpretation of meta-learning. Aiming at expressive yet meaningful priors, this contribution puts forth a novel prior representation model that leverages the notion of algorithm unrolling. The key idea is to unroll the proximal gradient descent steps, where learnable piecewise linear functions are developed to approximate the desired proximal operators within tight theoretical error bounds established for both smooth and non-smooth proximal functions. The resultant multi-block neural network not only broadens the scope of learnable priors, but also enhances interpretability from an optimization viewpoint. Numerical tests conducted on few-shot learning datasets demonstrate markedly improved performance with flexible, visualizable, and understandable priors.
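To make the unrolling idea concrete, the sketch below shows one way such a prior could be instantiated: a few proximal gradient descent (PGD) steps unrolled into a multi-block network, where each block applies a learnable, elementwise piecewise-linear map in place of the proximal operator. This is a minimal illustrative sketch in PyTorch, not the paper's exact architecture; the class names, knot parameterization, quadratic data-fit term, and fixed step size are assumptions made only for this example.

```python
# Illustrative sketch (assumptions, not the authors' architecture):
# unrolled PGD where each proximal step is a learnable piecewise-linear map.
import torch
import torch.nn as nn


class PiecewiseLinearProx(nn.Module):
    """Learnable piecewise-linear function applied elementwise, intended to
    approximate a proximal operator on a bounded interval [-c, c]."""

    def __init__(self, num_knots: int = 33, c: float = 5.0):
        super().__init__()
        # Fixed, evenly spaced knots; only the values at the knots are learned.
        self.register_buffer("knots", torch.linspace(-c, c, num_knots))
        self.values = nn.Parameter(torch.linspace(-c, c, num_knots))  # identity init

    def forward(self, x):
        knots, values = self.knots, self.values
        x = x.clamp(knots[0], knots[-1])
        # Find the segment each entry falls into, then interpolate linearly.
        idx = torch.bucketize(x, knots[1:-1])      # segment index in [0, K-2]
        x0, x1 = knots[idx], knots[idx + 1]
        y0, y1 = values[idx], values[idx + 1]
        w = (x - x0) / (x1 - x0)
        return y0 + w * (y1 - y0)


class UnrolledPGD(nn.Module):
    """Unrolls a fixed number of PGD steps: a gradient step on a quadratic
    data-fit term, followed by a learnable proximal block (one per step)."""

    def __init__(self, num_steps: int = 5, step_size: float = 0.1):
        super().__init__()
        self.step_size = step_size
        self.prox_blocks = nn.ModuleList(
            [PiecewiseLinearProx() for _ in range(num_steps)]
        )

    def forward(self, w0, A, b):
        # Approximately minimizes 0.5 * ||A w - b||^2 plus an implicit
        # learned prior encoded by the piecewise-linear proximal blocks.
        w = w0
        for prox in self.prox_blocks:
            grad = A.T @ (A @ w - b)
            w = prox(w - self.step_size * grad)
        return w
```

In a meta-learning setting, the knot values of the piecewise-linear blocks would play the role of the learned prior, optimized across observed tasks (e.g., on per-task validation losses), while the unrolled forward pass performs adaptation to a new task; this usage pattern is paraphrased from the abstract rather than specified by it.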
Original language | English (US)
---|---
State | Published - 2024
Event | 12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria
Duration | May 7 2024 → May 11 2024
Conference
Conference | 12th International Conference on Learning Representations, ICLR 2024 |
---|---
Country/Territory | Austria |
City | Hybrid, Vienna |
Period | 5/7/24 → 5/11/24 |
Bibliographical note
Publisher Copyright: © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.