Many modern machine learning algorithms, such as generative adversarial networks (GANs) and adversarial training, can be formulated as minimax optimization problems. Gradient descent ascent (GDA) is the most commonly used algorithm because of its simplicity. However, GDA can converge to non-optimal minimax points. We propose a new minimax optimization framework, GDA-AM, that views the GDA dynamics as a fixed-point iteration and solves it using Anderson Mixing to converge to the local minimax. It addresses the divergence issue of simultaneous GDA and accelerates the convergence of alternating GDA. We show theoretically that the algorithm can achieve global convergence for bilinear problems under mild conditions. We also show empirically that GDA-AM solves a variety of minimax problems and improves adversarial training on several datasets. Code is available on GitHub.
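To make the fixed-point view concrete, here is a minimal NumPy sketch. It is an illustration only and not the authors' released code. It treats one alternating GDA step on a toy bilinear objective f(x, y) = x^T A y as a fixed-point map w = g(w) and applies a textbook Anderson Mixing update with mixing parameter 1. The function names `alt_gda_step` and `anderson_mixing`, the window size, the step size, and the 1×1 test matrix are all assumptions chosen for the example.

```python
import numpy as np


def alt_gda_step(w, A, eta):
    """One alternating GDA step on f(x, y) = x^T A y (min over x, max over y)."""
    n = A.shape[0]
    x, y = w[:n], w[n:]
    x_new = x - eta * (A @ y)          # descent step on x
    y_new = y + eta * (A.T @ x_new)    # ascent step on y, using the updated x
    return np.concatenate([x_new, y_new])


def anderson_mixing(g, w0, window=5, max_iter=50, tol=1e-10):
    """Textbook Anderson Mixing (mixing parameter 1) for the problem w = g(w)."""
    w = np.asarray(w0, dtype=float)
    prev_w = prev_f = None
    dW, dF = [], []                    # recent differences of iterates and residuals
    for _ in range(max_iter):
        f = g(w) - w                   # fixed-point residual
        if np.linalg.norm(f) < tol:
            break
        if prev_w is not None:
            dW.append(w - prev_w)
            dF.append(f - prev_f)
            dW, dF = dW[-window:], dF[-window:]   # keep only the last `window` pairs
        prev_w, prev_f = w, f
        if dF:
            DW, DF = np.column_stack(dW), np.column_stack(dF)
            # Least-squares mixing coefficients: minimize ||f - DF @ gamma||
            gamma = np.linalg.lstsq(DF, f, rcond=None)[0]
            w = w + f - (DW + DF) @ gamma
        else:
            w = w + f                  # no history yet: plain fixed-point step
    return w


# Toy problem (an assumption for illustration): f(x, y) = x * y, saddle point at (0, 0).
A = np.array([[1.0]])
eta = 0.5
w0 = np.array([1.0, 1.0])

# Plain alternating GDA for comparison.
w_plain = w0.copy()
for _ in range(50):
    w_plain = alt_gda_step(w_plain, A, eta)

# The same update map, wrapped in Anderson Mixing.
w_am = anderson_mixing(lambda w: alt_gda_step(w, A, eta), w0)

print("plain alternating GDA, distance to saddle:", np.linalg.norm(w_plain))
print("with Anderson Mixing, distance to saddle: ", np.linalg.norm(w_am))
```

On this toy problem the plain alternating iterates keep circling the saddle point, while the Anderson Mixing iterates reach it after a few steps because the update map is linear and only two-dimensional. The sketch says nothing about behavior on larger or nonlinear problems.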
|Original language||English (US)|
|State||Published - 2022|
|Event||10th International Conference on Learning Representations, ICLR 2022 - Virtual, Online (Apr 25 2022 → Apr 29 2022)|
|Conference||10th International Conference on Learning Representations, ICLR 2022|
|Period||4/25/22 → 4/29/22|
Bibliographical note
Funding Information:
This work was funded in part by NSF grants OAC 2003720 and IIS 1838200 and by NIH grants 5R01LM013323-03 and 5K01LM012924-03.
© 2022 ICLR 2022 - 10th International Conference on Learning Representations. All rights reserved.