Abstract
Synthetic data generation has become an emerging tool to help improve the adversarial robustness in classification tasks, since robust learning requires a significantly larger amount of training samples compared with standard classification. Among various deep generative models, the diffusion model has been shown to produce high-quality synthetic images and has achieved good performance in improving the adversarial robustness. However, diffusion-type methods are generally slower in data generation as compared with other generative models. Although different acceleration techniques have been proposed recently, it is also of great importance to study how to improve the sample efficiency of synthetic data for the downstream task. In this paper, we first analyze the optimality condition of synthetic distribution for achieving improved robust accuracy. We show that enhancing the distinguishability among the generated data is critical for improving adversarial robustness. Thus, we propose the Contrastive-Guided Diffusion Process (Contrastive-DP), which incorporates the contrastive loss to guide the diffusion model in data generation. We validate our theoretical results using simulations and demonstrate the good performance of Contrastive-DP on image datasets.
Original language | English (US) |
---|---|
Pages (from-to) | 26699-26723 |
Number of pages | 25 |
Journal | Proceedings of Machine Learning Research |
Volume | 202 |
State | Published - 2023 |
Externally published | Yes |
Event | 40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States Duration: Jul 23 2023 → Jul 29 2023 |
Bibliographical note
Publisher Copyright:© 2023 Proceedings of Machine Learning Research. All rights reserved.