Understanding Backdoor Attacks through the Adaptability Hypothesis

Xun Xian, Ganghua Wang, Jayanth Srinivasa, Ashish Kundu, Xuan Bi, Mingyi Hong, Jie Ding

Research output: Contribution to journal › Conference article › peer-review

4 Scopus citations

Abstract

A poisoning backdoor attack is a rising security concern for deep learning. This type of attack causes the backdoored model to function normally most of the time while exhibiting abnormal behavior when presented with inputs containing the backdoor trigger, which makes the attack difficult to detect and prevent. In this work, we propose the adaptability hypothesis to understand when and why a backdoor attack works for general learning models, including deep neural networks, based on a theoretical investigation of classical kernel-based learning models. The adaptability hypothesis postulates that for an effective attack, the effect of incorporating a new dataset on the predictions of the original data points will be small, provided that the original data points are distant from the new dataset. Experiments on benchmark image datasets and state-of-the-art backdoor attacks for deep neural networks are conducted to corroborate the hypothesis. Our findings provide insight into the factors that affect an attack's effectiveness and have implications for the design of future attacks and defenses.
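The following is a minimal sketch, not taken from the paper, of the intuition behind the adaptability hypothesis in a kernel-based setting: with an RBF-kernel ridge regressor, adding new (poisoned) points that lie far from the clean data barely changes the fitted predictions at the clean points, whereas nearby poisoned points shift them noticeably. All function names, parameter values, and the choice of kernel ridge regression here are illustrative assumptions, not the authors' experimental setup.

```python
# Toy illustration (assumed setup, not the paper's method): compare how much
# predictions on clean points change after augmenting training data with
# "poisoned" points placed near vs. far from the clean data.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel values between rows of A and rows of B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def fit_predict(X_train, y_train, X_test, lam=1e-3, gamma=1.0):
    # Kernel ridge regression: alpha = (K + lam*I)^{-1} y, f(x) = k(x, X_train) alpha.
    K = rbf_kernel(X_train, X_train, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train, gamma) @ alpha

rng = np.random.default_rng(0)
X_clean = rng.normal(size=(100, 2))              # clean data near the origin
y_clean = np.sin(X_clean[:, 0])                  # smooth target on clean data
base_pred = fit_predict(X_clean, y_clean, X_clean)

for shift in [0.5, 5.0]:                         # near vs. far "trigger" region
    X_poison = rng.normal(size=(20, 2)) + shift  # poisoned points offset by `shift`
    y_poison = np.ones(20)                       # attacker-chosen target value
    X_aug = np.vstack([X_clean, X_poison])
    y_aug = np.concatenate([y_clean, y_poison])
    new_pred = fit_predict(X_aug, y_aug, X_clean)
    print(f"shift={shift}: max change on clean predictions = "
          f"{np.max(np.abs(new_pred - base_pred)):.4f}")
```

Under these assumptions, the far-away poisoned points (larger shift) leave the clean-point predictions nearly unchanged, mirroring the hypothesis that an attack is effective when the trigger-bearing data is distant from the original data distribution.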

Original language: English (US)
Pages (from-to): 37952-37976
Number of pages: 25
Journal: Proceedings of Machine Learning Research
Volume: 202
State: Published - 2023
Event: 40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States
Duration: Jul 23, 2023 - Jul 29, 2023

Bibliographical note

Publisher Copyright:
© 2023 Proceedings of Machine Learning Research. All rights reserved.
