Generalized adversarial defense against unseen attacks: a survey
2024, Vol. 29, No. 7: 1787-1813
Print publication date: 2024-07-16
DOI: 10.11834/jig.230423
Zhou Dawei, Xu Yibo, Wang Nannan, Liu Decheng, Peng Chunlei, Gao Xinbo. 2024. Generalized adversarial defense against unseen attacks: a survey. Journal of Image and Graphics, 29(7): 1787-1813
In computer vision, an adversarial example is an input containing perturbations carefully crafted by an attacker: the difference from the corresponding natural example is usually imperceptible to the human eye, yet it can easily cause a deep learning model to produce wrong outputs. This vulnerability of deep learning models has drawn wide attention, and adversarial defense techniques have developed rapidly in response. However, as attack techniques and application environments continue to evolve, achieving robustness against only specific types of adversarial perturbations can no longer meet the performance requirements of deep learning models. An urgent open problem is therefore to defend, in one shot, against arbitrary kinds of unseen attacks through more efficient training schemes and fewer training runs, while depending on adversarial examples as little as possible. The unseen attacks to be defended against should be as unknown as possible, differing as thoroughly as possible, in both principle and performance, from the attacks introduced during training. To clarify the state of the art in defending against unseen attacks, this survey comprehensively and systematically summarizes the research in this area around the above defense goal. We first briefly introduce the research background and the difficulties and challenges facing defense research. We divide defenses against unseen adversarial attacks into training mechanism-oriented methods and model architecture-oriented methods. For training mechanism-oriented methods, we review related work from three perspectives according to the underlying training framework of the defense model: adversarial training, standard (natural) training, and contrastive learning. For model architecture-oriented methods, we analyze related studies from two perspectives according to how the model structure is modified: target-model structure optimization and input data pre-processing. Finally, we analyze the patterns of existing research on defending against unseen attacks, introduce other related defense research directions, and outline the overall development trends of the field. Unlike general surveys on adversarial defense, this survey focuses on investigating and analyzing defenses against attacks with very strong unseen-ness, which places higher demands on the generalizability and universality of defense mechanisms; we hope it provides useful insights for future research on defense mechanisms.
Deep learning-based models have achieved impressive breakthroughs in various areas in recent years. However, they are vulnerable to inputs corrupted by imperceptible adversarial noise, which can easily lead to wrong outputs. To tackle this problem, many defense methods have been proposed to mitigate the effects of such threats on deep neural networks. As adversaries continue to develop new techniques for disrupting model performance, an increasing number of attacks emerge that are unseen by the model during training, so defense mechanisms that protect against only specific types of adversarial perturbations are becoming less reliable. The ability of a model to defend generally against various unseen attacks becomes pivotal. Unseen attacks should differ from the attacks used during training as much as possible in principle and attack performance, rather than being mere parameter adjustments of the same attack method. The core goal is to defend against arbitrary attacks via efficient training procedures, while keeping the defense as independent as possible of the adversarial attacks used during training. This survey aims to summarize and analyze existing adversarial defense methods against unseen adversarial attacks. We first briefly review the background of defending against unseen attacks. One main reason a model can be robust against unseen attacks is that it extracts robust features through a specially designed training mechanism, without an explicitly designed defense mechanism with special internal structures; alternatively, robustness can be achieved by modifying the model's structure or adding extra modules. We therefore divide existing methods into two categories: training mechanism-based defense and model structure-based defense. The former mainly seeks to improve the quality of the robust features extracted by the model through its training process. 1) Adversarial training is one of the most effective adversarial defense strategies, but it easily overfits to specific types of adversarial noise. Well-designed training attacks can explicitly improve the model's ability to explore the perturbation space, helping it learn more representative features than traditional adversarial attacks do. Adding regularization terms is another way to obtain robust models by improving the robust features learned in the basic training process. We further introduce adversarial training-based methods that borrow knowledge from other domains, such as domain adaptation, pre-training, and fine-tuning. Because different examples contribute differently to robustness, example reweighting is also a way to achieve robustness against attacks. 2) Standard training is the most basic training method in deep learning. Data augmentation methods enrich the example diversity of standard training, while adding regularization terms to standard training aims to stabilize the model's outputs; the pre-training strategy aims to achieve a robust model within a predefined perturbation bound. 3) We also find that contrastive learning is a useful strategy, as its core idea of feature similarity matches well with the goal of acquiring representative robust features.
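To make the training mechanism-based ideas above concrete, the PyTorch sketch below pairs a PGD-style inner maximization with a TRADES-style KL consistency regularizer. It is a minimal illustration under stated assumptions, not the exact procedure of any single surveyed method: `model`, `loader`, and `optimizer` are hypothetical placeholders, inputs are assumed to lie in [0, 1], and the hyperparameters (an 8/255 L-infinity budget, 10 attack steps, weight `beta`) are common but illustrative choices.

```python
# Minimal sketch: PGD adversarial training with a KL consistency regularizer.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Inner maximization: craft L-infinity bounded adversarial examples."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)       # project into the eps-ball
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv

def train_epoch(model, loader, optimizer, beta=6.0):
    """Outer minimization: natural loss plus a consistency regularization term."""
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)
        optimizer.zero_grad()
        logits_nat, logits_adv = model(x), model(x_adv)
        natural_loss = F.cross_entropy(logits_nat, y)
        # Regularizer: keep outputs stable under perturbation (TRADES-style KL term).
        consistency = F.kl_div(F.log_softmax(logits_adv, dim=1),
                               F.softmax(logits_nat, dim=1),
                               reduction="batchmean")
        (natural_loss + beta * consistency).backward()
        optimizer.step()
```

Within such a loop, the training attack can be replaced or diversified (random starts, multiple norms, learned perturbations) to reduce overfitting to one perturbation type, which is exactly the concern the surveyed training mechanism-based methods address.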
Model structure-based defense, meanwhile, mainly addresses intrinsic drawbacks of the model's structure. It is divided into target-network structure optimization methods and input data pre-processing methods, according to how the structure is modified. 1) Target-network structure optimization aims to enhance the model's ability to obtain useful information from inputs and features, because the network itself is susceptible to variations in them. 2) Input data pre-processing focuses on eliminating threats from examples before they are fed into the target network. Removing adversarial noise from inputs and detecting adversarial examples in order to reject them are two popular strategies, because they are easy to model and rely less on adversarial training examples than methods such as adversarial training do (a minimal sketch of this purify-and-detect idea is given after this abstract). Finally, we analyze the research trends in this area and summarize work in related domains. 1) Defending well against multiple adversarial perturbations cannot guarantee robustness against arbitrary unseen attacks, but it does extend robustness beyond one specific type of perturbation. 2) With the development of defenses against unseen adversarial attacks, auxiliary tools such as acceleration modules have been proposed. 3) Defense against unseen common corruptions benefits practical deployment, because adversarial perturbations cannot represent the whole perturbation space of the real world. To summarize, a defense that withstands attacks totally different from those encountered during training exhibits stronger generalizability. The analysis centered on this goal distinguishes our work from traditional surveys on adversarial defense. We hope that this survey can further motivate research on defending against unseen adversarial attacks.
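As promised above, here is a minimal PyTorch sketch of the purify-and-detect idea from the input data pre-processing category: denoise the input before a frozen classifier, and flag inputs whose prediction shifts sharply after purification. `denoiser`, `classifier`, and the rejection threshold are assumed placeholders (e.g., a separately trained reconstruction network and a trained classifier), not components of any specific surveyed defense.

```python
# Minimal sketch: input purification plus a prediction-shift detection heuristic.
import torch
import torch.nn.functional as F

@torch.no_grad()
def purify_and_predict(denoiser, classifier, x, reject_threshold=0.5):
    x_clean = denoiser(x).clamp(0, 1)              # remove suspected adversarial noise
    p_raw = F.softmax(classifier(x), dim=1)
    p_clean = F.softmax(classifier(x_clean), dim=1)
    shift = (p_raw - p_clean).abs().sum(dim=1)     # L1 gap between the two predictions
    is_adversarial = shift > reject_threshold      # large shift suggests an attack
    return p_clean.argmax(dim=1), is_adversarial
```

In practice the denoiser would be trained separately (for example, to reconstruct natural examples) and the threshold tuned on held-out data; the point here is only that purification and detection can be modeled independently of any particular attack, which is why these strategies generalize comparatively well to unseen perturbations.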
Keywords: adversarial defense; unseen adversarial attacks; adversarial training; data pre-processing; deep learning