Paper Title

Encoding Robustness to Image Style via Adversarial Feature Perturbations

Paper Authors

Manli Shu, Zuxuan Wu, Micah Goldblum, Tom Goldstein

Paper Abstract

Adversarial training is the industry standard for producing models that are robust to small adversarial perturbations. However, machine learning practitioners need models that are robust to other kinds of changes that occur naturally, such as changes in the style or illumination of input images. Such changes in input distribution have been effectively modeled as shifts in the mean and variance of deep image features. We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce models that are robust to various unseen distributional shifts. We explore the relationship between these perturbations and distributional shifts by visualizing adversarial features. Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training. By fine-tuning neural networks on adversarial feature distributions, we observe improved robustness of networks to various unseen distributional shifts, including style variations and image corruptions. In addition, we show that our proposed adversarial feature perturbation can be complementary to existing image space data augmentation methods, leading to improved performance. The source code and pre-trained models are released at https://github.com/azshue/AdvBN.
