Paper Title
PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning
Paper Authors
Paper Abstract
Patch-based attacks introduce a perceptible but localized change to the input that induces misclassification. A limitation of current patch-based black-box attacks is that they perform poorly for targeted attacks, and even for the less challenging non-targeted scenarios, they require a large number of queries. Our proposed PatchAttack is query efficient and can break models for both targeted and non-targeted attacks. PatchAttack induces misclassifications by superimposing small textured patches on the input image. We parametrize the appearance of these patches by a dictionary of class-specific textures. This texture dictionary is learned by clustering Gram matrices of feature activations from a VGG backbone. PatchAttack optimizes the position and texture parameters of each patch using reinforcement learning. Our experiments show that PatchAttack achieves > 99% success rate on ImageNet for a wide range of architectures, while only manipulating 3% of the image for non-targeted attacks and 10% on average for targeted attacks. Furthermore, we show that PatchAttack circumvents state-of-the-art adversarial defense methods successfully.
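Two of the abstract's building blocks are easy to illustrate: the Gram matrix of feature activations (whose clusters form the class-specific texture dictionary) and the superimposition of a small textured patch on an input image. The sketch below is a minimal, hypothetical illustration in NumPy, not the authors' implementation; the function names `gram_matrix` and `superimpose_patch` and the channel-first layout are assumptions, and in the paper the features would come from a VGG backbone rather than random arrays.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature activation map.

    features: array of shape (C, H, W), e.g. channel-first activations
    from one layer of a VGG backbone (hypothetical stand-in here).
    Returns a (C, C) matrix of channel-wise inner products, normalized
    by the number of spatial positions. Clustering such matrices over
    many images of one class yields a class-specific texture dictionary.
    """
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def superimpose_patch(image, patch, top, left):
    """Paste a small textured patch onto a copy of the image.

    image: (H, W, 3) array; patch: (h, w, 3) array. The patch position
    (top, left) and its texture are the parameters the attack would
    optimize with reinforcement learning.
    """
    out = image.copy()
    h, w, _ = patch.shape
    out[top:top + h, left:left + w] = patch
    return out
```

In the attack described by the abstract, an RL agent would query the black-box model on the patched image and use the classifier's response as a reward signal to refine `top`, `left`, and the texture index.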