Scale-Invariant Adversarial Attack for Evaluating and Enhancing Adversarial Defenses

Paper Title

Scale-Invariant Adversarial Attack for Evaluating and Enhancing Adversarial Defenses

Authors

Xu, Mengting; Zhang, Tao; Li, Zhongnian; Zhang, Daoqiang

Abstract

Efficient and effective attacks are crucial both for reliably evaluating defenses and for developing robust models. The Projected Gradient Descent (PGD) attack has been shown to be one of the most successful adversarial attacks. However, the standard PGD attack can be easily weakened by rescaling the logits, even though the original decision for every input remains unchanged. To mitigate this issue, we propose the Scale-Invariant Adversarial Attack (SI-PGD), which uses the angle between the features in the penultimate layer and the weights in the softmax layer to guide the generation of adversarial examples. The cosine angle matrix is used to learn an angularly discriminative representation and does not change when the logits are rescaled, which makes the SI-PGD attack stable and effective. We evaluate our attack against multiple defenses and show improved performance compared with existing attacks. Further, we propose a Scale-Invariant (SI) adversarial defense mechanism based on the cosine angle matrix, which can be embedded into popular adversarial defenses. The experimental results show that defense methods equipped with our SI mechanism achieve state-of-the-art performance among multi-step and single-step defenses.
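The scale-sensitivity argument in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation, and the abstract does not specify the SI-PGD loss. The sketch assumes a bias-free linear softmax layer, models "rescaling the logits" as multiplying the class weights by a positive constant, and uses hypothetical toy values for `W` and `f`; all function names are illustrative. It shows that the predicted class is unchanged by rescaling, that the cross-entropy gradient a PGD step would follow collapses when the scale is large, and that the cosines between the feature vector and the class weights stay the same.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def logits(W, f, scale=1.0):
    """Logits z_k = scale * <w_k, f> for a bias-free linear layer (assumed form)."""
    return [scale * sum(wi * fi for wi, fi in zip(w, f)) for w in W]

def ce_grad_wrt_features(W, f, label, scale=1.0):
    """Gradient of cross-entropy w.r.t. f: scale * sum_k (p_k - y_k) * w_k."""
    p = softmax(logits(W, f, scale))
    coeff = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    return [scale * sum(c * w[i] for c, w in zip(coeff, W)) for i in range(len(f))]

def cosine_matrix_row(W, f, scale=1.0):
    """cos(theta_k) between f and each rescaled class weight vector w_k."""
    f_norm = math.sqrt(sum(x * x for x in f))
    row = []
    for w in W:
        ws = [scale * x for x in w]
        w_norm = math.sqrt(sum(x * x for x in ws))
        row.append(sum(a * b for a, b in zip(ws, f)) / (w_norm * f_norm))
    return row

def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(x * x for x in v))

# Toy values (hypothetical): 3 classes, 2-dimensional penultimate features.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
f = [2.0, 1.0]
label = 0

if __name__ == "__main__":
    for s in (1.0, 100.0):
        z = logits(W, f, s)
        print("scale", s,
              "| predicted class", z.index(max(z)),
              "| CE gradient norm", norm(ce_grad_wrt_features(W, f, label, s)),
              "| cosines", cosine_matrix_row(W, f, s))
```

With the larger scale the softmax saturates, so the gradient signal nearly vanishes even though the decision is identical, whereas the cosine row is the same at both scales. A loss built on these cosines would therefore be unaffected by this kind of rescaling, which is the property the abstract attributes to SI-PGD.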
