Paper Title
Improving Adversarial Robustness via Probabilistically Compact Loss with Logit Constraints
Paper Authors
Paper Abstract
Convolutional neural networks (CNNs) have achieved state-of-the-art performance on various computer vision tasks. However, recent studies demonstrate that these models are vulnerable to carefully crafted adversarial samples and suffer a significant performance drop when predicting them. Many methods have been proposed to improve adversarial robustness (e.g., adversarial training and new loss functions that learn adversarially robust feature representations). Here we offer a unique insight into the predictive behavior of CNNs: they tend to misclassify adversarial samples into the most probable false classes. This inspires us to propose a new Probabilistically Compact (PC) loss with logit constraints, which can be used as a drop-in replacement for the cross-entropy (CE) loss to improve a CNN's adversarial robustness. Specifically, the PC loss enlarges the probability gaps between the true class and the false classes, while the logit constraints prevent the gaps from being erased by a small perturbation. We extensively compare our method with the state-of-the-art on large-scale datasets under both white-box and black-box attacks to demonstrate its effectiveness. The source code is available at https://github.com/xinli0928/PC-LC.
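For intuition only, below is a minimal PyTorch sketch of a hinge-style probability-gap penalty in the spirit of the PC loss described in the abstract. The function names (`pc_loss`, `constrained_logits`), the `margin` and `max_norm` parameters, and the exact formulation are illustrative assumptions rather than the authors' implementation; consult the linked repository for the actual code.

```python
import torch
import torch.nn.functional as F

def pc_loss(logits, targets, margin=0.1):
    """Sketch of a probability-gap penalty (hypothetical formulation).

    Penalizes every false class whose softmax probability comes within
    `margin` of the true-class probability, thereby enlarging the gap
    between the true class and its most probable competitors.
    """
    probs = F.softmax(logits, dim=1)                    # (batch, classes)
    true_prob = probs.gather(1, targets.unsqueeze(1))   # (batch, 1)
    gaps = torch.clamp(margin + probs - true_prob, min=0.0)
    mask = F.one_hot(targets, num_classes=logits.size(1)).bool()
    gaps = gaps.masked_fill(mask, 0.0)                  # ignore the true class
    return gaps.sum(dim=1).mean()

def constrained_logits(logits, max_norm=10.0):
    """One plausible way to bound logit magnitudes (an assumption, not
    necessarily the paper's constraint): rescale any logit vector whose
    L2 norm exceeds `max_norm`, so probability gaps cannot be inflated
    simply by scaling the logits up."""
    norms = logits.norm(dim=1, keepdim=True).clamp(min=1e-12)
    scale = (max_norm / norms).clamp(max=1.0)
    return logits * scale

# Example: compute the loss on a random batch.
logits = torch.randn(8, 10)            # batch of 8, 10 classes
targets = torch.randint(0, 10, (8,))   # random labels
loss = pc_loss(constrained_logits(logits), targets)
```

The hinge form reflects the insight stated in the abstract: only false classes whose probability approaches the true class's (i.e., the most probable false classes) contribute to the loss, and bounding the logit norm keeps the learned gaps meaningful under small input perturbations.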