Paper Title
Evaluating a Simple Retraining Strategy as a Defense Against Adversarial Attacks
Paper Authors
Paper Abstract
Though deep neural networks (DNNs) have shown superiority over other techniques in major fields such as computer vision, natural language processing, and robotics, they have recently been proven vulnerable to adversarial attacks. Adding a simple, small, and almost invisible perturbation to the original input image can fool a DNN into making a wrong decision. As more attack algorithms are designed, the need to defend neural networks against such attacks arises. Retraining the network with adversarial images is one of the simplest defense techniques. In this paper, we evaluate the effectiveness of such a retraining strategy in defending against adversarial attacks. We also show how simple algorithms like KNN can be used to determine the labels of the adversarial images needed for retraining. We present results on two standard datasets, namely CIFAR-10 and TinyImageNet.
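To illustrate the kind of small, almost invisible perturbation the abstract describes, below is a minimal sketch of a gradient-sign (FGSM-style) attack in PyTorch. The abstract does not name a specific attack algorithm, so the toy model, the `epsilon` budget, and the `fgsm_perturb` helper are all illustrative assumptions rather than the paper's method.

```python
import torch
import torch.nn as nn

def fgsm_perturb(model, images, labels, epsilon=8 / 255):
    """Craft adversarial examples via the fast gradient sign method (assumed attack):
    x_adv = clip(x + epsilon * sign(grad_x loss), 0, 1)."""
    images = images.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(images), labels)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to valid pixel range.
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0.0, 1.0).detach()

if __name__ == "__main__":
    # Toy CIFAR-10-shaped classifier; stands in for a trained DNN.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    x = torch.rand(4, 3, 32, 32)
    y = torch.randint(0, 10, (4,))
    x_adv = fgsm_perturb(model, x, y)
    print("max perturbation:", (x_adv - x).abs().max().item())  # bounded by epsilon
```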
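The KNN labelling step mentioned in the abstract could look roughly like the following sketch: fit a k-nearest-neighbour classifier on the clean training set and use it to assign labels to adversarial images before adding them to the retraining pool. The paper's exact procedure is not given here, so the feature space (raw flattened pixels), the choice of k, and the `label_adversarial_with_knn` helper are assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def label_adversarial_with_knn(x_clean, y_clean, x_adv, k=5):
    """Assign labels to adversarial images by a k-NN vote over the clean training set.

    Images are flattened to vectors here; a learned feature space could be used instead.
    """
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(x_clean.reshape(len(x_clean), -1), y_clean)
    return knn.predict(x_adv.reshape(len(x_adv), -1))

# Build a retraining set: clean data plus KNN-labelled adversarial images.
rng = np.random.default_rng(0)
x_clean = rng.random((100, 3, 32, 32), dtype=np.float32)  # placeholder clean images
y_clean = rng.integers(0, 10, 100)                        # placeholder clean labels
x_adv = rng.random((20, 3, 32, 32), dtype=np.float32)     # placeholder adversarial images
y_adv = label_adversarial_with_knn(x_clean, y_clean, x_adv)
x_retrain = np.concatenate([x_clean, x_adv])
y_retrain = np.concatenate([y_clean, y_adv])
print(x_retrain.shape, y_retrain.shape)  # (120, 3, 32, 32) (120,)
```

The network would then be retrained (or fine-tuned) on `x_retrain` and `y_retrain` in the usual way; the abstract evaluates how well this simple strategy holds up against adversarial attacks.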