Paper Title
Towards Robust Stacked Capsule Autoencoder with Hybrid Adversarial Training
Paper Authors
Paper Abstract
Capsule networks (CapsNets) are a new class of neural networks that classify images based on the spatial relationships of their features. By analyzing the poses of features and their relative positions, CapsNets are better able to recognize images after affine transformations. The Stacked Capsule Autoencoder (SCAE) is a state-of-the-art CapsNet that achieved unsupervised classification with capsule networks for the first time. However, the security vulnerabilities and robustness of the SCAE have rarely been explored. In this paper, we propose an evasion attack against the SCAE, in which the attacker generates an adversarial perturbation that reduces the contribution of the object capsules associated with the image's original category. The perturbation is then applied to the original image, causing the perturbed image to be misclassified. Furthermore, we propose a defense method called Hybrid Adversarial Training (HAT) against such evasion attacks. HAT combines adversarial training with adversarial distillation to achieve better robustness and stability. We evaluate the defense method, and the experimental results show that the refined SCAE model achieves 82.14% classification accuracy under the evasion attack. The source code is available at https://github.com/FrostbiteXSW/SCAE_Defense.
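
The abstract describes the attack as gradient-based generation of a perturbation that suppresses the presence of the object capsules tied to the image's true class. The sketch below illustrates that idea in PyTorch; it is a minimal approximation, not the paper's implementation (the paper's code uses its own SCAE interface). The `model` callable, `true_capsule_idx` indexing, and the L-infinity budget `epsilon` are all illustrative assumptions.

```python
import torch

def evasion_attack(model, image, true_capsule_idx, epsilon=0.1, steps=100, lr=0.01):
    """Sketch of the evasion attack: optimize a bounded perturbation that
    minimizes the presence (contribution) of the object capsules associated
    with the image's original class, so the perturbed image is misclassified.

    Assumes `model(x)` returns object-capsule presence probabilities of
    shape (batch, num_object_capsules); these names are hypothetical.
    """
    delta = torch.zeros_like(image, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        presence = model(image + delta)                  # capsule presences on perturbed input
        loss = presence[:, true_capsule_idx].sum()       # contribution of true-class capsules
        optimizer.zero_grad()
        loss.backward()                                  # descend to *reduce* that contribution
        optimizer.step()
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)              # keep the perturbation small
            delta.copy_((image + delta).clamp(0.0, 1.0) - image)  # keep pixels in valid range

    return (image + delta).detach()
```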
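
HAT is described as combining adversarial training with adversarial distillation for robustness and stability. The following is one plausible shape of a single HAT-style update, again as a hedged sketch: the mixing weight `alpha`, the softened-KL distillation term with `temperature`, the frozen `teacher` copy, and the `model.loss` handle for the SCAE's own unsupervised training objective are all assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def hat_step(model, teacher, optimizer, batch, attack_fn, alpha=0.5, temperature=4.0):
    """One hybrid-adversarial-training-style step (sketch):
    train on freshly crafted adversarial examples (adversarial training)
    while pulling the model's capsule presences toward a frozen teacher's
    outputs on clean inputs (adversarial distillation)."""
    adv_batch = attack_fn(model, batch)          # craft adversarial examples on the fly

    student_logits = model(adv_batch)            # capsule presences on perturbed inputs
    with torch.no_grad():
        teacher_logits = teacher(batch)          # frozen teacher on clean inputs

    # Adversarial-training term: the model's own objective on adversarial inputs
    # (`model.loss` is a hypothetical handle for the SCAE training loss).
    adv_loss = model.loss(adv_batch)

    # Distillation term: match the teacher's softened capsule distribution.
    distill_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    loss = alpha * adv_loss + (1.0 - alpha) * distill_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The distillation term acts as a stabilizer: without it, training purely on adversarial examples can drift away from the clean-data solution, which matches the abstract's claim that HAT targets both robustness and stability.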