Paper title
Feature Distillation With Guided Adversarial Contrastive Learning
Paper authors
Paper abstract
Deep learning models have been shown to be vulnerable to adversarial examples. Although adversarial training can enhance model robustness, typical approaches are computationally expensive. Recent work has proposed transferring robustness to adversarial attacks across different tasks or models using soft labels. Compared to soft labels, features contain rich semantic information and have the potential to be applied to different downstream tasks. In this paper, we propose a novel approach called Guided Adversarial Contrastive Distillation (GACD) to effectively transfer adversarial robustness from teacher to student using features. We first formulate this objective as contrastive learning and connect it with mutual information. With a well-trained teacher model as an anchor, the student is expected to extract features similar to the teacher's. Then, considering the potential errors made by the teacher, we propose sample reweighted estimation to eliminate the teacher's negative effects. With GACD, the student not only learns to extract robust features but also captures structural knowledge from the teacher. Through extensive experiments on popular datasets such as CIFAR-10, CIFAR-100 and STL-10, we demonstrate that our approach can effectively transfer robustness across different models and even different tasks, and achieves comparable or better results than existing methods. In addition, we provide a detailed analysis of various methods, showing that students produced by our approach capture more structural knowledge from teachers and learn more robust features under adversarial attacks.
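The abstract describes the objective only at a high level, so the following is a minimal sketch of the general idea, not the paper's actual formulation. It assumes an InfoNCE-style contrastive loss in which each teacher feature is the anchor, the student feature for the same sample is the positive, and the other student features in the batch are negatives. The per-sample `weights` stand in for the "sample reweighted estimation"; how the paper computes them is not stated here, so treating them as something like the teacher's confidence in the true label is an assumption. The function name, the `temperature` parameter, and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_contrastive_loss(teacher_feats, student_feats, weights, temperature=0.1):
    """InfoNCE-style loss with per-sample weights (illustrative assumption).

    teacher_feats[i] is the anchor, student_feats[i] is its positive, and the
    remaining student features in the batch act as negatives. weights[i]
    scales how much sample i contributes, so samples on which the teacher is
    unreliable can be down-weighted.
    """
    t = [l2_normalize(v) for v in teacher_feats]
    s = [l2_normalize(v) for v in student_feats]
    total = 0.0
    for i, anchor in enumerate(t):
        logits = [dot(anchor, sj) / temperature for sj in s]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of picking the matching student feature.
        total += weights[i] * (log_denom - logits[i])
    return total / sum(weights)


# Toy batch of two samples with 2-D features (made-up numbers).
teacher = [[1.0, 0.0], [0.0, 1.0]]
student_good = [[0.9, 0.1], [0.1, 0.9]]  # roughly aligned with the teacher
student_bad = [[0.1, 0.9], [0.9, 0.1]]   # aligned with the wrong sample

loss_good = weighted_contrastive_loss(teacher, student_good, [1.0, 1.0])
loss_bad = weighted_contrastive_loss(teacher, student_bad, [1.0, 1.0])
print(loss_good < loss_bad)
```

In this sketch a student whose features match the teacher's sample by sample gets a lower loss than one whose features match the wrong samples. In practice the same computation would be done on batched tensors in a deep learning framework, with adversarial examples fed to the student; those details are left out here.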