TREND: Transferability based Robust ENsemble Design

Paper Title

TREND: Transferability based Robust ENsemble Design

Authors

Deepak Ravikumar, Sangamesh Kodge, Isha Garg, Kaushik Roy

Abstract

Deep learning models hold state-of-the-art performance in many fields, but their vulnerability to adversarial examples poses a threat to their ubiquitous deployment in practical settings. Additionally, adversarial inputs generated on one classifier have been shown to transfer to other classifiers trained on similar data, which makes attacks possible even if model parameters are not revealed to the adversary. This property of transferability has not yet been systematically studied, leading to a gap in our understanding of the robustness of neural networks to adversarial inputs. In this work, we study the effect of network architecture, initialization, optimizer, and input, weight, and activation quantization on the transferability of adversarial samples. We also study the effect of different attacks on transferability. Our experiments reveal that transferability is significantly hampered by input quantization and by architectural mismatch between source and target, and is unaffected by initialization, while the choice of optimizer turns out to be critical. We observe that transferability is architecture-dependent for both weight-quantized and activation-quantized models. To quantify transferability, we use a simple metric and demonstrate its utility in designing a methodology to build ensembles with improved adversarial robustness. When attacking ensembles, we observe that "gradient domination" by a single ensemble member model hampers existing attacks. To combat this, we propose a new state-of-the-art ensemble attack. We compare the proposed attack with existing attack techniques to show its effectiveness. Finally, we show that an ensemble consisting of carefully chosen diverse networks achieves better adversarial robustness than would otherwise be possible with a single network.
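
The abstract does not define the transferability metric it mentions. As an illustration only, the sketch below assumes one common convention: the fraction of adversarial inputs crafted on a source model that a target model also misclassifies. The function name `transfer_rate`, its parameters, and the toy threshold classifier are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence, TypeVar

X = TypeVar("X")


def transfer_rate(
    target_predict: Callable[[X], int],
    adversarial_inputs: Sequence[X],
    true_labels: Sequence[int],
) -> float:
    """Fraction of adversarial inputs that the target model misclassifies.

    Assumption: the inputs were crafted against a different (source) model,
    so a high value means the attack transfers well to the target.
    """
    if len(adversarial_inputs) != len(true_labels):
        raise ValueError("inputs and labels must have the same length")
    if len(adversarial_inputs) == 0:
        raise ValueError("at least one adversarial input is required")

    # Count the inputs whose target prediction differs from the true label.
    fooled = sum(
        1
        for x, y in zip(adversarial_inputs, true_labels)
        if target_predict(x) != y
    )
    return fooled / len(true_labels)


def toy_target(x: float) -> int:
    """Hypothetical stand-in for a target classifier: thresholds a scalar."""
    return int(x > 0.5)


# Four made-up adversarial inputs with their true labels.
rate = transfer_rate(toy_target, [0.2, 0.7, 0.9, 0.4], [1, 1, 0, 0])
print(rate)
```

Under this assumed definition, a lower rate between two models suggests they are less likely to be fooled by the same inputs, which is the kind of signal one might use when choosing diverse ensemble members.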
