Paper Title
Bluff: Interactively Deciphering Adversarial Attacks on Deep Neural Networks
Paper Authors
Paper Abstract
Deep neural networks (DNNs) are now commonly used in many domains. However, they are vulnerable to adversarial attacks: carefully crafted perturbations on data inputs that can fool a model into making incorrect predictions. Despite significant research on developing DNN attack and defense techniques, people still lack an understanding of how such attacks penetrate a model's internals. We present Bluff, an interactive system for visualizing, characterizing, and deciphering adversarial attacks on vision-based neural networks. Bluff allows people to flexibly visualize and compare the activation pathways for benign and attacked images, revealing mechanisms that adversarial attacks employ to inflict harm on a model. Bluff is open-sourced and runs in modern web browsers.
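To make the abstract's "carefully crafted perturbations" concrete, below is a minimal sketch of the Fast Gradient Sign Method (FGSM), one widely studied attack of the kind Bluff visualizes. This is an illustration only, not code from Bluff itself; it assumes PyTorch with torchvision (version 0.13 or later for the `weights=` argument), and the model, epsilon value, and random input are placeholders.

```python
# A minimal FGSM sketch (Goodfellow et al., 2015): perturb an input in the
# direction that increases the model's loss, bounded by epsilon, so the
# prediction flips while the change stays nearly invisible to humans.
import torch
import torch.nn.functional as F
from torchvision import models

# Placeholder model; Bluff itself targets vision classifiers generally.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_attack(image: torch.Tensor, label: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarial copy of `image` (shape [1, 3, H, W],
    values in [0, 1]) nudged away from the true `label`."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step along the sign of the gradient, then keep pixels in range.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Usage: benign and attacked predictions often differ even though the
# perturbation is tiny.
x = torch.rand(1, 3, 224, 224)      # stand-in for a real, preprocessed image
y = model(x).argmax(dim=1)          # use the model's own prediction as label
x_adv = fgsm_attack(x, y)
print(model(x).argmax(dim=1).item(), model(x_adv).argmax(dim=1).item())
```

Comparing the activation pathways of `x` and `x_adv` across a network's layers is precisely the kind of benign-versus-attacked analysis Bluff supports interactively.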