Paper Title
Bluff: Interactively Deciphering Adversarial Attacks on Deep Neural Networks
Paper Authors
Paper Abstract
Deep neural networks (DNNs) are now commonly used in many domains. However, they are vulnerable to adversarial attacks: carefully crafted perturbations on data inputs that can fool a model into making incorrect predictions. Despite significant research on developing DNN attack and defense techniques, people still lack an understanding of how such attacks penetrate a model's internals. We present Bluff, an interactive system for visualizing, characterizing, and deciphering adversarial attacks on vision-based neural networks. Bluff allows people to flexibly visualize and compare the activation pathways for benign and attacked images, revealing mechanisms that adversarial attacks employ to inflict harm on a model. Bluff is open-sourced and runs in modern web browsers.
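To make the abstract's "carefully crafted perturbations" concrete, below is a minimal sketch of the Fast Gradient Sign Method (FGSM), one widely studied attack of the kind Bluff visualizes. This is an illustration only, not code from Bluff itself; it assumes PyTorch with torchvision (version 0.13 or later for the `weights=` argument), and the model, epsilon value, and random input are placeholders.

```python
# A minimal FGSM sketch (Goodfellow et al., 2015): perturb an input in the
# direction that increases the model's loss, bounded by epsilon, so the
# prediction flips while the change stays nearly invisible to humans.
import torch
import torch.nn.functional as F
from torchvision import models

# Placeholder model; Bluff itself targets vision classifiers generally.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_attack(image: torch.Tensor, label: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarial copy of `image` (shape [1, 3, H, W],
    values in [0, 1]) nudged away from the true `label`."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step along the sign of the gradient, then keep pixels in range.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Usage: benign and attacked predictions often differ even though the
# perturbation is tiny.
x = torch.rand(1, 3, 224, 224)      # stand-in for a real, preprocessed image
y = model(x).argmax(dim=1)          # use the model's own prediction as label
x_adv = fgsm_attack(x, y)
print(model(x).argmax(dim=1).item(), model(x_adv).argmax(dim=1).item())
```

Comparing the activation pathways of `x` and `x_adv` across a network's layers is precisely the kind of benign-versus-attacked analysis Bluff supports interactively.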