Title

Robustness of Bayesian Neural Networks to Gradient-Based Attacks

Authors

Ginevra Carbone, Matthew Wicker, Luca Laurenti, Andrea Patane, Luca Bortolussi, Guido Sanguinetti

Abstract

Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, the problem remains open. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparametrized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in the limit BNN posteriors are robust to gradient-based adversarial attacks. Experimental results on the MNIST and Fashion MNIST datasets with BNNs trained with Hamiltonian Monte Carlo and Variational Inference support this line of argument, showing that BNNs can display both high accuracy and robustness to gradient-based adversarial attacks.
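
To make the setting concrete, below is a minimal sketch (not the authors' code) of what a gradient-based attack against a BNN posterior looks like in practice: an FGSM-style step taken against the expected loss, where the expectation over the posterior is approximated by a Monte Carlo average over sampled networks (e.g., HMC or VI samples). The function name `bayesian_fgsm` and the PyTorch-style setup are illustrative assumptions, not part of the paper.

```python
import torch
import torch.nn.functional as F

def bayesian_fgsm(posterior_samples, x, y, epsilon):
    """Illustrative FGSM-style attack on a BNN (hypothetical helper).

    posterior_samples: list of nn.Module networks, each with weights
        drawn from the (approximate) posterior (HMC or VI samples).
    x, y: input batch and labels.
    epsilon: L-infinity perturbation budget.
    """
    x_adv = x.clone().detach().requires_grad_(True)

    # Expected loss under the posterior, approximated by a Monte Carlo
    # average over the sampled networks.
    loss = torch.stack(
        [F.cross_entropy(net(x_adv), y) for net in posterior_samples]
    ).mean()
    loss.backward()

    # The paper's claim: in the large-data, overparametrized limit the
    # expected input gradient vanishes, so this sign step carries little
    # useful attack direction against the BNN posterior.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```

The posterior averaging in the loss is the key difference from attacking a single deterministic network: the attack direction is the gradient of the Bayesian model average, which is precisely the quantity the paper argues vanishes in the limit.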
