Paper Title

Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent

Paper Authors

Pu Zhao, Pin-Yu Chen, Siyue Wang, Xue Lin

Paper Abstract

Despite the great achievements of modern deep neural networks (DNNs), the vulnerability/robustness of state-of-the-art DNNs raises security concerns in many application domains requiring high reliability. Various adversarial attacks have been proposed to sabotage the learning performance of DNN models. Among those, black-box adversarial attack methods have received special attention owing to their practicality and simplicity. Black-box attacks usually prefer fewer queries in order to remain stealthy and keep costs low. However, most current black-box attack methods adopt the first-order gradient descent method, which may come with certain deficiencies such as relatively slow convergence and high sensitivity to hyper-parameter settings. In this paper, we propose a zeroth-order natural gradient descent (ZO-NGD) method to design adversarial attacks, which incorporates the zeroth-order gradient estimation technique catering to the black-box attack scenario and the second-order natural gradient descent to achieve higher query efficiency. Empirical evaluations on image classification datasets demonstrate that ZO-NGD can obtain significantly lower model query complexity compared with state-of-the-art attack methods.
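To make the two ingredients named in the abstract concrete, below is a minimal sketch, not the paper's actual ZO-NGD algorithm: a random-direction finite-difference estimator stands in for the zeroth-order gradient, and the natural-gradient step preconditions that estimate with a damped diagonal Fisher proxy. The `attack_loss` oracle is a toy quadratic (a real attack would query the target DNN), and the Fisher approximation and all hyper-parameter values (`mu`, `q`, `lr`, `damping`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the black-box attack loss. In a real attack this would
# query the target DNN and return a scalar loss for the perturbed input;
# a quadratic is used here so the sketch runs on its own.
target = rng.normal(size=32)

def attack_loss(x):
    return 0.5 * np.sum((x - target) ** 2)

def zo_gradient(x, mu=0.01, q=20):
    """Random-direction finite-difference (zeroth-order) gradient estimate.

    Uses 1 + q loss queries per call; mu is the smoothing radius and q the
    number of random directions (both values are illustrative).
    """
    d = x.size
    f0 = attack_loss(x)
    g = np.zeros(d)
    for _ in range(q):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        g += (attack_loss(x + mu * u) - f0) / mu * u
    return (d / q) * g

def zo_ngd_step(x, lr=0.05, damping=0.1):
    """One natural-gradient-flavoured update.

    The Fisher information is approximated by a damped diagonal outer
    product of the estimated gradient (g * g + damping); the paper derives
    its own Fisher estimate, so this is only a structural sketch.
    """
    g = zo_gradient(x)
    fisher_diag = g * g + damping
    return x - lr * g / fisher_diag

# Tiny demo: drive x towards the (unknown) target using only loss queries.
x = np.zeros(32)
for _ in range(200):
    x = zo_ngd_step(x)
print("final loss:", attack_loss(x))
```

The preconditioning step is where this differs from plain zeroth-order gradient descent: scaling each coordinate by an approximate inverse Fisher term adapts the step size to the local curvature, which is the mechanism the abstract credits for the improved query efficiency.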
