Paper Title

Improved Adversarial Training via Learned Optimizer

Authors

Yuanhao Xiong, Cho-Jui Hsieh

Abstract

Adversarial attacks have recently become a tremendous threat to deep learning models. To improve the robustness of machine learning models, adversarial training, formulated as a minimax optimization problem, has been recognized as one of the most effective defense mechanisms. However, the non-convex and non-concave properties of this problem pose a great challenge to minimax training. In this paper, we empirically demonstrate that the commonly used PGD attack may not be optimal for inner maximization, and that an improved inner optimizer can lead to a more robust model. We then leverage a learning-to-learn (L2L) framework to train an optimizer with recurrent neural networks, which adaptively provides update directions and step sizes for the inner problem. By co-training the optimizer's parameters and the model's weights, the proposed framework consistently improves model robustness over PGD-based adversarial training and TRADES.
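The PGD attack that the abstract refers to solves the inner maximization by iterated signed-gradient ascent followed by projection onto an L∞ ball. The following minimal sketch illustrates that scheme on a toy differentiable loss; the function name `pgd_attack`, the `grad_fn` callback, and the quadratic loss are illustrative assumptions, not code from the paper.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=0.3, alpha=0.05, steps=10):
    """PGD inner maximization: find a perturbation delta with
    ||delta||_inf <= eps that (approximately) maximizes the loss at
    x + delta. grad_fn(z) returns the gradient of the loss at z."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x + delta)
        delta = delta + alpha * np.sign(g)  # signed gradient ascent step
        delta = np.clip(delta, -eps, eps)   # project back into the eps-ball
    return x + delta

# Toy loss L(z) = ||z - t||^2 with gradient 2*(z - t); ascent pushes
# the input away from the target t while staying inside the ball.
t = np.array([1.0, -1.0])
grad = lambda z: 2.0 * (z - t)
x = np.array([0.0, 0.0])
x_adv = pgd_attack(x, grad)
```

The paper's proposal replaces the fixed `sign`-and-step update above with an update direction and step size produced by a learned recurrent network, trained jointly with the model weights.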
