Paper Title

Adversarial Distributional Training for Robust Deep Learning

Authors

Yinpeng Dong, Zhijie Deng, Tianyu Pang, Hang Su, Jun Zhu

Abstract

Adversarial training (AT) is among the most effective techniques for improving model robustness by augmenting training data with adversarial examples. However, most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other, unseen attacks. Moreover, a single attack algorithm can be insufficient to explore the space of perturbations. In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models. ADT is formulated as a minimax optimization problem, where the inner maximization aims to learn an adversarial distribution that characterizes the potential adversarial examples around a natural one under an entropic regularizer, and the outer minimization aims to train robust models by minimizing the expected loss over the worst-case adversarial distributions. Through a theoretical analysis, we develop a general algorithm for solving ADT, and present three approaches for parameterizing the adversarial distributions, ranging from typical Gaussian distributions to flexible implicit ones. Empirical results on several benchmarks validate the effectiveness of ADT compared with state-of-the-art AT methods.
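To make the minimax formulation concrete, here is a minimal NumPy sketch of the ADT idea on a toy logistic-regression model. It is not the paper's implementation: the per-example adversarial distribution is parameterized as a diagonal Gaussian (the simplest of the three parameterizations mentioned), the budget is enforced by clipping, gradients flow through sampling via the reparameterization trick, and the Gaussian entropy (up to a constant, the sum of log standard deviations) serves as the entropic regularizer. All hyperparameters (`eps`, `lam`, step sizes, sample counts) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data: two clusters, labels in {-1, +1}.
n = 200
X = np.vstack([rng.normal(+2.0, 0.5, size=(n // 2, 2)),
               rng.normal(-2.0, 0.5, size=(n // 2, 2))])
y = np.hstack([np.ones(n // 2), -np.ones(n // 2)])

w = np.zeros(2)      # linear model: predict sign(w @ x)
eps = 0.3            # perturbation budget, enforced by clipping
lam = 0.01           # weight of the entropic regularizer
k = 8                # Monte Carlo samples from the adversarial distribution

def grad_wrt_delta(w, x, yi, delta):
    """Gradient of the logistic loss -log sigmoid(y * w @ (x + delta)) w.r.t. delta."""
    s = 1.0 / (1.0 + np.exp(-yi * w @ (x + delta)))
    return -yi * (1.0 - s) * w

for step in range(300):
    i = rng.integers(n)
    x, yi = X[i], y[i]

    # Inner maximization: learn q(delta) = N(mu, diag(sigma^2)) by gradient
    # ascent on E_q[loss] + lam * entropy, using reparameterized samples
    # delta = mu + sigma * zeta with zeta ~ N(0, I).
    mu, log_sigma = np.zeros(2), np.full(2, -2.0)
    for _ in range(7):
        g_mu, g_ls = np.zeros(2), np.zeros(2)
        for z in rng.normal(size=(k, 2)):
            delta = np.clip(mu + np.exp(log_sigma) * z, -eps, eps)
            g = grad_wrt_delta(w, x, yi, delta)
            g_mu += g / k
            g_ls += g * z * np.exp(log_sigma) / k
        mu = np.clip(mu + 0.1 * g_mu, -eps, eps)
        log_sigma = log_sigma + 0.1 * (g_ls + lam)  # +lam from d(entropy)/d(log_sigma)

    # Outer minimization: SGD on the expected loss under the learned distribution.
    g_w = np.zeros(2)
    for z in rng.normal(size=(k, 2)):
        delta = np.clip(mu + np.exp(log_sigma) * z, -eps, eps)
        s = 1.0 / (1.0 + np.exp(-yi * w @ (x + delta)))
        g_w += -yi * (1.0 - s) * (x + delta) / k
    w -= 0.1 * g_w

acc = np.mean(np.sign(X @ w) == y)
```

In contrast to standard AT, which commits to the single worst-case point found by one attack, the outer step here averages the loss over samples from the learned distribution, and the entropy bonus discourages that distribution from collapsing to a single perturbation.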
