Paper Title

Inducing Data Amplification Using Auxiliary Datasets in Adversarial Training

Authors

Saehyung Lee, Hyungyu Lee

Abstract

Several recent studies have shown that the use of extra in-distribution data can lead to a high level of adversarial robustness. However, there is no guarantee that it will always be possible to obtain sufficient extra data for a selected dataset. In this paper, we propose a biased multi-domain adversarial training (BiaMAT) method that induces training data amplification on a primary dataset using publicly available auxiliary datasets, without requiring the class distribution match between the primary and auxiliary datasets. The proposed method can achieve increased adversarial robustness on a primary dataset by leveraging auxiliary datasets via multi-domain learning. Specifically, data amplification on both robust and non-robust features can be accomplished through the application of BiaMAT as demonstrated through a theoretical and empirical analysis. Moreover, we demonstrate that while existing methods are vulnerable to negative transfer due to the distributional discrepancy between auxiliary and primary data, the proposed method enables neural networks to flexibly leverage diverse image datasets for adversarial training by successfully handling the domain discrepancy through the application of a confidence-based selection strategy. The pre-trained models and code are available at: \url{https://github.com/Saehyung-Lee/BiaMAT}.
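
As a rough illustration of the idea described in the abstract, the following is a minimal PyTorch sketch of adversarial training on a primary batch combined with a confidence-gated auxiliary batch. It is not the authors' BiaMAT implementation (see the linked repository for that): the two-head network `TwoHeadNet`, the PGD settings, the confidence threshold `tau`, the loss weight `lam`, and the pseudo-labeling of confident auxiliary examples are all illustrative assumptions.

```python
# Hypothetical sketch only; not the official BiaMAT algorithm.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoHeadNet(nn.Module):
    """Shared feature extractor with separate primary/auxiliary heads (assumed architecture)."""

    def __init__(self, num_primary=10, num_auxiliary=100):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.primary_head = nn.Linear(32, num_primary)
        self.auxiliary_head = nn.Linear(32, num_auxiliary)

    def forward(self, x):
        z = self.features(x)
        return self.primary_head(z), self.auxiliary_head(z)


def pgd_attack(model, head, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD maximizing the cross-entropy loss of the given head."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(head(model.features(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv


def training_step(model, opt, x_pri, y_pri, x_aux, y_aux, tau=0.9, lam=1.0):
    """One step: primary adversarial loss plus a confidence-gated auxiliary loss."""
    model.eval()
    # Route each auxiliary example by the primary head's confidence on the clean input.
    with torch.no_grad():
        pri_logits_aux, _ = model(x_aux)
        conf = F.softmax(pri_logits_aux, dim=1).max(dim=1).values
        pseudo = pri_logits_aux.argmax(dim=1)  # pseudo primary labels (assumption)
        keep = conf >= tau                     # True -> treat as extra primary data

    x_pri_adv = pgd_attack(model, model.primary_head, x_pri, y_pri)
    x_keep_adv = (pgd_attack(model, model.primary_head, x_aux[keep], pseudo[keep])
                  if keep.any() else None)
    x_rest_adv = (pgd_attack(model, model.auxiliary_head, x_aux[~keep], y_aux[~keep])
                  if (~keep).any() else None)

    model.train()
    loss = F.cross_entropy(model(x_pri_adv)[0], y_pri)
    if x_keep_adv is not None:
        loss = loss + lam * F.cross_entropy(model(x_keep_adv)[0], pseudo[keep])
    if x_rest_adv is not None:
        loss = loss + lam * F.cross_entropy(model(x_rest_adv)[1], y_aux[~keep])

    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


# Example call with random data (shapes and labels are illustrative):
# model = TwoHeadNet(); opt = torch.optim.SGD(model.parameters(), lr=0.1)
# x_pri, y_pri = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
# x_aux, y_aux = torch.rand(8, 3, 32, 32), torch.randint(0, 100, (8,))
# print(training_step(model, opt, x_pri, y_pri, x_aux, y_aux))
```

The sketch routes an auxiliary example to the primary head only when the primary classifier is already confident on it, which is one simple way to limit negative transfer from out-of-distribution auxiliary images; the paper's actual confidence-based selection strategy should be taken from the official repository.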
