Paper Title

The Curious Case of Adversarially Robust Models: More Data Can Help, Double Descend, or Hurt Generalization

Authors

Yifei Min, Lin Chen, Amin Karbasi

Abstract

Adversarial training has shown its ability to produce models that are robust to perturbations of the input data, but usually at the expense of a decrease in standard accuracy. To mitigate this issue, it is commonly believed that more training data will eventually help such adversarially robust models generalize better on benign/unperturbed test data. In this paper, however, we challenge this conventional belief and show that more training data can hurt the generalization of adversarially robust models in classification problems. We first investigate Gaussian mixture classification with a linear loss and identify three regimes based on the strength of the adversary. In the weak adversary regime, more data improves the generalization of adversarially robust models. In the medium adversary regime, with more training data, the generalization loss exhibits a double descent curve, which implies the existence of an intermediate stage where more training data hurts generalization. In the strong adversary regime, more data almost immediately causes the generalization error to increase. We then move to the analysis of a two-dimensional classification problem with a 0-1 loss. We prove that more data always hurts the generalization performance of adversarially trained models with large perturbations. To complement our theoretical results, we conduct empirical studies on Gaussian mixture classification, support vector machines (SVMs), and linear regression.
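The Gaussian-mixture setting in the abstract can be sketched in a few lines. This is a hedged illustration, not the paper's exact construction: it uses a robust hinge loss rather than the paper's linear loss (for a linear score, a worst-case ℓ∞ perturbation of radius ε reduces the margin by ε·‖w‖₁), and all dimensions, scales, and step sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_gaussian_mixture(n, d=10, mu_scale=2.0):
    """Sample (x, y): y ~ uniform{-1,+1}, x = y*mu + standard Gaussian noise.
    mu_scale and d are illustrative choices, not values from the paper."""
    mu = np.full(d, mu_scale / np.sqrt(d))
    y = rng.choice([-1.0, 1.0], size=n)
    x = y[:, None] * mu + rng.standard_normal((n, d))
    return x, y

def adv_train_linear(x, y, eps, lr=0.1, steps=500):
    """Adversarially robust linear classifier under l_inf perturbations.

    For a linear score y * w.x, the worst-case perturbation with
    ||delta||_inf <= eps shrinks the margin by eps * ||w||_1, so the
    robust hinge loss is max(0, 1 - y*w.x + eps*||w||_1).  Trained by
    plain subgradient descent on the empirical average.
    """
    n, d = x.shape
    w = np.zeros(d)
    for _ in range(steps):
        margin = y * (x @ w) - eps * np.abs(w).sum()
        active = margin < 1.0  # examples with positive hinge loss
        grad = -(y[active, None] * x[active]).sum(0) / n \
               + eps * (active.sum() / n) * np.sign(w)
        w -= lr * grad
    return w

def standard_error(w, n_test=20000, d=10):
    """Classification error on clean (unperturbed) test data."""
    xt, yt = make_gaussian_mixture(n_test, d)
    return np.mean(np.sign(xt @ w) != yt)

# Train robust models at a fixed adversary strength eps for several
# training-set sizes and record the clean test error at each size.
for n in (20, 100, 500):
    x, y = make_gaussian_mixture(n)
    w = adv_train_linear(x, y, eps=0.3)
    print(n, standard_error(w))
```

Sweeping `eps` through small, moderate, and large values with this kind of setup is one way to probe the weak/medium/strong adversary regimes the abstract describes; the specific values at which the regimes change are derived in the paper, not here.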
