Title
Unique properties of adversarially trained linear classifiers on Gaussian data
Authors
Abstract
Machine learning models are vulnerable to adversarial perturbations, which, when added to an input, can cause high-confidence misclassifications. The adversarial learning research community has made remarkable progress in understanding the root causes of adversarial perturbations. However, most problems that one may consider important to solve for the deployment of machine learning in safety-critical tasks involve high-dimensional, complex manifolds that are difficult to characterize and study. It is common to develop adversarially robust learning theory on simple problems, in the hope that insights will transfer to `real world datasets'. In this work, we discuss a setting where this approach fails. In particular, we show that, with a linear classifier, it is always possible to solve a binary classification problem on Gaussian data under arbitrary levels of adversarial corruption during training, and that this property is not observed with non-linear classifiers on the CIFAR-10 dataset.
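The central claim concerns adversarially training a linear classifier on Gaussian class-conditional data. The following minimal sketch, not taken from the paper, illustrates that setting under assumed choices: an l2 threat model, logistic loss, and the illustrative parameters d, n, eps, and lr. It exploits the fact that, for a linear model, the inner maximization of adversarial training has a closed form, so no explicit attack loop is needed.

```python
# A minimal sketch of the setting in the abstract, not the paper's own code.
# Assumptions (illustrative, not taken from the paper): l2 threat model,
# logistic loss, and the parameters d, n, eps, lr below. For a linear
# classifier, the worst-case l2 perturbation of radius eps shrinks the
# margin y*(w.x + b) by exactly eps*||w||_2, giving a closed-form inner max.
import numpy as np

rng = np.random.default_rng(0)
d, n, eps, lr = 10, 2000, 2.0, 0.1   # dimension, samples, attack budget, step size

# Two Gaussian classes with means +mu and -mu, labels y in {-1, +1}.
mu = np.ones(d) / np.sqrt(d)
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + rng.normal(size=(n, d))

w, b = np.zeros(d), 0.0
for _ in range(500):
    # Robust margin under the worst-case l2 attack of radius eps.
    margin = y * (X @ w + b) - eps * np.linalg.norm(w)
    # s = sigmoid(-margin), written via tanh for numerical stability.
    s = 0.5 * (1.0 - np.tanh(margin / 2.0))
    # Gradient of the mean logistic loss log(1 + exp(-margin)).
    w_hat = w / max(np.linalg.norm(w), 1e-12)
    w -= lr * (-(s * y) @ X / n + eps * s.mean() * w_hat)
    b -= lr * (-(s * y).mean())

# Even with a training budget eps larger than the class separation, the
# learned weight direction recovers mu, so the clean binary problem is
# still solved (accuracy close to the Bayes optimum).
print(f"clean accuracy: {(y * (X @ w + b) > 0).mean():.3f}")
```

The closed-form robust margin is what makes the linear case tractable; with a non-linear classifier, such as those trained on CIFAR-10 in the abstract's comparison, the inner maximization must instead be approximated with iterative attacks, and no analogous guarantee is claimed.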