Paper Title
Improving Adversarial Robustness via Unlabeled Out-of-Domain Data
Paper Authors
Paper Abstract
Data augmentation by incorporating cheap unlabeled data from multiple domains is a powerful way to improve prediction, especially when labeled data is limited. In this work, we investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data. We demonstrate that for broad classes of distributions and classifiers, there exists a sample complexity gap between standard and robust classification. We quantify the degree to which this gap can be bridged by leveraging unlabeled samples from a shifted domain, providing both upper and lower bounds. Moreover, we show settings where we achieve better adversarial robustness when the unlabeled data come from a shifted domain rather than from the same domain as the labeled data. We also investigate how to leverage out-of-domain data when some structural information, such as sparsity, is shared between the labeled and unlabeled domains. Experimentally, we augment two object recognition datasets (CIFAR-10 and SVHN) with easy-to-obtain unlabeled out-of-domain data and demonstrate substantial improvement in the model's robustness against $\ell_\infty$ adversarial attacks on the original domain.