Paper Title

Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations

Paper Authors

Jinyuan Jia, Binghui Wang, Xiaoyu Cao, Hongbin Liu, Neil Zhenqiang Gong

Paper Abstract

Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches. $\ell_0$-norm adversarial perturbation characterizes an attack that arbitrarily modifies some features of an input such that a classifier makes an incorrect prediction for the perturbed input. $\ell_0$-norm adversarial perturbation is easy to interpret and can be implemented in the physical world. Therefore, certifying robustness of top-$k$ predictions against $\ell_0$-norm adversarial perturbation is important. However, existing studies either focused on certifying $\ell_0$-norm robustness of top-$1$ predictions or $\ell_2$-norm robustness of top-$k$ predictions. In this work, we aim to bridge the gap. Our approach is based on randomized smoothing, which builds a provably robust classifier from an arbitrary classifier via randomizing an input. Our major theoretical contribution is an almost tight $\ell_0$-norm certified robustness guarantee for top-$k$ predictions. We empirically evaluate our method on CIFAR10 and ImageNet. For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2\% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
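
The abstract describes randomized smoothing only at a high level. As a rough, hypothetical sketch, assuming a feature-ablation smoothing distribution (randomly keeping a few features of the input and zeroing the rest), which is a common choice for $\ell_0$-style certificates but is not spelled out in the abstract, a Monte Carlo estimate of a smoothed classifier's top-k prediction could look like the following. The names `smoothed_top_k`, `keep`, and `n_samples` are illustrative, not from the paper, and the snippet does not compute the certified radius itself.

```python
# Illustrative sketch (not the paper's exact algorithm): randomized smoothing
# by randomly keeping a subset of input features, zeroing the rest, and
# ranking labels by how often the base classifier predicts them.
import numpy as np


def smoothed_top_k(base_classifier, x, k=3, keep=50, n_samples=1000, rng=None):
    """Return the top-k labels of a smoothed classifier (hypothetical sketch).

    base_classifier: maps an input vector to a predicted label (int).
    x: 1-D feature vector (e.g., a flattened image).
    keep: number of features retained per noisy sample; the rest are zeroed.
    n_samples: Monte Carlo samples used to estimate label frequencies.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = {}
    for _ in range(n_samples):
        # Randomly ablate all but `keep` features of x.
        mask = np.zeros_like(x)
        kept_idx = rng.choice(x.size, size=keep, replace=False)
        mask[kept_idx] = 1.0
        label = base_classifier(x * mask)
        counts[label] = counts.get(label, 0) + 1
    # Labels sorted by how often the base classifier predicted them.
    ranked = sorted(counts, key=counts.get, reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy stand-in for a real model, just to show the call pattern.
    toy = lambda v: int(v.sum() > 0)
    x = np.random.default_rng(0).normal(size=3072)  # e.g., a flattened CIFAR10 image
    print(smoothed_top_k(toy, x, k=2, keep=100, n_samples=200))
```

In this style of analysis, the certified $\ell_0$ radius for a top-k prediction would then be derived from the estimated label frequencies; the paper's theoretical contribution is making that guarantee almost tight, which the sketch above does not attempt to reproduce.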
