Paper Title

ARIA: Adversarially Robust Image Attribution for Content Provenance

Paper Authors

Maksym Andriushchenko, Xiaoyang Rebecca Li, Geoffrey Oxholm, Thomas Gittings, Tu Bui, Nicolas Flammarion, John Collomosse

Paper Abstract

Image attribution -- matching an image back to a trusted source -- is an emerging tool in the fight against online misinformation. Deep visual fingerprinting models have recently been explored for this purpose. However, they are not robust to tiny input perturbations known as adversarial examples. First, we illustrate how to generate valid adversarial images that can easily cause incorrect image attribution. Then we describe an approach to prevent imperceptible adversarial attacks on deep visual fingerprinting models via robust contrastive learning. The proposed training procedure leverages $\ell_\infty$-bounded adversarial examples; it is conceptually simple and incurs only a small computational overhead. The resulting models are substantially more robust, are accurate even on unperturbed images, and perform well even over a database with millions of images. In particular, we achieve 91.6% standard and 85.1% adversarial recall under $\ell_\infty$-bounded perturbations on manipulated images, compared to 80.1% and 0.0% from prior work. We also show that robustness generalizes to other types of imperceptible perturbations unseen during training. Finally, we show how to train an adversarially robust image comparator model for detecting editorial changes in matched images.
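The abstract's core training idea is adversarial contrastive learning: at each step, craft an $\ell_\infty$-bounded perturbation that maximizes a contrastive loss between an image and its augmented view, then update the fingerprinting encoder on the perturbed batch. Below is a minimal PyTorch sketch of that pattern, not the paper's implementation; the encoder architecture, the NT-Xent loss choice, and the PGD hyperparameters (an 8/255 budget, 5 steps) are all illustrative assumptions.

```python
# Minimal sketch of adversarial contrastive training for a fingerprinting
# encoder. All names (SimpleEncoder, nt_xent_loss, pgd_attack) and the
# hyperparameters are assumptions for illustration, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleEncoder(nn.Module):
    """Toy CNN mapping images to unit-norm fingerprint embeddings."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent contrastive loss over two aligned batches of embeddings."""
    z = torch.cat([z1, z2], dim=0)                 # (2B, d)
    sim = z @ z.t() / temperature                  # cosine sims (unit norm)
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))          # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n),   # positive of i is i+n ...
                         torch.arange(0, n)]).to(z.device)  # ... and vice versa
    return F.cross_entropy(sim, targets)

def pgd_attack(model, x, x_view, eps=8/255, alpha=2/255, steps=5):
    """l_inf-bounded PGD perturbing x to maximize the contrastive loss."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    with torch.no_grad():
        z_view = model(x_view)                     # fixed anchor embeddings
    for _ in range(steps):
        loss = nt_xent_loss(model((x + delta).clamp(0, 1)), z_view)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()           # gradient ascent step
            delta.clamp_(-eps, eps)                # project back to l_inf ball
    return (x + delta.detach()).clamp(0, 1)

# One illustrative training step; random tensors stand in for image pairs.
model = SimpleEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(16, 3, 64, 64)                              # original images
x_view = (x + 0.05 * torch.randn_like(x)).clamp(0, 1)      # stand-in augmentation
x_adv = pgd_attack(model, x, x_view)                       # worst-case inputs
loss = nt_xent_loss(model(x_adv), model(x_view))
opt.zero_grad(); loss.backward(); opt.step()
print(f"adversarial contrastive loss: {loss.item():.4f}")
```

The key design point this sketch mirrors is that the inner PGD loop attacks the same contrastive objective the outer loop minimizes, so the encoder is trained to keep adversarially perturbed images close to their clean counterparts in embedding space.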
