Paper Title
SLAP: Improving Physical Adversarial Examples with Short-Lived Adversarial Perturbations
Paper Authors
Paper Abstract
Research into adversarial examples (AE) has developed rapidly, yet static adversarial patches are still the main technique for conducting attacks in the real world, despite being obvious, semi-permanent and unmodifiable once deployed. In this paper, we propose Short-Lived Adversarial Perturbations (SLAP), a novel technique that allows adversaries to realize physically robust real-world AE by using a light projector. Attackers can project a specifically crafted adversarial perturbation onto a real-world object, transforming it into an AE. This allows the adversary greater control over the attack compared to adversarial patches: (i) projections can be dynamically turned on and off or modified at will, (ii) projections do not suffer from the locality constraint imposed by patches, making them harder to detect. We study the feasibility of SLAP in the self-driving scenario, targeting both object detector and traffic sign recognition tasks, focusing on the detection of stop signs. We conduct experiments in a variety of ambient light conditions, including outdoors, showing how in non-bright settings the proposed method generates AE that are extremely robust, causing misclassifications on state-of-the-art networks with up to 99% success rate for a variety of angles and distances. We also demonstrate that SLAP-generated AE do not present the detectable behaviours seen in adversarial patches and therefore bypass SentiNet, a physical AE detection method. We evaluate other defences, including an adaptive defender using adversarial learning, which is able to reduce the attack's effectiveness by up to 80% even in favourable attacker conditions.
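To make the underlying adversarial-example mechanism concrete, the following is a minimal NumPy sketch of a gradient-sign perturbation against a toy linear classifier. This is only an illustration of the general AE principle the abstract builds on; it is not the authors' projector-based SLAP optimisation, and the toy model, seed, and budget computation are all assumptions for the sketch.

```python
import numpy as np

# Toy linear classifier: logits = W @ x + b (stand-in for a real network).
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 16))
b = np.zeros(2)

x = rng.normal(size=16)            # clean input ("the object as seen by the camera")
y = int(np.argmax(W @ x + b))      # original prediction
other = 1 - y                      # target class for the misclassification

# For a linear model, the gradient of (logit_other - logit_y) w.r.t. x
# is exactly (W[other] - W[y]); an FGSM-style step moves x along its sign.
g = W[other] - W[y]
margin = (W @ x + b)[y] - (W @ x + b)[other]

# Smallest L_inf budget that provably flips the prediction:
# the sign step raises (logit_other - logit_y) by eps * ||g||_1.
eps = (margin + 1e-3) / np.abs(g).sum()
delta = eps * np.sign(g)           # the adversarial perturbation
x_adv = x + delta

y_adv = int(np.argmax(W @ x_adv + b))
assert y_adv == other              # the perturbed input is now misclassified
```

In SLAP the perturbation is additionally constrained to be realisable as projected light on a physical surface under varying ambient conditions, which is what the paper's optimisation and robustness experiments address.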