PoisHygiene: Detecting and Mitigating Poisoning Attacks in Neural Networks

Paper Title

PoisHygiene: Detecting and Mitigating Poisoning Attacks in Neural Networks

Authors

Junfeng Guo, Ting Wang, Cong Liu

Abstract

The black-box nature of deep neural networks (DNNs) makes it easier for attackers to manipulate a DNN's behavior through data poisoning. Being able to detect and mitigate poisoning attacks, typically categorized into backdoor and adversarial poisoning (AP) attacks, is critical to enabling the safe adoption of DNNs in many application domains. Although recent works demonstrate encouraging results on the detection of certain backdoor attacks, they exhibit inherent limitations that may significantly constrain their applicability. Indeed, no technique exists for detecting AP attacks, which represent a harder challenge because, unlike backdoor attacks (which embed backdoor triggers into poisoned data), they follow no common and explicit rules. We believe the key to detecting and mitigating AP attacks is the capability to observe and leverage essential poisoning-induced properties within an infected DNN model. In this paper, we present PoisHygiene, the first effective and robust detection and mitigation framework against AP attacks. PoisHygiene is fundamentally motivated by the story of Dr. Ernest Rutherford (winner of the 1908 Nobel Prize) observing the structure of the atom through random electron sampling.
