Paper Title

Under-confidence Backdoors Are Resilient and Stealthy Backdoors

Paper Authors

Minlong Peng, Zidi Xiong, Quang H. Nguyen, Mingming Sun, Khoa D. Doan, Ping Li

Paper Abstract

By injecting a small number of poisoned samples into the training set, backdoor attacks aim to make the victim model produce designed outputs on any input injected with the pre-designed backdoor. To achieve a high attack success rate with as few poisoned training samples as possible, most existing attack methods change the labels of the poisoned samples to the target class. This practice often causes the victim model to severely over-fit the backdoors, making the attack quite effective for output control but easy to identify by human inspection or automatic defense algorithms. In this work, we propose a label-smoothing strategy to overcome the over-fitting problem of these attack methods, obtaining a \textit{Label-Smoothed Backdoor Attack} (LSBA). In an LSBA, the label of a poisoned sample $\bm{x}$ is changed to the target class with probability $p_n(\bm{x})$ instead of 100\%, and the value of $p_n(\bm{x})$ is specifically designed to make the predicted probability of the target class only slightly greater than those of the other classes. Empirical studies on several existing backdoor attacks show that our strategy considerably improves the stealthiness of these attacks while still achieving a high attack success rate. In addition, our strategy makes it possible to manually control the prediction probability of the designed output by manipulating the number of applied and activated LSBAs\footnote{Source code will be published at \url{https://github.com/v-mipeng/LabelSmoothedAttack.git}}.
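The probabilistic relabeling step described in the abstract is easy to sketch. Below is a minimal Python illustration, not the authors' released code: the helpers `trigger_fn` (injects the backdoor trigger into an input) and `p_n` (the sample-specific relabeling probability $p_n(\bm{x})$), as well as the `poison_rate` parameter, are hypothetical names introduced here for clarity.

```python
import random

def poison_dataset(dataset, trigger_fn, p_n, target_class, poison_rate=0.01):
    """Label-smoothed poisoning sketch (hypothetical helper, not the paper's code).

    dataset:      list of (x, y) pairs
    trigger_fn:   injects the backdoor trigger into an input x
    p_n:          function x -> probability of relabeling x to the target class
    target_class: the attacker's designed output class
    poison_rate:  fraction of training samples to poison
    """
    poisoned = []
    for x, y in dataset:
        if random.random() < poison_rate:
            x = trigger_fn(x)
            # Unlike standard attacks, flip the label only with probability
            # p_n(x) instead of always, so the victim model's confidence in
            # the target class on triggered inputs stays only slightly above
            # that of the other classes.
            if random.random() < p_n(x):
                y = target_class
        poisoned.append((x, y))
    return poisoned
```

A standard label-flipping attack corresponds to $p_n(\bm{x}) = 1$ for every sample; the LSBA instead chooses $p_n(\bm{x})$ so that the trained model's confidence in the target class barely exceeds the runner-up class, which is what makes the backdoor under-confident and stealthy.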
