Paper Title
Solving the Capsulation Attack against Backdoor-based Deep Neural Network Watermarks by Reversing Triggers
Authors
Abstract
Backdoor-based watermarking schemes were proposed to protect the intellectual property of artificial intelligence models, especially deep neural networks, under the black-box setting. Compared with ordinary backdoors, backdoor-based watermarks must digitally incorporate the owner's identity, which adds extra requirements to the trigger generation and verification programs. Moreover, these requirements create additional security risks once the watermarking scheme has been published as a forensic tool or the owner's evidence has been eavesdropped on. This paper proposes the capsulation attack, an efficient method that can invalidate most established backdoor-based watermarking schemes without sacrificing the pirated model's functionality. By encapsulating the deep neural network with a rule-based or Bayes filter, an adversary can block ownership probing and reject ownership verification. We propose a metric, CAScore, to measure a backdoor-based watermarking scheme's security against the capsulation attack. This paper also proposes a new backdoor-based deep neural network watermarking scheme that is secure against the capsulation attack by reversing the encoding process and randomizing the exposure of triggers.
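The encapsulation idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `CapsulatedModel`, `toy_model`, and `patch_rule` are hypothetical, the classifier is a stand-in, and the rule assumes, purely for illustration, that triggers carry a fixed patch at the start of the input. A Bayes filter would replace `patch_rule` with a learned classifier over queries.

```python
from typing import Callable, Optional, Sequence

Query = Sequence[float]


class CapsulatedModel:
    """Hypothetical wrapper: queries flagged by a filter never reach the model."""

    def __init__(self,
                 model: Callable[[Query], int],
                 is_suspicious: Callable[[Query], bool]) -> None:
        self.model = model
        self.is_suspicious = is_suspicious

    def predict(self, query: Query) -> Optional[int]:
        # Reject suspected trigger queries, so a verifier probing for the
        # watermark observes no backdoor response.
        if self.is_suspicious(query):
            return None
        return self.model(query)


def toy_model(query: Query) -> int:
    # Stand-in for the pirated network: label 1 if the mean exceeds 0.5.
    return int(sum(query) / len(query) > 0.5)


def patch_rule(query: Query, patch_len: int = 4, patch_value: float = 1.0) -> bool:
    # Assumed rule-based filter: flag inputs whose first `patch_len`
    # entries all equal `patch_value` (an illustrative trigger pattern).
    return len(query) >= patch_len and all(v == patch_value for v in query[:patch_len])


capsule = CapsulatedModel(toy_model, patch_rule)
normal_query = [0.2, 0.9, 0.4, 0.7, 0.1, 0.3]
trigger_query = [1.0, 1.0, 1.0, 1.0, 0.1, 0.3]

print(capsule.predict(normal_query))   # answered by the wrapped model
print(capsule.predict(trigger_query))  # rejected by the filter
```

Ordinary queries pass through unchanged, which is why such a wrapper need not degrade the pirated model's functionality; only inputs matching the filter are refused.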