Paper Title
BaFFLe: Backdoor detection via Feedback-based Federated Learning
Paper Authors
Paper Abstract
Recent studies have shown that federated learning (FL) is vulnerable to poisoning attacks that inject a backdoor into the global model. These attacks are effective even when performed by a single client, and undetectable by most existing defensive techniques. In this paper, we propose Backdoor detection via Feedback-based Federated Learning (BAFFLE), a novel defense to secure FL against backdoor attacks. The core idea behind BAFFLE is to leverage data of multiple clients not only for training but also for uncovering model poisoning. We exploit the availability of diverse datasets at the various clients by incorporating a feedback loop into the FL process, to integrate the views of those clients when deciding whether a given model update is genuine or not. We show that this powerful construct can achieve very high detection rates against state-of-the-art backdoor attacks, even when relying on straightforward methods to validate the model. Through empirical evaluation using the CIFAR-10 and FEMNIST datasets, we show that by combining the feedback loop with a method that suspects poisoning attempts by assessing the per-class classification performance of the updated model, BAFFLE reliably detects state-of-the-art backdoor attacks with a detection accuracy of 100% and a false-positive rate below 5%. Moreover, we show that our solution can detect adaptive attacks aimed at bypassing the defense.
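To make the feedback-loop idea in the abstract concrete, the following Python sketch illustrates one way a validation round could work: each validating client compares the per-class error rates of the proposed global model against the last accepted model on its own local data and votes to reject if any class shifts suspiciously, and the server aggregates the votes. This is a minimal sketch under stated assumptions, not the authors' implementation; the names (per_class_error, client_vote, server_decision), the thresholds THRESHOLD and QUORUM, and the assumption that models expose a scikit-learn-style predict method are all hypothetical.

```python
# Illustrative sketch of per-class validation in a feedback round.
# Not the BAFFLE reference implementation; names and thresholds are
# hypothetical choices for the sake of the example.

import numpy as np

THRESHOLD = 0.10   # hypothetical tolerated change in per-class error rate
QUORUM = 0.5       # hypothetical fraction of rejecting clients needed

def per_class_error(model, inputs, labels, num_classes):
    """Per-class error rate of `model` on a client's local validation data.
    Assumes `model.predict` returns an array of predicted class labels."""
    preds = model.predict(inputs)
    errors = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            errors[c] = np.mean(preds[mask] != labels[mask])
    return errors

def client_vote(proposed, accepted, inputs, labels, num_classes):
    """A validating client suspects poisoning if any class's error rate
    shifts by more than THRESHOLD between the accepted and proposed models.
    Returns True for a 'reject' vote."""
    delta = np.abs(
        per_class_error(proposed, inputs, labels, num_classes)
        - per_class_error(accepted, inputs, labels, num_classes)
    )
    return bool(np.any(delta > THRESHOLD))

def server_decision(votes):
    """Feedback loop: the server discards the model update if at least
    a QUORUM fraction of validating clients voted to reject it."""
    return "reject" if np.mean(votes) >= QUORUM else "accept"
```

The design choice here mirrors the abstract's intuition: a backdoor typically perturbs the model's behavior on some classes more than others, so a sudden per-class error shift observed independently by many clients, each holding different data, is a stronger poisoning signal than any single client's view.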