Paper Title
Characterizing Internal Evasion Attacks in Federated Learning
Paper Authors
Paper Abstract
Federated learning allows clients in a distributed system to jointly train a machine learning model. However, clients' models are vulnerable to attacks during both the training and testing phases. In this paper, we address the issue of adversarial clients performing "internal evasion attacks": crafting evasion attacks at test time to deceive other clients. For example, adversaries may aim to deceive spam filters and recommendation systems trained with federated learning for monetary gain. The adversarial clients have extensive information about the victim model in a federated learning setting, as weight information is shared amongst clients. We are the first to characterize the transferability of such internal evasion attacks for different learning methods and to analyze the trade-off between model accuracy and robustness depending on the degree of similarity in client data. We show that adversarial training defenses in the federated learning setting only display limited improvements against internal attacks. However, combining adversarial training with personalized federated learning frameworks increases relative internal attack robustness by 60% compared to federated adversarial training and performs well under limited system resources.