Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems

Paper title

Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems

Authors

Yanchao Sun, Ruijie Zheng, Parisa Hassanzadeh, Yongyuan Liang, Soheil Feizi, Sumitra Ganesh, Furong Huang

Abstract

Communication is important in many multi-agent reinforcement learning (MARL) problems for agents to share information and make good decisions. However, when deploying trained communicative agents in a real-world application where noise and potential attackers exist, the safety of communication-based policies becomes a severe issue that is underexplored. Specifically, if communication messages are manipulated by malicious attackers, agents relying on untrustworthy communication may take unsafe actions that lead to catastrophic consequences. Therefore, it is crucial to ensure that agents will not be misled by corrupted communication, while still benefiting from benign communication. In this work, we consider an environment with $N$ agents, where the attacker may arbitrarily change the communication from any $C<\frac{N-1}{2}$ agents to a victim agent. For this strong threat model, we propose a certifiable defense by constructing a message-ensemble policy that aggregates multiple randomly ablated message sets. Theoretical analysis shows that this message-ensemble policy can utilize benign communication while being certifiably robust to adversarial communication, regardless of the attacking algorithm. Experiments in multiple environments verify that our defense significantly improves the robustness of trained policies against various types of attacks.
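The message-ensemble idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the aggregation rule, the ablation size, or how the ablated sets are drawn, so everything concrete here is an assumption made for illustration: a majority vote over discrete actions, a hypothetical ablation size `k`, exhaustive enumeration of all size-`k` subsets in place of random sampling, and a toy base policy that is deliberately fragile (it follows the largest message it sees). The names `message_ensemble_action`, `clean_subset_fraction`, and `toy_base_policy` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import comb


def message_ensemble_action(base_policy, observation, messages, k):
    """Majority vote over base-policy actions, one vote per size-k message subset.

    Assumption: discrete actions and exhaustive enumeration of subsets;
    the abstract only says that randomly ablated message sets are aggregated.
    """
    votes = Counter(
        base_policy(observation, subset)
        for subset in combinations(messages, k)
    )
    # most_common(1) returns [(action, count)] for the top-voted action.
    return votes.most_common(1)[0][0]


def clean_subset_fraction(num_messages, num_attacked, k):
    """Fraction of size-k subsets that contain no attacked message."""
    return comb(num_messages - num_attacked, k) / comb(num_messages, k)


def toy_base_policy(observation, subset):
    """Hypothetical fragile policy: act on the largest message received."""
    return max(subset)


# N = 6 agents, so the victim receives N - 1 = 5 messages.
# C = 1 sender is attacked, which satisfies C < (N - 1) / 2 = 2.5.
messages = [1, 1, 1, 1, 9]  # the last message is the corrupted one

undefended = toy_base_policy(None, messages)
defended = message_ensemble_action(toy_base_policy, None, messages, k=2)
fraction = clean_subset_fraction(num_messages=5, num_attacked=1, k=2)
```

In this toy setting the single corrupted message decides the action when the base policy sees all five messages (`undefended` is 9), while the ensemble returns 1 because 6 of the 10 two-message subsets exclude the corrupted message (`fraction` is 0.6). This only illustrates the intuition; it is not a statement of the paper's certificate or its conditions.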
