Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning

Paper Title

Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning

Authors

Arrasy Rahman, Niklas Höpner, Filippos Christianos, Stefano V. Albrecht

Abstract

Ad hoc teamwork is the challenging problem of designing an autonomous agent which can adapt quickly to collaborate with teammates without prior coordination mechanisms, including joint training. Prior work in this area has focused on closed teams in which the number of agents is fixed. In this work, we consider open teams by allowing agents with different fixed policies to enter and leave the environment without prior notification. Our solution builds on graph neural networks to learn agent models and joint-action value models under varying team compositions. We contribute a novel action-value computation that integrates the agent model and joint-action value model to produce action-value estimates. We empirically demonstrate that our approach successfully models the effects other agents have on the learner, leading to policies that robustly adapt to dynamic team compositions and significantly outperform several alternative methods.
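
The abstract does not spell out how the agent model and the joint-action value model are combined. One plausible reading is that the learner's action value is the expectation of the joint-action value under the teammates' predicted actions. The sketch below illustrates that idea by brute-force enumeration in plain Python. It is a toy under stated assumptions, not the authors' implementation: the function names, the independent per-teammate action distributions, and the toy value function are all hypothetical, and the graph neural networks are replaced by hand-written stand-ins.

```python
from itertools import product


def marginal_action_values(joint_q, teammate_probs, n_actions):
    """Return Q(a) = sum over teammate actions of P(a_others) * Q(a, a_others).

    joint_q:        hypothetical stand-in for a joint-action value model,
                    called as joint_q(learner_action, teammate_actions).
    teammate_probs: hypothetical stand-in for an agent model's output, one
                    action distribution per teammate currently present.
                    Its length may change between steps (open team).
    """
    values = []
    for a in range(n_actions):
        total = 0.0
        # Enumerate every joint action of the teammates present right now.
        for others in product(range(n_actions), repeat=len(teammate_probs)):
            p = 1.0
            for probs, b in zip(teammate_probs, others):
                p *= probs[b]  # assumes teammates act independently
            total += p * joint_q(a, others)
        values.append(total)
    return values


def toy_joint_q(a, others):
    """Toy coordination value: +1 for each teammate choosing the same action."""
    return float(sum(1 for b in others if b == a))


# Two teammates present, both likely to pick action 0.
two_teammates = marginal_action_values(toy_joint_q, [[0.9, 0.1], [0.8, 0.2]], 2)

# One teammate leaves: the same function handles the smaller team.
one_teammate = marginal_action_values(toy_joint_q, [[0.9, 0.1]], 2)
```

The enumeration grows exponentially with team size, so it is only meant to make the expectation concrete; any practical method would need a more efficient computation.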
