Paper title
Decentralized Multi-Agent Pursuit using Deep Reinforcement Learning
Authors
Abstract
Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning to pursue an omni-directional target with multiple homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers; at run-time, each agent executes this policy independently. The training benefits from curriculum learning, a sweeping-angle ordering to locally represent neighboring agents, and a reward structure that combines individual and group rewards to encourage good formations. Simulated experiments with a reactive evader and up to eight pursuers show that our learning-based approach, with non-holonomic agents, performs on par with classical algorithms with omni-directional agents, and outperforms their non-holonomic adaptations. The learned policy is successfully transferred to the real world in a proof-of-concept demonstration with three motion-constrained pursuer drones.
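
To make two terms from the abstract concrete, the following is a minimal sketch and not the authors' implementation. `unicycle_step` applies the standard unicycle model (forward speed `v`, turn rate `omega`) with a simple Euler update. `order_neighbors_by_angle` is one plausible reading of a "sweeping-angle ordering": neighbors are sorted by their bearing relative to the agent's heading, swept counter-clockwise. Both function names, the Euler integration, and the sweep direction are assumptions made for illustration; the abstract does not specify them.

```python
import math


def unicycle_step(x, y, theta, v, omega, dt):
    """Advance a unicycle-model agent by one Euler step.

    The agent can only move along its heading `theta` (no sideways motion),
    which is the non-holonomic constraint mentioned in the abstract.
    """
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    # Wrap the heading into [-pi, pi).
    theta_new = (theta + omega * dt + math.pi) % (2 * math.pi) - math.pi
    return x_new, y_new, theta_new


def order_neighbors_by_angle(pose, neighbors):
    """Sort neighbor positions by bearing relative to the agent's heading.

    Assumption: the sweep starts at the agent's heading and proceeds
    counter-clockwise over [0, 2*pi). The paper may define it differently.
    """
    x, y, theta = pose

    def bearing(p):
        return (math.atan2(p[1] - y, p[0] - x) - theta) % (2 * math.pi)

    return sorted(neighbors, key=bearing)
```

An ordering of this kind gives each agent a consistent, identity-free way to lay out its neighbors in a fixed-size observation vector, which is useful when a single shared policy is executed independently by every pursuer.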