Paper Title
Learning to Herd Agents Amongst Obstacles: Training Robust Shepherding Behaviors using Deep Reinforcement Learning
Paper Authors
Paper Abstract
The robotic shepherding problem considers the control and navigation of a group of coherent agents (e.g., a flock of birds or a fleet of drones) through the motion of an external robot, called the shepherd. Machine-learning-based methods have successfully solved this problem in empty environments with no obstacles. Rule-based methods, on the other hand, can handle more complex scenarios in which the environment is cluttered with obstacles, and they allow multiple shepherds to work collaboratively. However, these rule-based methods are fragile because it is difficult to define a comprehensive set of rules that covers all possible cases. To overcome these limitations, we propose the first known learning-based method that can herd agents amongst obstacles. By combining deep reinforcement learning techniques with probabilistic roadmaps, we train a shepherding model using noisy but controlled environmental and behavioral parameters. Our experimental results show that the proposed method is robust, namely, it is insensitive to the uncertainties originating from both the environmental and behavioral models. Consequently, the proposed method achieves a higher success rate and shorter completion times and path lengths than rule-based behavioral methods. These advantages are particularly prominent in more challenging scenarios involving more difficult groups and strenuous passages.
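The abstract does not spell out how the "noisy but controlled" training parameters and the probabilistic-roadmap guidance fit together, so the outline below is only an illustrative Python sketch of that idea, not the authors' implementation. All names in it (sample_episode_params, make_env, plan_waypoints, agent) are hypothetical placeholders: each training episode randomizes behavioral and environmental parameters within controlled ranges, and a roadmap query over the obstacle map supplies intermediate goals for the learned shepherding policy.

import random

def sample_episode_params():
    """Draw noisy but controlled environmental and behavioral parameters
    for one training episode (domain randomization)."""
    return {
        "num_agents": random.randint(5, 20),             # size of the herded group
        "cohesion_gain": random.uniform(0.5, 1.5),       # flocking-behavior noise
        "shepherd_repulsion": random.uniform(0.8, 1.2),  # how strongly agents flee the shepherd
        "obstacle_density": random.uniform(0.05, 0.25),  # environmental noise
    }

def train(agent, make_env, plan_waypoints, num_episodes=10_000):
    """Generic training loop. `agent`, `make_env`, and `plan_waypoints` are
    hypothetical callables (a DRL agent, an environment factory, and a PRM
    query); they are passed in because no such interface is published."""
    for _ in range(num_episodes):
        env = make_env(**sample_episode_params())
        # A probabilistic roadmap over the obstacle map yields a sequence of
        # intermediate goals guiding the flock through cluttered space.
        waypoints = plan_waypoints(env)
        obs, done = env.reset(), False
        while not done:
            subgoal = waypoints[0] if waypoints else env.goal
            action = agent.act(obs, subgoal)          # policy conditioned on the next subgoal
            obs, reward, done, _ = env.step(action)
            agent.learn(obs, reward, done)
            if waypoints and env.flock_reached(subgoal):
                waypoints.pop(0)                      # advance to the next waypoint

if __name__ == "__main__":
    print(sample_episode_params())                    # the randomization step runs standalone

Under this reading, the per-episode randomization is what makes the learned policy insensitive to modeling uncertainty, while the roadmap decomposes a cluttered environment into short, learnable herding segments.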