Paper title
Reinforcement Learning with Iterative Reasoning for Merging in Dense Traffic
Paper authors
Paper abstract
Maneuvering in dense traffic is a challenging task for autonomous vehicles because it requires reasoning about the stochastic behaviors of many other participants. In addition, the agent must achieve the maneuver within a limited time and distance. In this work, we propose a combination of reinforcement learning and game theory to learn merging behaviors. We design a training curriculum for a reinforcement learning agent using the concept of level-$k$ behavior. This approach exposes the agent to a broad variety of behaviors during training, which promotes learning policies that are robust to model discrepancies. We show that our approach learns more efficient policies than traditional training methods.