Accelerating Reinforcement Learning for Autonomous Driving using Task-Agnostic and Ego-Centric Motion Skills

Paper Title

Accelerating Reinforcement Learning for Autonomous Driving using Task-Agnostic and Ego-Centric Motion Skills

Authors

Tong Zhou, Letian Wang, Ruobing Chen, Wenshuo Wang, Yu Liu

Abstract

Efficient and effective exploration in continuous space is a central problem in applying reinforcement learning (RL) to autonomous driving. Skills learned from expert demonstrations or designed for specific tasks can aid exploration, but they are usually costly to collect, unbalanced or sub-optimal, or fail to transfer to diverse tasks. Human drivers, however, can adapt to varied driving tasks without demonstrations by exploring the entire skill space efficiently and in a structured way, rather than a limited space of task-specific skills. Inspired by this observation, we propose an RL algorithm that explores all feasible motion skills instead of a limited set of task-specific and object-centric skills. Without demonstrations, our method can still perform well in diverse tasks. First, we build a task-agnostic and ego-centric (TaEc) motion skill library from a pure motion perspective, which is diverse enough to be reused across different complex tasks. The motion skills are then encoded into a low-dimensional latent skill space, in which RL can explore efficiently. Validation in various challenging driving scenarios demonstrates that our proposed method, TaEc-RL, significantly outperforms its counterparts in learning efficiency and task performance.
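
As a rough illustration of the idea in the abstract (a library of ego-centric motion skills indexed by a low-dimensional latent in which an RL policy acts), the following Python sketch may help. It is not the authors' implementation: the abstract does not say how skills are generated or encoded, so the constant-acceleration, constant-yaw-rate decoder, the names `decode_skill`, `build_skill_library`, and `random_skill_policy`, and all numeric constants are assumptions made for illustration. The uniform sampler merely stands in for a trained policy.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
HORIZON = 10         # number of steps covered by one skill
DT = 0.1             # seconds per step
MAX_ACCEL = 3.0      # m/s^2
MAX_YAW_RATE = 0.5   # rad/s


def decode_skill(z, v0):
    """Map a 2-D latent z in [-1, 1]^2 to an ego-centric trajectory.

    Hand-written stand-in for a learned decoder: z[0] scales a constant
    acceleration and z[1] a constant yaw rate. Returns (x, y, heading)
    waypoints in the ego frame, which starts at the origin facing +x.
    """
    accel = max(-1.0, min(1.0, z[0])) * MAX_ACCEL
    yaw_rate = max(-1.0, min(1.0, z[1])) * MAX_YAW_RATE
    x, y, heading, v = 0.0, 0.0, 0.0, v0
    waypoints = []
    for _ in range(HORIZON):
        v = max(0.0, v + accel * DT)   # no reversing in this sketch
        heading += yaw_rate * DT
        x += v * math.cos(heading) * DT
        y += v * math.sin(heading) * DT
        waypoints.append((x, y, heading))
    return waypoints


def build_skill_library(n_per_axis=5, v0=5.0):
    """Enumerate a latent grid to obtain a task-agnostic skill library."""
    grid = [-1.0 + 2.0 * i / (n_per_axis - 1) for i in range(n_per_axis)]
    return {(a, b): decode_skill((a, b), v0) for a in grid for b in grid}


def random_skill_policy(rng):
    """Placeholder for an RL policy: samples a latent skill uniformly."""
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))


library = build_skill_library()
rng = random.Random(0)
z = random_skill_policy(rng)
trajectory = decode_skill(z, v0=5.0)
```

The point of the design is that the policy outputs one latent vector per decision and the decoder expands it into a multi-step motion, so exploration happens over whole maneuvers rather than over raw per-step controls. Nothing in the library refers to a task or to other objects, which is what makes it reusable across scenarios in this sketch.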
