Paper Title


Accelerating Reinforcement Learning with Learned Skill Priors

Authors

Karl Pertsch, Youngwoon Lee, Joseph J. Lim

Abstract


Intelligent agents rely heavily on prior experience when learning a new task, yet most modern reinforcement learning (RL) approaches learn every task from scratch. One approach for leveraging prior knowledge is to transfer skills learned on prior tasks to the new task. However, as the amount of prior experience increases, the number of transferable skills grows too, making it challenging to explore the full set of available skills during downstream learning. Yet, intuitively, not all skills should be explored with equal probability; for example, information about the current state can hint at which skills are promising to explore. In this work, we propose to implement this intuition by learning a prior over skills. We propose a deep latent variable model that jointly learns an embedding space of skills and the skill prior from offline agent experience. We then extend common maximum-entropy RL approaches to use skill priors to guide downstream learning. We validate our approach, SPiRL (Skill-Prior RL), on complex navigation and robotic manipulation tasks and show that learned skill priors are essential for effective skill transfer from rich datasets. Videos and code are available at https://clvrai.com/spirl.
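The extension of maximum-entropy RL described in the abstract can be sketched in equations. This is a reading of the abstract, not a formula taken from this page: the notation z for a latent skill, p(z | s) for the learned skill prior, and the temperature α are assumed here for illustration.

```latex
% Standard maximum-entropy RL objective: reward plus an entropy bonus
% on the policy, weighted by a temperature alpha.
\[
J(\pi) = \mathbb{E}_{\pi}\Big[\textstyle\sum_t r_t
       + \alpha\,\mathcal{H}\big(\pi(a_t \mid s_t)\big)\Big]
\]
% Skill-prior variant as the abstract describes it: the policy selects
% latent skills z, and the entropy bonus is replaced by a KL penalty
% that keeps the policy close to the learned skill prior p(z | s).
\[
J(\pi) = \mathbb{E}_{\pi}\Big[\textstyle\sum_t r_t
       - \alpha\,D_{\mathrm{KL}}\big(\pi(z_t \mid s_t)\,\big\|\,p(z_t \mid s_t)\big)\Big]
\]
```

Since maximizing entropy is equivalent to minimizing the KL divergence to a uniform prior, the second objective reduces to the first when the learned prior is uninformative; a state-conditioned prior instead biases exploration toward skills that were useful in similar states in the offline data.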
