Paper Title

Understanding the Complexity Gains of Single-Task RL with a Curriculum

Paper Authors

Qiyang Li, Yuexiang Zhai, Yi Ma, Sergey Levine

Paper Abstract

Reinforcement learning (RL) problems can be challenging without well-shaped rewards. Prior work on provably efficient RL methods generally proposes to address this issue with dedicated exploration strategies. However, another way to tackle this challenge is to reformulate it as a multi-task RL problem, where the task space contains not only the challenging task of interest but also easier tasks that implicitly function as a curriculum. Such a reformulation opens up the possibility of running existing multi-task RL methods as a more efficient alternative to solving a single challenging task from scratch. In this work, we provide a theoretical framework that reformulates a single-task RL problem as a multi-task RL problem defined by a curriculum. Under mild regularity conditions on the curriculum, we show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies. We also show that our theoretical insights can be translated into an effective practical learning algorithm that can accelerate curriculum learning on simulated robotic tasks.
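
To make the core idea concrete, here is a minimal sketch of sequentially solving a curriculum, not the paper's actual algorithm or theoretical setting. It uses plain tabular Q-learning on a sparse-reward chain, where each task places the goal farther from the start and the Q-table learned for one task warm-starts the next; all names (N_STATES, run_q_learning, the curriculum of goals) are our own illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration: a curriculum of goal-reaching tasks on a 1-D chain.
# Each task moves the sparse-reward goal farther from the start state; the
# Q-table learned for task k initializes task k+1, so each stage only needs to
# explore the "new" part of the state space rather than starting from scratch.

N_STATES = 20          # chain of states 0 .. N_STATES-1; the agent starts at 0
ACTIONS = [-1, +1]     # move left / move right

def run_q_learning(goal, q, episodes=200, alpha=0.5, gamma=0.95, eps=0.1):
    """Epsilon-greedy tabular Q-learning toward a single sparse-reward goal."""
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        s = 0
        for _ in range(4 * N_STATES):  # per-episode step limit
            a = rng.integers(2) if rng.random() < eps else int(np.argmax(q[s]))
            s2 = int(np.clip(s + ACTIONS[a], 0, N_STATES - 1))
            r = 1.0 if s2 == goal else 0.0        # reward only at the goal
            q[s, a] += alpha * (r + gamma * np.max(q[s2]) - q[s, a])
            s = s2
            if s == goal:
                break
    return q

# Curriculum: goals progressively farther from the start state.
curriculum = [5, 10, 15, N_STATES - 1]
q = np.zeros((N_STATES, len(ACTIONS)))
for goal in curriculum:
    q = run_q_learning(goal, q)  # warm-start from the previous task's solution
```

Solving only the final goal from a zero-initialized Q-table would require epsilon-greedy exploration to stumble on a distant sparse reward; under the curriculum, each stage's optimal behavior is a small perturbation of the previous one, which is the intuition behind the paper's regularity conditions.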
