Paper title
Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals
Paper authors
Paper abstract
Learning robot manipulation through deep reinforcement learning in environments with sparse rewards is a challenging task. In this paper, we address this problem by introducing a notion of imaginary object goals. For a given manipulation task, the object of interest is first trained to reach a desired target position on its own, without being manipulated, through physically realistic simulations. The object policy is then leveraged to build a predictive model of plausible object trajectories, providing the robot with a curriculum of incrementally more difficult object goals to reach during training. The proposed algorithm, Follow the Object (FO), has been evaluated on 7 MuJoCo environments requiring an increasing degree of exploration, and has achieved higher success rates than alternative algorithms. In particularly challenging learning scenarios, e.g. where the object's initial and target positions are far apart, our approach can still learn a policy whereas competing methods currently fail.
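The curriculum idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rollout_object_trajectory` and `curriculum_goal` are hypothetical, the learned object policy is replaced by straight-line interpolation, and the rule that ties goal difficulty to the robot's recent success rate is an assumption made only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def rollout_object_trajectory(start: Point, target: Point, n_steps: int) -> List[Point]:
    """Return waypoints from start towards target.

    Assumption: a straight line stands in for the trajectory that a
    trained object policy would produce in simulation.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return [
        tuple(s + (t - s) * k / n_steps for s, t in zip(start, target))
        for k in range(1, n_steps + 1)
    ]


def curriculum_goal(trajectory: Sequence[Point], success_rate: float) -> Point:
    """Pick an intermediate object goal along the imagined trajectory.

    Assumption: the goal moves further along the trajectory as the
    robot's recent success rate (between 0 and 1) increases.
    """
    rate = min(max(success_rate, 0.0), 1.0)
    index = min(int(rate * len(trajectory)), len(trajectory) - 1)
    return trajectory[index]


# Usage: an object that should move 1 m along the x axis.
trajectory = rollout_object_trajectory((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), n_steps=4)
easy_goal = curriculum_goal(trajectory, success_rate=0.0)   # first waypoint
final_goal = curriculum_goal(trajectory, success_rate=1.0)  # true target
```

In this sketch the robot starts with goals close to the object's initial position and only faces the true target once earlier goals are reached reliably, which is one plausible reading of "incrementally more difficult object goals".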