Complex Robotic Manipulation via Graph-Based Hindsight Goal Generation

Paper Title

Complex Robotic Manipulation via Graph-Based Hindsight Goal Generation

Authors

Zhenshan Bing, Matthias Brucker, Fabrice O. Morin, Kai Huang, Alois Knoll

Abstract

Reinforcement learning algorithms such as hindsight experience replay (HER) and hindsight goal generation (HGG) have been able to solve challenging robotic manipulation tasks in multi-goal settings with sparse rewards. HER achieves its training success through hindsight replays of past experience with heuristic goals, but underperforms in challenging tasks in which goals are difficult to explore. HGG enhances HER by selecting intermediate goals that are easy to achieve in the short term and promising to lead to target goals in the long term. This guided exploration makes HGG applicable to tasks in which target goals are far away from the object's initial position. However, HGG is not applicable to manipulation tasks with obstacles because the Euclidean metric used for HGG is not an accurate distance metric in such environments. In this paper, we propose graph-based hindsight goal generation (G-HGG), an extension of HGG that selects hindsight goals based on shortest distances in an obstacle-avoiding graph, which is a discrete representation of the environment. We evaluated G-HGG on four challenging manipulation tasks with obstacles, where it shows significant improvements in both sample efficiency and overall success rate over HGG and HER. Videos can be viewed at https://sites.google.com/view/demos-g-hgg/.
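
The gap between a Euclidean metric and a shortest-path distance in an obstacle-avoiding graph can be illustrated with a small example. The following is a minimal sketch and not the paper's implementation: it assumes a 2D, 4-connected grid with unit edge costs, and the function name `graph_distance` and the grid layout are hypothetical choices made for illustration only.

```python
import math
from collections import deque


def graph_distance(blocked, start, goal, shape):
    """Shortest path length between two cells of a 4-connected grid.

    `blocked` is a set of (row, col) obstacle cells and `shape` is
    (rows, cols). Returns math.inf if the goal cannot be reached.
    Illustrative only: a toy stand-in for an obstacle-avoiding graph.
    """
    rows, cols = shape
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            return dist[cell]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in blocked and nxt not in dist:
                dist[nxt] = dist[cell] + 1  # unit cost per grid step
                queue.append(nxt)
    return math.inf


# A wall in column 2 covers rows 0 to 3; only the bottom row is open.
wall = {(r, 2) for r in range(4)}
start, goal = (0, 0), (0, 4)

print(math.dist(start, goal))                       # 4.0
print(graph_distance(wall, start, goal, (5, 5)))    # 12
```

In this toy layout the goal is 4 units away in a straight line, but the object must travel 12 grid steps around the wall. A goal selector that ranks candidate goals by straight-line distance would therefore treat cells just behind the wall as nearly solved, which is the inaccuracy the abstract attributes to the Euclidean metric.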
