Paper title
Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning
Authors
Abstract
Training agents to autonomously learn how to use anthropomorphic robotic hands has the potential to lead to systems capable of performing a multitude of complex manipulation tasks in unstructured and uncertain environments. In this work, we first introduce a suite of challenging simulated manipulation tasks that current reinforcement learning and trajectory optimisation techniques find difficult. These include environments where two simulated hands have to pass or throw objects between each other, as well as an environment where the agent must learn to spin a long pen between its fingers. We then introduce a simple trajectory optimisation method that performs significantly better than existing methods on these environments. Finally, on the challenging PenSpin task, we combine sub-optimal demonstrations generated through trajectory optimisation with off-policy reinforcement learning, obtaining performance that far exceeds either of these approaches individually, effectively solving the environment. Videos of all of our results are available at: https://dexterous-manipulation.github.io/