How to Close Sim-Real Gap? Transfer with Segmentation!

Paper title

How to Close Sim-Real Gap? Transfer with Segmentation!

Authors

Mengyuan Yan, Qingyun Sun, Iuri Frosio, Stephen Tyree, Jan Kautz

Abstract

A fundamental difficulty in robotic learning is the sim-real gap. In this work, we propose using segmentation as the interface between perception and control, serving as a domain-invariant state representation. We identify two sources of the sim-real gap: the dynamics sim-real gap and the visual sim-real gap. To close the dynamics sim-real gap, we propose using closed-loop control. For complex tasks with segmentation mask input, we further propose learning a closed-loop, model-free control policy with a deep neural network using imitation learning. To close the visual sim-real gap, we propose learning a perception model for the real environment using simulated targets composed with real background images, without any real-world supervision. We demonstrate this methodology on an eye-in-hand grasping task. We train, in simulation, a closed-loop control policy model that takes the segmentation as input. We show that this control policy transfers from simulation to the real environment. The closed-loop control policy is not only robust to discrepancies between the dynamic models of the simulated and real robots, but also generalizes to unseen scenarios where the target is moving, and even learns to recover from failures. We train the perception segmentation model using training data generated by composing real background images with simulated images of the target. Combining the control policy learned in simulation with the perception model, we achieve an impressive $\bf{88\%}$ success rate in grasping a tiny sphere with a real robot.
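The abstract describes generating perception training data by composing real background images with simulated images of the target, but gives no implementation details. The following is only a minimal sketch of that idea, assuming a hard binary target mask and a rendered crop that fits entirely inside the background; the function name `composite_target` and all parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def composite_target(background, target_rgb, target_mask, top, left):
    """Paste a rendered target crop onto a real background image.

    background:  (H, W, 3) uint8 real photo
    target_rgb:  (h, w, 3) uint8 simulated render of the target
    target_mask: (h, w) bool, True where the render shows the target
    top, left:   placement of the crop (assumed to fit inside the background)
    Returns the composed image and its (H, W) boolean segmentation label.
    """
    h, w = target_mask.shape
    image = background.copy()
    label = np.zeros(background.shape[:2], dtype=bool)

    # Basic slicing returns a view, so writing to `region` updates `image`.
    region = image[top:top + h, left:left + w]
    region[target_mask] = target_rgb[target_mask]
    label[top:top + h, left:left + w] = target_mask
    return image, label


# Toy usage: random "photo" background and a plus-shaped white target.
rng = np.random.default_rng(0)
background = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
target_rgb = np.full((3, 3, 3), 255, dtype=np.uint8)
target_mask = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=bool)
image, label = composite_target(background, target_rgb, target_mask, top=2, left=4)
```

Because the label is produced by the same operation that places the target, each composed image comes with its segmentation mask for free, which is consistent with the abstract's claim that no real-world supervision is needed.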
