Paper Title
Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning
Paper Authors
Paper Abstract
Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, it has been challenging to develop and apply such models to practical robotic manipulation due to high-dimensional sensory observations such as images. Previous approaches to learning models in the context of robotic manipulation have either learned whole-image dynamics or used autoencoders to learn dynamics in a low-dimensional latent state. In this work, we introduce model-based prediction with self-supervised visual correspondence learning, and show that not only is this possible, but that these types of predictive models deliver compelling performance improvements over alternative methods for vision-based RL that rely on autoencoder-type visual training. Through simulation experiments, we demonstrate that our models provide better generalization precision, particularly in 3D scenes, in scenes involving occlusion, and in category generalization. Additionally, we validate through hardware experiments that our method transfers effectively to the real world. Videos and supplementary materials are available at https://sites.google.com/view/keypointsintothefuture
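To make the core idea concrete, below is a minimal, illustrative sketch (in PyTorch, not the authors' released code) of a dynamics model that operates on keypoint positions obtained from self-supervised visual correspondence, rather than on raw images or autoencoder latents. The class name, keypoint count, action dimension, and network sizes are all assumptions made for illustration.

```python
# Illustrative sketch only: a single-step forward model over keypoint states.
# All names, shapes, and hyperparameters below are assumptions, not the paper's implementation.
import torch
import torch.nn as nn


class KeypointDynamicsModel(nn.Module):
    """Predicts the next set of keypoint positions from the current keypoints and the robot action."""

    def __init__(self, num_keypoints: int = 16, action_dim: int = 4, hidden: int = 256):
        super().__init__()
        state_dim = num_keypoints * 3  # flattened (x, y, z) per keypoint
        self.num_keypoints = num_keypoints
        self.mlp = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, keypoints: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # keypoints: (B, K, 3), action: (B, action_dim)
        flat = keypoints.flatten(start_dim=1)
        delta = self.mlp(torch.cat([flat, action], dim=-1))
        # Predict a residual so the identity (no motion) is easy to represent.
        return keypoints + delta.view(-1, self.num_keypoints, 3)


if __name__ == "__main__":
    model = KeypointDynamicsModel()
    kp_t = torch.randn(8, 16, 3)     # keypoints tracked via visual correspondence at time t
    a_t = torch.randn(8, 4)          # robot action at time t
    kp_next_pred = model(kp_t, a_t)  # predicted keypoints at time t+1
    # Training would regress the prediction onto keypoints observed at t+1;
    # random targets are used here only to show the loss computation.
    loss = nn.functional.mse_loss(kp_next_pred, torch.randn(8, 16, 3))
    loss.backward()
    print(kp_next_pred.shape, float(loss))
```

In a model-based RL or planning loop, such a forward model would be rolled out over candidate action sequences and the predicted keypoint trajectories scored against a goal keypoint configuration; the details of that loop are outside this sketch.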