Paper Title
Bi-directional Domain Adaptation for Sim2Real Transfer of Embodied Navigation Agents
Authors
Abstract
Deep reinforcement learning models are notoriously data-hungry, yet real-world data is expensive and time-consuming to obtain. The solution that many have turned to is to train in simulation before deploying the robot in a real environment. Simulation makes it possible to train large numbers of robots in parallel and provides an abundance of data. However, no simulation is perfect, and robots trained solely in simulation fail to generalize to the real world, resulting in a "sim-vs-real gap". How can we overcome the trade-off between the abundance of less accurate, artificial data from simulators and the scarcity of reliable, real-world data? In this paper, we propose Bi-directional Domain Adaptation (BDA), a novel approach to bridge the sim-vs-real gap in both directions: real2sim to bridge the visual domain gap, and sim2real to bridge the dynamics domain gap. We demonstrate the benefits of BDA on the task of PointGoal Navigation. BDA with only 5k real-world (state, action, next-state) samples matches the performance of a policy fine-tuned with ~600k samples, resulting in a speed-up of ~120x.
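The ~120x figure quoted above follows directly from the two sample counts in the abstract. The short sketch below only reproduces that arithmetic; the function name `sample_speedup` and the constant names are hypothetical, and the counts are the approximate values stated in the abstract, not exact experimental numbers.

```python
def sample_speedup(baseline_samples: int, method_samples: int) -> float:
    """Ratio of real-world samples needed by a baseline versus a method.

    Hypothetical helper for illustration; it is not part of the paper's code.
    """
    if method_samples <= 0:
        raise ValueError("method_samples must be positive")
    return baseline_samples / method_samples


# Approximate counts taken from the abstract (both are rounded there).
FINETUNE_SAMPLES = 600_000  # real-world samples for the fine-tuned policy
BDA_SAMPLES = 5_000         # real-world (state, action, next-state) samples for BDA

speedup = sample_speedup(FINETUNE_SAMPLES, BDA_SAMPLES)
print(f"Sample-efficiency speed-up: ~{speedup:.0f}x")
```

Because both counts are rounded in the abstract, the ratio is best read as an order-of-magnitude statement about real-world data collection rather than a precise measurement.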