Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning

Paper Title

Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning

Authors

Zhu, Guangxiang; Zhang, Minghao; Lee, Honglak; Zhang, Chongjie

Abstract

Sample efficiency has been one of the major challenges for deep reinforcement learning. Recently, model-based reinforcement learning has been proposed to address this challenge by performing planning on imaginary trajectories with a learned world model. However, world model learning may suffer from overfitting to training trajectories, and thus model-based value estimation and policy search are prone to getting stuck in an inferior local policy. In this paper, we propose a novel model-based reinforcement learning algorithm, called BrIdging Reality and Dream (BIRD). It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories. We demonstrate that our approach improves the sample efficiency of model-based planning and achieves state-of-the-art performance on challenging visual control benchmarks.

Paper Title