Mask-based Latent Reconstruction for Reinforcement Learning

Paper Title

Mask-based Latent Reconstruction for Reinforcement Learning

Authors

Tao Yu, Zhizheng Zhang, Cuiling Lan, Yan Lu, Zhibo Chen

Abstract

For deep reinforcement learning (RL) from pixels, learning effective state representations is crucial for achieving high performance. However, in practice, limited experience and high-dimensional inputs prevent effective representation learning. To address this, motivated by the success of mask-based modeling in other research fields, we introduce mask-based reconstruction to promote state representation learning in RL. Specifically, we propose a simple yet effective self-supervised method, Mask-based Latent Reconstruction (MLR), to predict complete state representations in the latent space from the observations with spatially and temporally masked pixels. MLR enables better use of context information when learning state representations to make them more informative, which facilitates the training of RL agents. Extensive experiments show that our MLR significantly improves the sample efficiency in RL and outperforms the state-of-the-art sample-efficient RL methods on multiple continuous and discrete control benchmarks. Our code is available at https://github.com/microsoft/Mask-based-Latent-Reconstruction.
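
The spatial and temporal masking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). The helper names, the space-time cube size, the mask ratio, the single-channel frames, and the mean-squared-error distance between latents are all assumptions made for illustration only.

```python
import random


def make_cube_mask(num_frames, height, width, cube=(2, 4, 4), mask_ratio=0.5, seed=0):
    """Return a nested list mask[t][y][x]: 1 = pixel kept, 0 = pixel masked.

    The stack of observations is split into space-time cubes of size
    cube = (frames, rows, cols), and a random subset of cubes is masked.
    Cube size and mask ratio are illustrative assumptions.
    """
    ct, ch, cw = cube
    if num_frames % ct or height % ch or width % cw:
        raise ValueError("cube size must divide the observation dimensions")
    rng = random.Random(seed)
    cubes = [(t, y, x)
             for t in range(0, num_frames, ct)
             for y in range(0, height, ch)
             for x in range(0, width, cw)]
    num_masked = int(round(mask_ratio * len(cubes)))
    mask = [[[1] * width for _ in range(height)] for _ in range(num_frames)]
    for t0, y0, x0 in rng.sample(cubes, num_masked):
        for t in range(t0, t0 + ct):
            for y in range(y0, y0 + ch):
                for x in range(x0, x0 + cw):
                    mask[t][y][x] = 0
    return mask


def apply_mask(frames, mask):
    """Zero out the masked pixels of frames[t][y][x] (single channel for brevity)."""
    return [[[p * m for p, m in zip(row, mrow)]
             for row, mrow in zip(frame, mframe)]
            for frame, mframe in zip(frames, mask)]


def latent_distance(predicted, target):
    """Mean squared error between two latent vectors (a placeholder objective)."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)
```

In a setup like the one the abstract describes, an encoder would see the masked frames, and its predicted latents would be compared against latents computed from the unmasked frames. The encoder and predictor themselves are omitted here.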

Paper Title