Paper Title

GAN-based Intrinsic Exploration For Sample Efficient Reinforcement Learning

Authors

Doğay Kamar, Nazım Kemal Üre, Gözde Ünal

Abstract

In this study, we address the problem of efficient exploration in reinforcement learning. The most common exploration approaches depend on random action selection; however, these approaches do not work well in environments with sparse or no rewards. We propose a Generative Adversarial Network-based Intrinsic Reward Module that learns the distribution of the observed states and emits an intrinsic reward that is high for out-of-distribution states, in order to lead the agent toward unexplored states. We evaluate our approach in Super Mario Bros for a no-reward setting and in Montezuma's Revenge for a sparse-reward setting, and show that our approach is indeed capable of exploring efficiently. We discuss a few weaknesses and conclude by discussing future work.
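The core idea in the abstract can be illustrated with a minimal sketch: train a GAN on the states the agent has observed, and use the discriminator's confidence as a novelty signal, so states the discriminator does not recognize as "real" earn a high intrinsic reward. This is a hedged illustration under assumed shapes and hyperparameters, not the authors' actual architecture; all class and method names (`IntrinsicRewardModule`, `update`, `reward`) are hypothetical.

```python
# Illustrative sketch of a GAN-based intrinsic reward module.
# Assumption: states are flat float vectors; the paper's actual networks,
# losses, and reward scaling may differ.
import torch
import torch.nn as nn


class IntrinsicRewardModule:
    def __init__(self, state_dim, noise_dim=8, hidden=32, lr=1e-3):
        self.noise_dim = noise_dim
        # Generator maps noise to fake states; discriminator scores realness.
        self.G = nn.Sequential(nn.Linear(noise_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, state_dim))
        self.D = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, 1))
        self.opt_g = torch.optim.Adam(self.G.parameters(), lr=lr)
        self.opt_d = torch.optim.Adam(self.D.parameters(), lr=lr)
        self.bce = nn.BCEWithLogitsLoss()

    def update(self, states):
        """One GAN training step on a batch of observed states."""
        n = states.shape[0]
        fake = self.G(torch.randn(n, self.noise_dim))
        # Discriminator: observed states -> 1, generated states -> 0.
        d_loss = (self.bce(self.D(states), torch.ones(n, 1)) +
                  self.bce(self.D(fake.detach()), torch.zeros(n, 1)))
        self.opt_d.zero_grad()
        d_loss.backward()
        self.opt_d.step()
        # Generator tries to fool the discriminator.
        g_loss = self.bce(self.D(fake), torch.ones(n, 1))
        self.opt_g.zero_grad()
        g_loss.backward()
        self.opt_g.step()

    @torch.no_grad()
    def reward(self, states):
        # Low discriminator confidence means the state looks unlike the
        # observed distribution, so the exploration bonus is high.
        return 1.0 - torch.sigmoid(self.D(states)).squeeze(-1)
```

In use, the module is updated on each batch of visited states, and `reward` is added to (or substituted for) the environment reward, so the agent is driven toward states the discriminator has not yet learned to recognize.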
