Paper Title

Uniform State Abstraction For Reinforcement Learning

Paper Authors

John Burden, Daniel Kudenko

Paper Abstract

Potential Based Reward Shaping combined with a potential function based on appropriately defined abstract knowledge has been shown to significantly improve learning speed in Reinforcement Learning. MultiGrid Reinforcement Learning (MRL) has further shown that such abstract knowledge in the form of a potential function can be learned almost solely from agent interaction with the environment. However, we show that MRL does not extend well to Deep Learning. In this paper we extend and improve MRL to take advantage of modern Deep Learning algorithms such as Deep Q-Networks (DQN). We show that DQN augmented with our approach performs significantly better on continuous control tasks than its vanilla counterpart and than DQN augmented with MRL.
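
For readers unfamiliar with the shaping mechanism the abstract relies on, the following is a minimal, illustrative Python sketch of potential-based reward shaping (Ng et al., 1999), assuming the potential comes from a value estimate over coarse abstract states. The names abstract_state, phi, and shaped_reward, and the grid discretisation, are hypothetical stand-ins and not taken from the paper.

```python
import numpy as np

GAMMA = 0.99  # discount factor used in both learning and shaping


def abstract_state(state, bins=10, low=-1.0, high=1.0):
    """Map a continuous state vector to a coarse grid cell (illustrative only)."""
    clipped = np.clip(np.asarray(state, dtype=float), low, high)
    idx = ((clipped - low) / (high - low) * bins).astype(int)
    return tuple(np.minimum(idx, bins - 1))


# Hypothetical value table over abstract states, e.g. learned by a tabular
# method at the coarse level; it plays the role of the "abstract knowledge".
abstract_values = {}


def phi(state):
    """Potential of a concrete state = value estimate of its abstract state."""
    return abstract_values.get(abstract_state(state), 0.0)


def shaped_reward(reward, state, next_state, done):
    """Add the shaping term F(s, s') = gamma * phi(s') - phi(s) to the reward."""
    next_potential = 0.0 if done else GAMMA * phi(next_state)
    return reward + next_potential - phi(state)
```

Because the added term is a difference of potentials, it densifies the reward signal without changing which policy is optimal, which is why an agent trained on the shaped reward can learn faster without converging to a different solution.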
