Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network

Paper Title

Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network

Authors

Wang, Xing; Vinel, Alexander

Abstract

Existing exploration strategies in reinforcement learning (RL) often either ignore the history or feedback of search, or are complicated to implement. There is also very limited literature showing their effectiveness over diverse domains. We propose an algorithm based on the idea of reannealing that aims at encouraging exploration only when it is needed, for example, when the algorithm detects that the agent is stuck in a local optimum. The approach is simple to implement. We perform an illustrative case study showing that it has the potential to both accelerate training and obtain a better policy.
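The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an epsilon-greedy schedule with exponential decay, and it assumes a hypothetical stagnation heuristic: if the moving average of episode returns has not improved for `patience` episodes, epsilon is raised back to `reanneal_to`. The class name, all parameter names, and the heuristic itself are illustrative assumptions.

```python
from collections import deque


class ReannealingEpsilon:
    """Decaying epsilon that is raised again (reannealed) when progress stalls.

    Illustrative sketch only: the stall test below (no improvement in the
    moving-average return for `patience` episodes) is an assumption, not the
    heuristic measure used in the paper.
    """

    def __init__(self, eps_start=1.0, eps_min=0.05, decay=0.99,
                 reanneal_to=0.5, window=20, patience=50, tol=1e-3):
        self.epsilon = eps_start
        self.eps_min = eps_min
        self.decay = decay
        self.reanneal_to = reanneal_to
        self.patience = patience
        self.tol = tol
        self.returns = deque(maxlen=window)  # recent episode returns
        self.best_avg = float("-inf")        # best moving average seen so far
        self.stalled = 0                     # episodes without improvement
        self.reanneal_count = 0

    def update(self, episode_return):
        """Record one episode return and return the epsilon for the next episode."""
        self.returns.append(episode_return)
        avg = sum(self.returns) / len(self.returns)

        if avg > self.best_avg + self.tol:
            self.best_avg = avg
            self.stalled = 0
        else:
            self.stalled += 1

        if self.stalled >= self.patience:
            # Assumed "stuck" signal: push exploration back up, then decay again.
            self.epsilon = max(self.epsilon, self.reanneal_to)
            self.stalled = 0
            self.reanneal_count += 1
        else:
            # Ordinary exponential decay, bounded below by eps_min.
            self.epsilon = max(self.eps_min, self.epsilon * self.decay)
        return self.epsilon
```

In a DQN training loop, `update` would be called once per episode, and the returned value would be used as the probability of taking a random action in the next episode. Exploration then increases only when the assumed stall signal fires, rather than following a fixed schedule.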
