Paper Title

Weighted Entropy Modification for Soft Actor-Critic

Authors

Yizhou Zhao, Song-Chun Zhu

Abstract

We generalize the existing principle of maximum Shannon entropy in reinforcement learning (RL) to weighted entropy by characterizing state-action pairs with qualitative weights, which can be connected with prior knowledge, experience replay, and the evolution process of the policy. We propose an algorithm motivated by self-balancing exploration with the introduced weight function, which leads to state-of-the-art performance on MuJoCo tasks despite its simplicity of implementation.
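To make the idea of a weighted entropy concrete, the following is a minimal sketch that assumes the standard discrete form, H_w(p) = -sum_i w_i p_i log p_i. The abstract does not specify the paper's weight function, so the function name `weighted_entropy`, the three-action policy, and the weight values below are all hypothetical and chosen only for illustration. With all weights equal to 1, the quantity reduces to the ordinary Shannon entropy.

```python
import math


def weighted_entropy(probs, weights):
    """Discrete weighted entropy: -sum_i w_i * p_i * log(p_i).

    Assumption: this is the textbook weighted-entropy form, not
    necessarily the exact objective used in the paper.
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    # Terms with p_i == 0 are skipped: p * log(p) tends to 0 as p -> 0.
    return -sum(w * p * math.log(p) for p, w in zip(probs, weights) if p > 0.0)


# Hypothetical policy over three actions in a single state.
policy = [0.5, 0.25, 0.25]

# Uniform weights recover the ordinary Shannon entropy.
uniform_weights = [1.0, 1.0, 1.0]

# Illustrative, made-up weights that emphasize the third action.
example_weights = [0.5, 1.0, 2.0]

shannon = weighted_entropy(policy, uniform_weights)
weighted = weighted_entropy(policy, example_weights)
```

In this sketch, raising the weight of a state-action pair increases its contribution to the entropy term, which is one way such weights could steer exploration toward particular pairs.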
