Paper Title
Risk Perspective Exploration in Distributional Reinforcement Learning
Paper Authors
Paper Abstract
Distributional reinforcement learning achieves state-of-the-art performance in continuous and discrete control settings and exposes two features, variance and risk, that can be used for exploration. However, while numerous exploration methods in distributional RL use the variance of the per-action return distribution, exploration methods that use the risk property are hard to find. In this paper, we present risk scheduling approaches that explore risk levels and optimistic behaviors from a risk perspective. Comprehensive experiments in a multi-agent setting show that risk scheduling improves the performance of the DMIX algorithm.
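The abstract does not say how the risk level is scheduled or which risk measure is used, so the following is only a minimal sketch of what risk-based exploration could look like, not the paper's method. It assumes each action's return distribution is given as a list of quantile values, that optimism means averaging the top `alpha` fraction of those quantiles, and that `alpha` is annealed linearly toward the risk-neutral value 1.0. The function names, the linear schedule, and the default values are all hypothetical.

```python
# Illustrative sketch only: the schedule shape, the risk measure, and all names
# below are assumptions, not details taken from the paper.
from typing import Sequence


def linear_risk_schedule(step: int, total_steps: int,
                         start: float = 0.25, end: float = 1.0) -> float:
    """Hypothetical schedule: anneal the risk level alpha linearly from start to end."""
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return start + frac * (end - start)


def upper_tail_mean(quantiles: Sequence[float], alpha: float) -> float:
    """Mean of the top alpha fraction of quantile values (optimistic, risk-seeking).

    alpha = 1.0 averages every quantile, which recovers the risk-neutral mean.
    """
    ordered = sorted(quantiles, reverse=True)
    k = max(1, round(alpha * len(ordered)))  # always keep at least one quantile
    return sum(ordered[:k]) / k


def select_action(action_quantiles: Sequence[Sequence[float]], alpha: float) -> int:
    """Pick the action whose return distribution scores highest at risk level alpha."""
    scores = [upper_tail_mean(q, alpha) for q in action_quantiles]
    return max(range(len(scores)), key=scores.__getitem__)
```

Under these assumptions, a small `alpha` early in training favors actions with a large upside even when their mean is lower, and the preference shifts to the mean as `alpha` approaches 1.0. For example, with `action_quantiles = [[0, 0, 0, 10], [3, 3, 3, 3]]`, `select_action` chooses the first action at `alpha = 0.25` and the second at `alpha = 1.0`.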