Recursive Least Squares Policy Control with Echo State Network

Paper Title

Recursive Least Squares Policy Control with Echo State Network

Authors

Zhang, Chunyuan; Liu, Chao; Song, Qi; Zhao, Jie

Abstract

The echo state network (ESN) is a special type of recurrent neural network for processing time-series data. However, because the sequential samples collected by an agent are strongly correlated, ESN-based policy control algorithms have difficulty using the recursive least squares (RLS) algorithm to update the ESN's parameters. To solve this problem, we propose two novel policy control algorithms, ESNRLS-Q and ESNRLS-Sarsa. First, to reduce the correlation of training samples, we use the leaky integrator ESN and the mini-batch learning mode. Second, to make RLS suitable for training the ESN in mini-batch mode, we present a new mean-approximation method for updating the RLS correlation matrix. Third, to prevent the ESN from overfitting, we use the L1 regularization technique. Lastly, to prevent overestimation of the target state-action value, we employ the Mellowmax method. Simulation results show that our algorithms have good convergence performance.
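The abstract names several building blocks without giving formulas. As a rough illustration only, the sketch below shows textbook forms of three of them: a leaky-integrator reservoir update, a plain per-sample RLS update, and the Mellowmax operator. The function names, the parameters (`leak`, `forgetting`, `omega`) and their default values are assumptions made for this example. In particular, `rls_update` is the standard per-sample RLS rule, not the paper's mean-approximation mini-batch variant, and the L1 regularization step is not shown.

```python
import math


def leaky_esn_step(state, inputs, w_in, w_res, leak=0.3):
    """One leaky-integrator reservoir update (textbook form):
    x' = (1 - leak) * x + leak * tanh(W_in u + W_res x)."""
    new_state = []
    for i in range(len(state)):
        pre = sum(w_in[i][j] * inputs[j] for j in range(len(inputs)))
        pre += sum(w_res[i][k] * state[k] for k in range(len(state)))
        new_state.append((1.0 - leak) * state[i] + leak * math.tanh(pre))
    return new_state


def rls_update(weights, p_matrix, features, target, forgetting=1.0):
    """Standard per-sample RLS step for a linear readout y = w . x.
    Assumes p_matrix is symmetric (e.g. initialised as delta * I)."""
    n = len(features)
    # P x, reused for both the gain vector and the P update
    p_x = [sum(p_matrix[i][j] * features[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(features[i] * p_x[i] for i in range(n))
    gain = [v / denom for v in p_x]
    error = target - sum(weights[i] * features[i] for i in range(n))
    new_weights = [weights[i] + gain[i] * error for i in range(n)]
    new_p = [
        [(p_matrix[i][j] - gain[i] * p_x[j]) / forgetting for j in range(n)]
        for i in range(n)
    ]
    return new_weights, new_p


def mellowmax(values, omega=5.0):
    """Mellowmax: log(mean(exp(omega * v))) / omega, with omega != 0.
    The maximum is subtracted before exponentiating to avoid overflow."""
    shift = max(omega * v for v in values)
    total = sum(math.exp(omega * v - shift) for v in values)
    return (shift + math.log(total / len(values))) / omega
```

For positive `omega`, `mellowmax` lies between the mean and the maximum of its inputs, which is why it can serve as a softer replacement for the hard max when forming a target state-action value.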
