Paper Title

Stability-Certified Reinforcement Learning via Spectral Normalization

Paper Authors

Ryoichi Takase, Nobuyuki Yoshikawa, Toshisada Mariyama, Takeshi Tsuchiya

Paper Abstract

In this article, two methods based on spectral normalization, approached from different perspectives, are described for ensuring the stability of a system controlled by a neural network. In the first, the L2 gain of the feedback system is bounded to be less than 1 to satisfy the stability condition derived from the small-gain theorem. Although it explicitly includes the stability condition, the first method may yield insufficient performance of the neural network controller because of its strict stability condition. To overcome this difficulty, the second method is proposed, which improves performance while ensuring local stability with a larger region of attraction. In the second method, stability is ensured by solving linear matrix inequalities after training the neural network controller. The spectral normalization proposed in this article improves the feasibility of the a-posteriori stability test by constructing tighter local sectors. Numerical experiments show that the second method provides sufficient performance compared with the first one, while ensuring stability in contrast to existing reinforcement learning algorithms.
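To make the first method concrete, below is a minimal NumPy sketch of layer-wise spectral normalization: each weight matrix is rescaled so that the product of the layers' spectral norms stays below 1, which bounds the controller's L2 gain (Lipschitz constant) when the activations are 1-Lipschitz (e.g. ReLU, tanh). The function names (`spectral_norm`, `normalize_layers`), the power-iteration estimator, and the even per-layer split of the gain budget are illustrative assumptions, not the authors' training procedure.

```python
import numpy as np

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W via power iteration."""
    u = np.random.default_rng(0).standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    return float(u @ W @ v)

def normalize_layers(weights, gamma=0.99):
    """Rescale each weight matrix so that the product of spectral norms is
    at most gamma (< 1), giving a network-level L2-gain bound when the
    activations are 1-Lipschitz."""
    per_layer = gamma ** (1.0 / len(weights))   # equal share of the gain budget
    return [W * min(1.0, per_layer / spectral_norm(W)) for W in weights]

# Example: a controller with two hidden layers and random weights.
rng = np.random.default_rng(1)
weights = [rng.standard_normal((32, 4)),
           rng.standard_normal((32, 32)),
           rng.standard_normal((1, 32))]
normed = normalize_layers(weights)
print([round(spectral_norm(W), 3) for W in normed])   # each <= 0.99 ** (1/3)
```

In practice the rescaling would be applied during training (after each gradient step or inside the forward pass), so the gain bound holds for the policy that is actually deployed.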

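The second method instead certifies stability after training by solving linear matrix inequalities over local sector bounds on the trained network. As a simplified analogue of that a-posteriori test, the sketch below checks a standard discrete-time Lyapunov LMI for a given closed-loop matrix using CVXPY; the paper's actual test additionally encodes the sector conditions that spectral normalization tightens, and the choice of CVXPY with the SCS solver here is an assumption for illustration only.

```python
import numpy as np
import cvxpy as cp

def certify_stability(A, eps=1e-6):
    """Feasibility check of the discrete-time Lyapunov LMI
        P >> 0,  A^T P A - P << -eps * I,
    which certifies asymptotic stability of x_{k+1} = A x_k."""
    n = A.shape[0]
    P = cp.Variable((n, n), symmetric=True)
    constraints = [P >> eps * np.eye(n),
                   A.T @ P @ A - P << -eps * np.eye(n)]
    prob = cp.Problem(cp.Minimize(0), constraints)
    prob.solve(solver=cp.SCS)
    return prob.status == cp.OPTIMAL, P.value

# Example: a closed-loop matrix with spectral radius < 1 is certified.
A = np.array([[0.8, 0.2],
              [0.0, 0.5]])
ok, P = certify_stability(A)
print("certified:", ok)
```

If the LMI is infeasible, no certificate is produced; in the paper's setting, tighter local sectors obtained through spectral normalization make this feasibility test easier to pass.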