Paper Title

Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations

Paper Authors

Sayak Mukherjee, He Bai, Aranya Chakrabortty

Paper Abstract

We present a set of model-free, reduced-dimensional reinforcement learning (RL) based optimal control designs for linear time-invariant singularly perturbed (SP) systems. We first present a state-feedback and output-feedback based RL control design for a generic SP system with unknown state and input matrices. We take advantage of the underlying time-scale separation property of the plant to learn a linear quadratic regulator (LQR) for only its slow dynamics, thereby saving a significant amount of learning time compared to the conventional full-dimensional RL controller. We analyze the sub-optimality of the design using SP approximation theorems and provide sufficient conditions for closed-loop stability. Thereafter, we extend both designs to clustered multi-agent consensus networks, where the SP property reflects through clustering. We develop both centralized and cluster-wise block-decentralized RL controllers for such networks, in reduced dimensions. We demonstrate the details of the implementation of these controllers using simulations of relevant numerical examples and compare them with conventional RL designs to show the computational benefits of our approach.
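For context on the reduction described in the abstract, the sketch below illustrates the standard singular perturbation (slow-fast) approximation of a linear time-invariant plant and solves the LQR on the slow subsystem alone. This is a hypothetical, model-based illustration only: the paper's designs are model-free and learn the slow-dynamics controller from data when the state and input matrices are unknown, so every matrix, dimension, and weight below is an assumption chosen for demonstration.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical plant in standard singular perturbation form:
#   x_dot     = A11 x + A12 z + B1 u   (slow states, dimension n1)
#   eps z_dot = A21 x + A22 z + B2 u   (fast states, dimension n2)
rng = np.random.default_rng(0)
n1, n2, m = 4, 8, 2
A11 = rng.standard_normal((n1, n1))
A12 = rng.standard_normal((n1, n2))
A21 = rng.standard_normal((n2, n1))
# Assumed Hurwitz: the fast dynamics must be stable for the reduction to hold.
A22 = -5.0 * np.eye(n2) + rng.standard_normal((n2, n2))
B1 = rng.standard_normal((n1, m))
B2 = rng.standard_normal((n2, m))

# Slow reduced model: let eps -> 0, so z ~= -A22^{-1} (A21 x + B2 u), giving
#   x_dot = A0 x + B0 u  with the matrices below.
A22_inv = np.linalg.inv(A22)
A0 = A11 - A12 @ A22_inv @ A21
B0 = B1 - A12 @ A22_inv @ B2

# LQR solved on the n1-dimensional slow model instead of the full
# (n1 + n2)-dimensional plant.
Q0, R = np.eye(n1), np.eye(m)
P = solve_continuous_are(A0, B0, Q0, R)
K_slow = np.linalg.solve(R, B0.T @ P)  # slow feedback: u = -K_slow @ x
print("reduced Riccati dimension:", P.shape, "vs full:", (n1 + n2, n1 + n2))
```

The computational saving described in the abstract comes from this size gap: the Riccati solve, and in the model-free setting the amount of exploration data to collect, scales with the slow dimension n1 rather than the full dimension n1 + n2, at the cost of the sub-optimality the authors bound using SP approximation theorems.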
