Model-Free Optimal Control of Linear Multi-Agent Systems via Decomposition and Hierarchical Approximation

Paper Title

Model-Free Optimal Control of Linear Multi-Agent Systems via Decomposition and Hierarchical Approximation

Authors

Jing, Gangshan; Bai, He; George, Jemin; Chakrabortty, Aranya

Abstract

Designing the optimal linear quadratic regulator (LQR) for a large-scale multi-agent system (MAS) is time-consuming since it involves solving a large-size matrix Riccati equation. The situation is further exacerbated when the design needs to be done in a model-free way using schemes such as reinforcement learning (RL). To reduce this computational complexity, we decompose the large-scale LQR design problem into multiple smaller-size LQR design problems. We consider the objective function to be specified over an undirected graph, and cast the decomposition as a graph clustering problem. The graph is decomposed into two parts, one consisting of independent clusters of connected components, and the other containing edges that connect different clusters. Accordingly, the resulting controller has a hierarchical structure, consisting of two components. The first component optimizes the performance of each independent cluster by solving the smaller-size LQR design problem in a model-free way using an RL algorithm. The second component accounts for the objective coupling different clusters, which is achieved by solving a least squares problem in one shot. Although suboptimal, the hierarchical controller adheres to a particular structure as specified by inter-agent couplings in the objective function and by the decomposition strategy. Mathematical formulations are established to find a decomposition that minimizes the number of required communication links or reduces the optimality gap. Numerical simulations are provided to highlight the pros and cons of the proposed designs.
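The graph decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the cluster assignment is already given, and the function name `decompose_graph`, its arguments, and the six-agent path graph are all hypothetical, chosen only for illustration. The sketch splits the edges of an undirected objective graph into those inside a cluster and those connecting different clusters.

```python
from collections import defaultdict


def decompose_graph(edges, cluster_of):
    """Split an undirected edge list into intra-cluster and inter-cluster parts.

    edges: iterable of (i, j) agent pairs coupled in the objective (hypothetical input).
    cluster_of: dict mapping each agent to a cluster label (hypothetical input).
    """
    intra = defaultdict(list)  # cluster label -> edges inside that cluster
    inter = []                 # edges joining two different clusters
    for i, j in edges:
        if cluster_of[i] == cluster_of[j]:
            intra[cluster_of[i]].append((i, j))
        else:
            inter.append((i, j))
    return dict(intra), inter


# Illustrative data: six agents on a path graph, split into two clusters of three.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
cluster_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
intra, inter = decompose_graph(edges, cluster_of)
print(intra)  # edges grouped per cluster
print(inter)  # edges that couple different clusters
```

In terms of the abstract, each group in `intra` would correspond to one of the smaller-size LQR design problems handled by the first controller component, while the edges in `inter` represent the objective coupling between clusters that the second component accounts for. How the clusters themselves are chosen is the subject of the paper's mathematical formulations and is not attempted here.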