Paper Title
Distributed Online Linear Quadratic Control for Linear Time-invariant Systems
Paper Authors
Paper Abstract
Classical linear quadratic (LQ) control centers around linear time-invariant (LTI) systems, where the control-state pairs incur a quadratic cost with time-invariant parameters. Recent advances in online optimization and control have provided novel tools to study LQ problems that are robust to time-varying cost parameters. Inspired by this line of research, we study the distributed online LQ problem for identical LTI systems. Consider a multi-agent network where each agent is modeled as an LTI system. The LTI systems are associated with decoupled, time-varying quadratic costs that are revealed sequentially. The goal of the network is to make the control sequence of all agents competitive with that of the best centralized policy in hindsight, as captured by the notion of regret. We develop a distributed variant of the online LQ algorithm, which runs distributed online gradient descent with projection onto a semi-definite program (SDP) to generate controllers. We establish a regret bound scaling as the square root of the finite time horizon, implying that agents reach consensus as time grows. We further provide numerical experiments verifying our theoretical result.
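The algorithmic core described above (distributed online gradient descent with a projection step) can be sketched as follows. This is a simplified illustration, not the paper's exact method: the full SDP projection is replaced here by a plain projection onto the PSD cone, and the mixing matrix `W`, the gradient oracle, and all parameter choices are hypothetical stand-ins.

```python
import numpy as np

def psd_projection(M):
    # Project a matrix onto the PSD cone via eigendecomposition.
    # (A simplified stand-in for the paper's SDP projection.)
    S = (M + M.T) / 2  # symmetrize first
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.clip(w, 0, None)) @ V.T

def distributed_online_gd(grad, W, T, d, eta=0.05):
    """Hypothetical distributed online projected gradient descent.

    grad : callable (agent i, round t, iterate X) -> gradient matrix
    W    : doubly stochastic mixing matrix of the agent network
    T    : number of rounds; d : dimension of each iterate
    """
    n = W.shape[0]
    X = [np.eye(d) for _ in range(n)]
    for t in range(T):
        G = [grad(i, t, X[i]) for i in range(n)]
        # Consensus step: average neighbors' iterates per W,
        # then take a gradient step and project back to the PSD cone.
        X = [psd_projection(sum(W[i, j] * X[j] for j in range(n)) - eta * G[i])
             for i in range(n)]
    return X
```

With a connected network and suitably decaying step sizes, the mixing step drives the agents' iterates toward consensus while the gradient and projection steps control the online cost, mirroring the regret and consensus behavior stated in the abstract.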