Disturbance Decoupling for Gradient-based Multi-Agent Learning with Quadratic Costs

Paper Title

Disturbance Decoupling for Gradient-based Multi-Agent Learning with Quadratic Costs

Authors

Li, Sarah H. Q., Ratliff, Lillian, Açıkmeşe, Behçet

Abstract

Motivated by applications of multi-agent learning in noisy environments, this paper studies the robustness of gradient-based learning dynamics with respect to disturbances. While disturbances injected along a coordinate corresponding to any individual player's actions can always affect the overall learning dynamics, a subset of players can be disturbance decoupled; that is, such players' actions are completely unaffected by the injected disturbance. We provide necessary and sufficient conditions to guarantee this property for games with quadratic cost functions, which encompass quadratic one-shot continuous games, finite-horizon linear quadratic (LQ) dynamic games, and bilinear games. Specifically, disturbance decoupling is characterized by both algebraic and graph-theoretic conditions on the learning dynamics; the latter are obtained by constructing a game graph based on gradients of players' costs. For LQ games, we show that disturbance decoupling imposes constraints on the controllable and unobservable subspaces of players. For two-player bilinear games, we show that disturbance decoupling within a player's action coordinates imposes constraints on the payoff matrices. Illustrative numerical examples are provided.
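The notion of a disturbance-decoupled player can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration and is not taken from the paper: the three-player scalar quadratic game, the matrix `W` whose row `i` holds the coefficients of player `i`'s cost gradient with respect to its own action, the step size `gamma`, and the helper names `simulate` and `reachable`. The graph check is a simplified sufficient condition (no directed path from the disturbed player to the player of interest), not the paper's exact characterization.

```python
import numpy as np

def simulate(W, x0, gamma, disturbance, steps):
    """Run x_{k+1} = x_k - gamma * W x_k + d_k and return the trajectory."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for k in range(steps):
        x = x - gamma * (W @ x) + disturbance[k]
        traj.append(x.copy())
    return np.array(traj)

def reachable(W, source):
    """Players reachable from `source` in the graph with an edge j -> i
    whenever player i's gradient depends on player j's action (W[i, j] != 0)."""
    n = W.shape[0]
    seen = {source}
    stack = [source]
    while stack:
        j = stack.pop()
        for i in range(n):
            if W[i, j] != 0 and i not in seen:
                seen.add(i)
                stack.append(i)
    return seen

# Hypothetical game: player 0's gradient depends only on its own action,
# player 1's on all three actions, player 2's on the actions of players 1 and 2.
W = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.4, 1.0]])

steps, gamma = 50, 0.1
x0 = [1.0, -1.0, 0.5]

# Constant disturbance injected only along player 2's action coordinate.
d = np.zeros((steps, 3))
d[:, 2] = 0.1

nominal = simulate(W, x0, gamma, np.zeros((steps, 3)), steps)
disturbed = simulate(W, x0, gamma, d, steps)

affected = reachable(W, source=2)
print("players reachable from player 2:", sorted(affected))
print("player 0 unaffected:", np.allclose(nominal[:, 0], disturbed[:, 0]))
print("player 1 unaffected:", np.allclose(nominal[:, 1], disturbed[:, 1]))
```

In this toy game no path leads from player 2 to player 0, so player 0's trajectory is identical with and without the disturbance, while player 1's trajectory changes.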
