Paper Title

Multi-agent Reinforcement Learning for Networked System Control

Paper Authors

Tianshu Chu, Sandeep Chinchali, Sachin Katti

Paper Abstract

This paper considers multi-agent reinforcement learning (MARL) in networked system control. Specifically, each agent learns a decentralized control policy based on local observations and messages from connected neighbors. We formulate such a networked MARL (NMARL) problem as a spatiotemporal Markov decision process and introduce a spatial discount factor to stabilize the training of each local agent. Further, we propose a new differentiable communication protocol, called NeurComm, to reduce information loss and non-stationarity in NMARL. Based on experiments in realistic NMARL scenarios of adaptive traffic signal control and cooperative adaptive cruise control, an appropriate spatial discount factor effectively enhances the learning curves of non-communicative MARL algorithms, while NeurComm outperforms existing communication protocols in both learning efficiency and control performance.
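
To make the spatial discount factor from the abstract concrete, below is a minimal Python sketch, assuming the factor alpha weights agent j's reward by alpha^{d(i,j)}, where d(i,j) is the graph distance between agents i and j (so alpha = 0 recovers purely local rewards and alpha = 1 recovers a fully global objective). The function name and the toy chain network are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def spatially_discounted_rewards(rewards, dist, alpha):
    """Blend per-agent rewards over the network (hypothetical helper).

    rewards: shape (n_agents,), local rewards r_j
    dist:    shape (n_agents, n_agents), graph distances d(i, j)
    alpha:   spatial discount factor in [0, 1]
    """
    weights = alpha ** dist      # weight matrix with entries alpha^{d(i, j)}
    return weights @ rewards     # r_tilde_i = sum_j alpha^{d(i, j)} * r_j

# Toy example: a 4-agent chain network 0 - 1 - 2 - 3.
dist = np.array([[0, 1, 2, 3],
                 [1, 0, 1, 2],
                 [2, 1, 0, 1],
                 [3, 2, 1, 0]])
rewards = np.array([1.0, 0.0, 0.0, -1.0])
print(spatially_discounted_rewards(rewards, dist, alpha=0.5))
```

Shrinking alpha narrows each agent's effective objective to its neighborhood, which is the stabilizing effect the abstract attributes to an appropriate spatial discount factor.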

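The abstract describes NeurComm only at a high level, so as an illustration of what a differentiable, learned communication step can look like, here is a numpy sketch of a single forward pass in which each agent's message is its hidden state and incoming messages are summed over neighbors. All names, shapes, and the sum aggregation are assumptions for illustration; the paper's actual NeurComm architecture may differ (e.g., in what each message contains and how it is encoded). In practice the weights would be neural-network parameters trained end-to-end in an autodiff framework so that gradients flow through the messages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Chain network 0 - 1 - 2 - 3 (adjacency matrix); sizes are arbitrary.
n_agents, obs_dim, hid_dim = 4, 8, 16
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])

# Hypothetical learnable weights: one encoder for the local observation,
# one for the aggregated neighbor messages.
W_obs = 0.1 * rng.normal(size=(obs_dim, hid_dim))
W_msg = 0.1 * rng.normal(size=(hid_dim, hid_dim))

def communicate(obs, hidden):
    """One round: each agent broadcasts its hidden state, then updates."""
    messages = adj @ hidden                       # sum of neighbors' hidden states
    return np.tanh(obs @ W_obs + messages @ W_msg)

obs = rng.normal(size=(n_agents, obs_dim))
hidden = np.zeros((n_agents, hid_dim))
for _ in range(2):                                # more rounds widen the receptive field
    hidden = communicate(obs, hidden)
print(hidden.shape)                               # (4, 16): per-agent state fed to the policy
```

Because the update is a smooth function of the incoming messages, training signals can propagate across agents, which is one way a differentiable protocol can reduce the information loss and non-stationarity the abstract mentions.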