Paper Title

QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning

Paper Authors

Kyunghwan Son, Sungsoo Ahn, Roben Delos Reyes, Jinwoo Shin, Yung Yi

Paper Abstract

QTRAN is a multi-agent reinforcement learning (MARL) algorithm capable of learning the largest class of joint-action value functions known to date. However, despite its strong theoretical guarantees, it has shown poor empirical performance in complex environments such as the StarCraft Multi-Agent Challenge (SMAC). In this paper, we identify the performance bottlenecks of QTRAN and propose a substantially improved version, coined QTRAN++. Our gains come from (i) stabilizing the training objective of QTRAN, (ii) removing the strict role separation between the action-value estimators of QTRAN, and (iii) introducing a multi-head mixing network for value transformation. Through extensive evaluation, we confirm that our diagnosis is correct and that QTRAN++ successfully bridges the gap between empirical performance and theoretical guarantees. In particular, QTRAN++ achieves new state-of-the-art performance in the SMAC environment. The code will be released.
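
The abstract's third improvement, a multi-head mixing network for value transformation, can be illustrated with a small sketch. Below is a minimal, hypothetical PyTorch implementation assuming a QMIX-style hypernetwork mixer extended with multiple heads; the class name `MultiHeadMixer`, the head count, and all layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadMixer(nn.Module):
    """Minimal sketch (not the paper's exact architecture): each head
    mixes per-agent utilities into a joint-value component conditioned
    on the global state, and the components are summed into Q_jt."""

    def __init__(self, n_agents, state_dim, n_heads=4, embed_dim=32):
        super().__init__()
        self.n_agents, self.n_heads, self.embed_dim = n_agents, n_heads, embed_dim
        # Hypernetworks generate state-dependent mixing weights, QMIX-style.
        self.hyper_w1 = nn.Linear(state_dim, n_heads * n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, n_heads * embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, n_heads * embed_dim)
        self.hyper_b2 = nn.Linear(state_dim, n_heads)

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        bs = agent_qs.size(0)
        # Non-negative weights keep the mixing monotonic in each agent's utility.
        w1 = torch.abs(self.hyper_w1(state)).view(bs, self.n_heads, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(bs, self.n_heads, 1, self.embed_dim)
        w2 = torch.abs(self.hyper_w2(state)).view(bs, self.n_heads, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(bs, self.n_heads, 1, 1)

        qs = agent_qs.view(bs, 1, 1, self.n_agents)        # broadcast over heads
        hidden = F.elu(torch.matmul(qs, w1) + b1)          # (bs, heads, 1, embed)
        head_out = torch.matmul(hidden, w2) + b2           # (bs, heads, 1, 1)
        return head_out.sum(dim=1).view(bs, 1)             # joint value Q_jt
```

As a usage example, `MultiHeadMixer(n_agents=5, state_dim=48)(agent_qs, state)` on a batch of shape `(8, 5)` utilities and `(8, 48)` states returns an `(8, 1)` joint value; summing per-head outputs lets each head specialize while the absolute-valued weights keep every head monotonic in the agent utilities.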
