On the Global Convergence of Stochastic Fictitious Play in Stochastic Games with Turn-based Controllers

Paper Title

On the Global Convergence of Stochastic Fictitious Play in Stochastic Games with Turn-based Controllers

Author

Sayin, Muhammed O.

Abstract

This paper presents a learning dynamic with an almost sure convergence guarantee for any stochastic game with turn-based controllers (on state transitions), as long as the stage payoffs induce a zero-sum or identical-interest game. The stage payoffs for different states can even have different structures, e.g., summing to zero in some states and being identical in others. The presented dynamics combine classical stochastic fictitious play with value iteration for stochastic games. There are two key properties: (i) players play finite-horizon stochastic games of increasing length within the underlying infinite-horizon stochastic game, and (ii) the turn-based controllers ensure that the auxiliary stage games (induced by the estimated continuation payoffs) are strategically equivalent to zero-sum or identical-interest games.
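
For readers unfamiliar with the first ingredient named in the abstract, here is a minimal sketch of classical stochastic fictitious play in a single zero-sum matrix game. It is not the paper's learning dynamic: it omits the value iteration, the state transitions, and the turn-based controllers. The function names, the logit (softmax) smoothing, the temperature `tau`, the `1/(t+1)` step size, and the matching-pennies payoff matrix are all assumptions chosen for illustration.

```python
import math
import random


def logit_response(payoffs, tau):
    """Smoothed (logit) best response to a vector of expected payoffs."""
    m = max(payoffs)  # subtract the max for numerical stability
    weights = [math.exp((p - m) / tau) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def stochastic_fictitious_play(A, steps=20000, tau=0.5, seed=0):
    """Illustrative stochastic fictitious play in the zero-sum matrix game A.

    A[i][j] is the row player's payoff; the column player receives -A[i][j].
    Returns (belief_about_row, belief_about_col): each player's empirical
    belief about the opponent's mixed strategy.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(A[0])
    belief_about_col = [1.0 / n_cols] * n_cols  # held by the row player
    belief_about_row = [1.0 / n_rows] * n_rows  # held by the column player

    for t in range(1, steps + 1):
        # Expected payoff of each action against the current belief.
        row_payoffs = [
            sum(A[i][j] * belief_about_col[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        col_payoffs = [
            -sum(A[i][j] * belief_about_row[i] for i in range(n_rows))
            for j in range(n_cols)
        ]

        # Each player samples an action from its smoothed best response.
        i = rng.choices(range(n_rows), weights=logit_response(row_payoffs, tau))[0]
        j = rng.choices(range(n_cols), weights=logit_response(col_payoffs, tau))[0]

        # Move each belief toward the opponent's observed action.
        step = 1.0 / (t + 1)
        belief_about_col = [
            (1 - step) * b + step * (1.0 if k == j else 0.0)
            for k, b in enumerate(belief_about_col)
        ]
        belief_about_row = [
            (1 - step) * b + step * (1.0 if k == i else 0.0)
            for k, b in enumerate(belief_about_row)
        ]

    return belief_about_row, belief_about_col


# Hypothetical example game: matching pennies (row player wants to match).
MATCHING_PENNIES = [[1.0, -1.0], [-1.0, 1.0]]
```

Calling `stochastic_fictitious_play(MATCHING_PENNIES)` returns the two belief vectors. In this symmetric game both should settle near the uniform mixed strategy, though the exact values depend on the seed and the number of steps.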
