Paper Title


Learning optimal policies in potential Mean Field Games: Smoothed Policy Iteration algorithms

Authors

Qing Tang, Jiahao Song

Abstract


We introduce two Smoothed Policy Iteration algorithms (\textbf{SPI}s) as rules for learning policies and as methods for computing Nash equilibria in second order potential Mean Field Games (MFGs). Global convergence is proved when the coupling term in the MFG system satisfies the Lasry-Lions monotonicity condition. Local convergence to a stable solution is proved for systems that may have multiple solutions. The convergence analysis shows close connections between \textbf{SPI}s and the Fictitious Play algorithm, which has been widely studied in the MFG literature. Numerical simulation results based on finite difference schemes are presented to supplement the theoretical analysis.
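To illustrate the fictitious-play-style averaging that the abstract connects to the \textbf{SPI} schemes, here is a minimal sketch on a hypothetical toy problem (not the paper's second order MFG): a two-action congestion game whose coupling cost $c_i(m) = b_i + m_i$ is monotone in the Lasry-Lions sense, where the population distribution is smoothed against the current best response with a harmonic learning rate.

```python
import numpy as np

# Toy illustration (assumption, not the paper's algorithm): two actions with
# base costs b_i, congestion cost c_i(m) = b_i + m_i, which is a monotone
# (Lasry-Lions type) coupling in this discrete setting.
base = np.array([0.2, 0.4])

def best_response(m):
    """Put all mass on the currently cheapest action (pure best response)."""
    br = np.zeros_like(m)
    br[np.argmin(base + m)] = 1.0
    return br

m = np.array([0.5, 0.5])            # initial population distribution
for k in range(1, 5001):
    alpha = 1.0 / (k + 1)           # fictitious-play style averaging weight
    m = (1 - alpha) * m + alpha * best_response(m)

# At equilibrium the costs equalize: b_0 + m_0 = b_1 + m_1, i.e. m = (0.6, 0.4).
print(np.round(m, 3))
```

The smoothing step `m = (1 - alpha) * m + alpha * best_response(m)` is the averaging rule that fictitious play shares with smoothed iteration schemes; under the monotone coupling above, the iterates settle at the distribution where both actions have equal cost.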
