Paper Title
A Limited-Capacity Minimax Theorem for Non-Convex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets
Paper Authors
Paper Abstract
Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique: some of its most notable applications include GAN-based generative modeling and self-play techniques in reinforcement learning, which have been applied to complex games such as Go and Poker. In practice, a \emph{single} pair of networks is typically trained to find an approximate equilibrium of a highly nonconcave-nonconvex adversarial problem. However, while a classic result in game theory states that such an equilibrium exists in concave-convex games, there is no analogous guarantee when the payoff is nonconcave-nonconvex. Our main contribution is to provide an approximate minimax theorem for a large class of games in which the players pick neural networks, including WGAN, StarCraft II, and the Blotto game. Our findings rely on the fact that, despite being nonconcave-nonconvex with respect to the neural network parameters, these games are concave-convex with respect to the actual models (e.g., functions or distributions) represented by those networks.
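To make the concave-convex case concrete, here is a minimal sketch (not taken from the paper) of why equilibria in such games are tractable for first-order dynamics. The payoff f(x, y) = (x - 0.5)(y - 0.5) is a hypothetical bilinear example chosen for illustration: it is convex in the minimizing player's choice x and concave in the maximizing player's choice y, so the classic minimax theorem applies and its unique saddle point is (0.5, 0.5). Plain simultaneous gradient descent-ascent cycles on bilinear payoffs, so the sketch uses the extragradient variant, which is known to converge in this setting.

```python
# Toy illustration (assumed example, not the paper's method): finding the
# saddle point of the concave-convex payoff f(x, y) = (x - 0.5) * (y - 0.5)
# with extragradient descent-ascent. Player x minimizes, player y maximizes.

def extragradient(x, y, lr=0.3, steps=200):
    """Run extragradient descent-ascent on f(x, y) = (x - 0.5)*(y - 0.5)."""
    for _ in range(steps):
        # Gradients of f: df/dx = y - 0.5, df/dy = x - 0.5.
        # 1) Look-ahead half-step from the current point.
        xh = x - lr * (y - 0.5)
        yh = y + lr * (x - 0.5)
        # 2) Actual update, using gradients evaluated at the look-ahead point.
        x = x - lr * (yh - 0.5)
        y = y + lr * (xh - 0.5)
    return x, y

x, y = extragradient(0.9, 0.1)
print(round(x, 3), round(y, 3))  # → 0.5 0.5, the unique equilibrium
```

The point of the sketch is the gap the abstract highlights: once the same game is reparameterized through neural network weights, the payoff becomes nonconcave-nonconvex in those weights and no such convergence or existence guarantee carries over directly.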