Paper Title
Chaos, Extremism and Optimism: Volume Analysis of Learning in Games
Paper Authors
Paper Abstract
We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games. Such analyses provide new insights into these game dynamical systems, insights that seem hard to obtain via the classical techniques of Computer Science and Machine Learning. The first step is to examine these dynamics not in their original space (the simplex of actions) but in a dual space (the aggregate payoff space of actions). The second step is to explore how the volume of a set of initial conditions evolves over time when it is pushed forward according to the algorithm. This is reminiscent of approaches in Evolutionary Game Theory, where replicator dynamics, the continuous-time analogue of MWU, is known to preserve volume in all games. Interestingly, when we examine discrete-time dynamics, both the choice of the game and the choice of the algorithm play a critical role. Whereas MWU expands volume in zero-sum games and is thus Lyapunov chaotic, we show that OMWU contracts volume, providing an alternative understanding of its known convergent behavior. However, we also prove a no-free-lunch type of theorem, in the sense that when examining coordination games the roles are reversed: OMWU expands volume exponentially fast, whereas MWU contracts it. Using these tools, we prove two novel, rather negative properties of MWU in zero-sum games: (1) Extremism: even in games with a unique fully mixed Nash equilibrium, the system recurrently gets stuck near pure-strategy profiles, despite them being clearly unstable from a game-theoretic perspective. (2) Unavoidability: given any set of good points (with your own interpretation of "good"), the system cannot avoid bad points indefinitely.
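The contrast between MWU and OMWU in zero-sum games can be illustrated with a minimal sketch. It is not the paper's volume analysis. It only iterates both updates in the cumulative-payoff (dual) space and tracks how far the strategies are from equilibrium. The game (Matching Pennies), the step size `eps`, the starting strategies, and all function names are assumptions chosen for illustration. The OMWU rule used here is the common "twice the current payoff minus the previous payoff" form, which may differ in detail from the variant analyzed in the paper.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log of the softmax map from payoff space to the simplex."""
    z = z - np.max(z)
    return z - np.log(np.sum(np.exp(z)))

def kl_from_uniform(log_x):
    """KL divergence from the uniform strategy to the strategy with log-probabilities log_x."""
    return -np.log(log_x.size) - np.mean(log_x)

def run(A, p0, q0, eps, steps, optimistic):
    """Iterate MWU (or OMWU) on cumulative payoffs for the zero-sum game (A, -A).

    p and q are the dual (cumulative payoff) vectors; the mixed strategies are
    their softmax images. Returns KL(x*||x_t) + KL(y*||y_t) for every t, where
    x* = y* = uniform (the equilibrium of Matching Pennies; an assumption here).
    """
    p, q = np.array(p0, dtype=float), np.array(q0, dtype=float)
    prev_u = prev_v = None
    history = []
    for _ in range(steps + 1):
        log_x, log_y = log_softmax(p), log_softmax(q)
        history.append(kl_from_uniform(log_x) + kl_from_uniform(log_y))
        x, y = np.exp(log_x), np.exp(log_y)
        u = A @ y            # payoff vector of player 1
        v = -(A.T @ x)       # payoff vector of player 2 (zero-sum)
        if prev_u is None:   # first round: no previous payoff, so act like MWU
            prev_u, prev_v = u, v
        if optimistic:
            p = p + eps * (2 * u - prev_u)
            q = q + eps * (2 * v - prev_v)
        else:
            p = p + eps * u
            q = q + eps * v
        prev_u, prev_v = u, v
    return history

# Matching Pennies; its unique Nash equilibrium is fully mixed (uniform play).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
p0 = np.log([0.6, 0.4])   # illustrative starting strategies
q0 = np.log([0.7, 0.3])

mwu = run(A, p0, q0, eps=0.1, steps=1000, optimistic=False)
omwu = run(A, p0, q0, eps=0.1, steps=1000, optimistic=True)
print("MWU  divergence from equilibrium: start", mwu[0], "end", mwu[-1])
print("OMWU divergence from equilibrium: start", omwu[0], "end", omwu[-1])
```

Working with cumulative payoffs and `log_softmax` avoids underflow when MWU drifts toward the boundary of the simplex, which is where the abstract's "extremism" property says it spends its time.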