Paper title
Different Strokes in Randomised Strategies: Revisiting Kuhn's Theorem under Finite-Memory Assumptions
Authors
Abstract
Two-player (antagonistic) games on (possibly stochastic) graphs are a prevalent model in theoretical computer science, notably as a framework for reactive synthesis. Optimal strategies may require randomisation when dealing with inherently probabilistic goals, balancing multiple objectives, or in contexts of partial information. There is no unique way to define randomised strategies. For instance, one can use so-called mixed strategies or behavioural ones. In the most general setting, these two classes do not share the same expressiveness. A seminal result in game theory -- Kuhn's theorem -- asserts their equivalence in games of perfect recall. This result crucially relies on the possibility for strategies to use infinite memory, i.e., unlimited knowledge of all past observations. However, computer systems are finite in practice. Hence, it is pertinent to restrict our attention to finite-memory strategies, defined as automata with outputs. Randomisation can be implemented in these automata in different ways: the initialisation, the outputs and the transitions can each be either randomised or deterministic. Depending on which aspects are randomised, the expressiveness of the corresponding class of finite-memory strategies differs. In this work, we study two-player concurrent stochastic games and provide a complete taxonomy of the classes of finite-memory strategies obtained by varying which of the three aforementioned components are randomised. Our taxonomy holds in games of perfect and imperfect information with perfect recall, and in games with more than two players. We also provide an adapted taxonomy for games with imperfect recall.
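To make the three sources of randomisation concrete, the following Python sketch models a finite-memory strategy as an automaton with outputs whose initialisation, outputs and memory updates are each given as a probability distribution. It is only an illustration under stated assumptions, not the paper's formal definition: the class name `FiniteMemoryStrategy`, the helper functions, the three-letter `D`/`R` labels, and the simplification that memory updates depend only on the current memory and the observed game state are all choices made for this example.

```python
import random
from dataclasses import dataclass


def sample(dist, rng):
    """Draw one outcome from a dict mapping outcomes to probabilities."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def is_dirac(dist):
    """True if the distribution puts probability 1 on a single outcome."""
    return any(abs(p - 1.0) < 1e-12 for p in dist.values())


@dataclass
class FiniteMemoryStrategy:
    # Hypothetical encoding (an assumption of this sketch):
    init: dict    # distribution over initial memory states
    output: dict  # (memory, game_state) -> distribution over actions
    update: dict  # (memory, game_state) -> distribution over next memory states

    def classify(self):
        """Return a three-letter label: D (deterministic) or R (randomised)
        for initialisation, outputs and updates, in that order."""
        flags = (
            not is_dirac(self.init),
            any(not is_dirac(d) for d in self.output.values()),
            any(not is_dirac(d) for d in self.update.values()),
        )
        return "".join("R" if f else "D" for f in flags)

    def play(self, game_states, rng):
        """Choose one action per observed game state, updating memory as we go."""
        memory = sample(self.init, rng)
        actions = []
        for s in game_states:
            actions.append(sample(self.output[(memory, s)], rng))
            memory = sample(self.update[(memory, s)], rng)
        return actions


# Mixed-style strategy: randomise once at the start, then act deterministically.
mixed = FiniteMemoryStrategy(
    init={"m0": 0.5, "m1": 0.5},
    output={("m0", "s"): {"left": 1.0}, ("m1", "s"): {"right": 1.0}},
    update={("m0", "s"): {"m0": 1.0}, ("m1", "s"): {"m1": 1.0}},
)

# Behavioural-style strategy: fixed start, fresh randomisation at every step.
behavioural = FiniteMemoryStrategy(
    init={"m": 1.0},
    output={("m", "s"): {"left": 0.5, "right": 0.5}},
    update={("m", "s"): {"m": 1.0}},
)

print(mixed.classify(), behavioural.classify())
```

In this toy encoding, `mixed` commits to a single action for the whole play once its initial memory state is drawn, whereas `behavioural` may change action from one step to the next. This difference is the kind of distinction the taxonomy compares across classes.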