Spatial State-Action Features for General Games

Paper title

Spatial State-Action Features for General Games

Authors

Soemers, Dennis J. N. J., Piette, Éric, Stephenson, Matthew, Browne, Cameron

Abstract

In many board games and other abstract games, patterns have been used as features that can guide automated game-playing agents. Such patterns or features often represent particular configurations of pieces, empty positions, etc., which may be relevant for a game's strategies. Their use has been particularly prevalent in the game of Go, but also in many other games used as benchmarks for AI research. In this paper, we formulate a design and efficient implementation of spatial state-action features for general games. These are patterns that can be trained to incentivise or disincentivise actions based on whether or not they match variables of the state in a local area around action variables. We provide extensive details on several design and implementation choices, with a primary focus on achieving a high degree of generality to support a wide variety of different games using different board geometries or other graphs. Secondly, we propose an efficient approach for evaluating active features for any given set of features. In this approach, we take inspiration from heuristics used in problems such as SAT to optimise the order in which parts of patterns are matched and prune unnecessary evaluations. This approach is defined for a highly general and abstract description of the problem -- phrased as optimising the order in which propositions of formulas in disjunctive normal form are evaluated -- and may therefore also be of interest for problems other than board games. An empirical evaluation on 33 distinct games in the Ludii general game system demonstrates the efficiency of this approach in comparison to a naive baseline, as well as a baseline based on prefix trees, and demonstrates that the additional efficiency significantly improves the playing strength of agents using the features to guide search.
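
The abstract formulation mentioned above (choosing the order in which propositions of conjunctions are tested, and pruning conjunctions that can no longer hold) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the "most frequent proposition first" rule is an assumed stand-in heuristic, and all names (`active_features`, `test`, the proposition strings) are hypothetical. Each feature is modelled as one conjunction of literals; the function reports which features are active and how many proposition tests were needed.

```python
from collections import Counter
from typing import Callable, Dict, FrozenSet, List, Tuple

# A literal is (proposition_name, required_truth_value).
Literal = Tuple[str, bool]
Conjunction = FrozenSet[Literal]


def active_features(
    features: Dict[str, Conjunction],
    test: Callable[[str], bool],
) -> Tuple[List[str], int]:
    """Return (sorted names of satisfied conjunctions, number of proposition tests)."""
    remaining = {name: set(lits) for name, lits in features.items()}
    # A conjunction with no literals is trivially satisfied.
    active = [name for name, lits in remaining.items() if not lits]
    for name in active:
        del remaining[name]

    num_tests = 0
    while remaining:
        # Assumed heuristic: test the proposition that occurs in the most
        # unresolved conjunctions (ties broken alphabetically).
        counts = Counter(prop for lits in remaining.values() for prop, _ in lits)
        prop = min(counts, key=lambda p: (-counts[p], p))
        value = test(prop)  # each proposition is tested at most once
        num_tests += 1

        for name in list(remaining):
            lits = remaining[name]
            if (prop, not value) in lits:
                # Contradicted literal: prune the whole conjunction.
                del remaining[name]
                continue
            lits.discard((prop, value))
            if not lits:
                # Every literal has been confirmed: the feature is active.
                active.append(name)
                del remaining[name]
    return sorted(active), num_tests
```

Calling `active_features(features, state.__getitem__)` with a dictionary `state` mapping proposition names to booleans evaluates all features against that state. Because shared propositions are tested once and contradicted conjunctions are dropped early, the number of tests can be lower than checking every literal of every feature independently.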
