论文标题
蒙特卡洛树搜索中的lookahead病理
Lookahead Pathology in Monte-Carlo Tree Search
论文作者
论文摘要
蒙特卡洛树搜索(MCT)是一种搜索范式,它首先在计算机域中成功地发现了其成功。早期的理论工作确定了应用于树木(UCT)的上限范围的健全性和收敛范围,这是MCT的最流行的实例化;但是,我们对UCT在实践中的行为的理解仍然存在着显着的差距。在这项工作中,我们通过考虑UCT是否可以在对抗性环境中表现出lookahead病理的问题来解决一个差距,这是一种在Minimax搜索中首先观察到的一种矛盾的现象,更大的搜索工作会导致决策更糟。我们介绍了一个新颖的合成游戏系列,可提供丰富的建模可能性,同时还可以接受数学分析。我们的理论和实验结果表明,UCT确实易受这个家庭中一系列游戏的病理行为的影响。
Monte-Carlo Tree Search (MCTS) is a search paradigm that first found prominence with its success in the domain of computer Go. Early theoretical work established the soundness and convergence bounds for Upper Confidence bounds applied to Trees (UCT), the most popular instantiation of MCTS; however, there remain notable gaps in our understanding of how UCT behaves in practice. In this work, we address one such gap by considering the question of whether UCT can exhibit lookahead pathology in adversarial settings -- a paradoxical phenomenon first observed in Minimax search where greater search effort leads to worse decision-making. We introduce a novel family of synthetic games that offer rich modeling possibilities while remaining amenable to mathematical analysis. Our theoretical and experimental results suggest that UCT is indeed susceptible to pathological behavior in a range of games drawn from this family.