Finding the optimal human strategy for Wordle using maximum correct letter probabilities and reinforcement learning

Paper title

Finding the optimal human strategy for Wordle using maximum correct letter probabilities and reinforcement learning

Authors

Anderson, Benton J., Meyer, Jesse G.

Abstract

Wordle is an online word puzzle game that gained viral popularity in January 2022. The goal is to guess a hidden five-letter word. After each guess, the player gains information about whether the letters they guessed are present in the word, and whether they are in the correct position. Numerous blogs have suggested guessing strategies and starting word lists that improve the chance of winning. Optimized algorithms can win 100% of games within five of the six allowed trials. However, it is infeasible for human players to use these algorithms due to an inability to perfectly recall all known five-letter words and perform complex calculations that optimize information gain. Here, we present two different methods for choosing starting words along with a framework for discovering the optimal human strategy based on reinforcement learning. Human Wordle players can use the rules we discover to optimize their chance of winning.
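
The abstract does not spell out how starting words are scored, so the following is only a minimal sketch of one plausible reading of "maximum correct letter probabilities": rank each candidate by the expected number of letters it places in the correct position, assuming the hidden word is drawn uniformly from a known list. The function names and the five-word toy list are hypothetical and are not taken from the paper.

```python
from collections import Counter

WORD_LENGTH = 5


def positional_letter_probs(words):
    """For each position, map letter -> probability that the hidden word
    has that letter there (hidden word assumed uniform over `words`)."""
    n = len(words)
    return [
        {letter: count / n
         for letter, count in Counter(w[i] for w in words).items()}
        for i in range(WORD_LENGTH)
    ]


def expected_correct_letters(guess, probs):
    """Expected number of letters of `guess` in the correct position.
    By linearity of expectation this is the sum of per-position probabilities."""
    return sum(probs[i].get(ch, 0.0) for i, ch in enumerate(guess))


def best_starting_word(words):
    """Pick the word with the highest expected count of correctly placed letters."""
    probs = positional_letter_probs(words)
    return max(words, key=lambda w: expected_correct_letters(w, probs))


# Hypothetical toy list; a real analysis would use the full answer list.
WORDS = ["crane", "slate", "brine", "shine", "stone"]

if __name__ == "__main__":
    print(best_starting_word(WORDS))
```

This scoring counts only letters in the right position and ignores letters that are present elsewhere in the word, so it is a simplification of the feedback described in the abstract, and it is unrelated to the reinforcement learning framework the authors propose.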
