Paper Title
How To Avoid Being Eaten By a Grue: Exploration Strategies for Text-Adventure Agents
Paper Authors
Paper Abstract
Text-based games, in which an agent interacts with the world through textual natural language, present the problem of combinatorially sized action spaces. Most current reinforcement learning algorithms cannot effectively handle such a large number of possible actions per turn. The resulting poor sample efficiency produces agents that cannot pass bottleneck states: they do not see the right action sequence often enough for it to be sufficiently reinforced. Building on prior work using knowledge graphs in reinforcement learning, we introduce two new game state exploration strategies. We compare our exploration strategies against strong baselines on the classic text-adventure game Zork1, where prior agents have been unable to get past a bottleneck at which the agent is eaten by a Grue.