Moody Learners - Explaining Competitive Behaviour of Reinforcement Learning Agents

Paper Title

Moody Learners -- Explaining Competitive Behaviour of Reinforcement Learning Agents

Authors

Pablo Barros, Ana Tanevska, Francisco Cruz, Alessandra Sciutti

Abstract

Designing the decision-making processes of artificial agents that are involved in competitive interactions is a challenging task. In a competitive scenario, the agent not only faces a dynamic environment but is also directly affected by the opponents' actions. Observing an agent's Q-values is a common way of explaining its behavior; however, Q-values do not show the temporal relation between the selected actions. We address this problem by proposing the Moody framework. We evaluate our model by performing a series of experiments using the competitive multiplayer Chef's Hat card game and discuss how our model allows the agents to obtain a holistic representation of the competitive dynamics within the game.
