Paper Title
What can I do here? A Theory of Affordances in Reinforcement Learning
Paper Authors
Paper Abstract
Reinforcement learning algorithms usually assume that all actions are always available to an agent. However, both people and animals understand the general link between the features of their environment and the actions that are feasible. Gibson (1977) coined the term "affordances" to describe the fact that certain states enable an agent to do certain actions, in the context of embodied agents. In this paper, we develop a theory of affordances for agents who learn and plan in Markov Decision Processes. Affordances play a dual role in this case. On one hand, they allow faster planning, by reducing the number of actions available in any given situation. On the other hand, they facilitate more efficient and precise learning of transition models from data, especially when such models require function approximation. We establish these properties through theoretical results as well as illustrative examples. We also propose an approach to learn affordances and use it to estimate transition models that are simpler and generalize better.
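To make the planning role of affordances concrete, below is a minimal sketch (not the paper's algorithm) of value iteration on a toy tabular MDP in which the Bellman maximization at each state runs only over actions marked as afforded by a mask `afford[s, a]`. The MDP, the hand-coded mask, and all numerical values are illustrative assumptions, chosen only to show how restricting the action set shrinks the per-state maximization.

```python
import numpy as np

# Toy tabular MDP: 5 states in a line, actions {0: left, 1: right, 2: stay}.
n_states, n_actions = 5, 3
gamma = 0.9

# Transition tensor P[s, a, s'] and reward matrix R[s, a] (illustrative values).
P = np.zeros((n_states, n_actions, n_states))
R = np.zeros((n_states, n_actions))
for s in range(n_states):
    P[s, 0, max(s - 1, 0)] = 1.0               # action 0: move left
    P[s, 1, min(s + 1, n_states - 1)] = 1.0    # action 1: move right
    P[s, 2, s] = 1.0                           # action 2: stay
R[:, 1] = np.linspace(0.0, 1.0, n_states)      # moving right pays more near the goal

# Hypothetical affordance mask: afford[s, a] is True if action a is available in state s.
# Here, "left" is not afforded at the left wall and "right" is not afforded at the right wall.
afford = np.ones((n_states, n_actions), dtype=bool)
afford[0, 0] = False
afford[-1, 1] = False

def value_iteration(P, R, mask, gamma=0.9, iters=200):
    """Value iteration whose max at each state ranges only over afforded actions."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)          # Q[s, a] from the one-step lookahead
        Q = np.where(mask, Q, -np.inf)   # exclude non-afforded actions from the max
        V = Q.max(axis=1)
    return V

V_afforded = value_iteration(P, R, afford)
V_full = value_iteration(P, R, np.ones_like(afford))
print("with affordances:", V_afforded)
print("all actions:     ", V_full)
```

In the restricted backup, each state only evaluates its afforded actions, which is the source of the planning speed-up the abstract describes; in this toy example the excluded actions are never optimal, so the resulting values coincide with those of the unrestricted planner.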