Paper Title


Relate to Predict: Towards Task-Independent Knowledge Representations for Reinforcement Learning

Paper Authors

Thomas Schnürer, Malte Probst, Horst-Michael Gross

Paper Abstract


Reinforcement Learning (RL) can enable agents to learn complex tasks. However, it is difficult to interpret the learned knowledge and reuse it across tasks. Inductive biases can address such issues by explicitly providing a generic yet useful decomposition that is otherwise difficult or expensive to learn implicitly. For example, object-centered approaches decompose a high-dimensional observation into individual objects. Expanding on this, we utilize an inductive bias for explicit object-centered knowledge separation that provides further decomposition into semantic representations and dynamics knowledge. For this, we introduce a semantic module that predicts an object's semantic state based on its context. The resulting affordance-like object state can then be used to enrich perceptual object representations. With a minimal setup and an environment that enables puzzle-like tasks, we demonstrate the feasibility and benefits of this approach. Specifically, we compare three different methods of integrating semantic representations into a model-based RL architecture. Our experiments show that the degree of explicitness in knowledge separation correlates with faster learning, better accuracy, better generalization, and better interpretability.
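To make the abstract's key idea concrete, here is a minimal sketch of what a "semantic module" could look like: a small network that maps an object's perceptual features plus its context to an affordance-like semantic state, which is then concatenated onto the perceptual representation. All names, dimensions, and the MLP form are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def semantic_module(object_features, context_features, W1, W2):
    """Hypothetical semantic module: predicts an affordance-like
    semantic state for one object from the object's own features
    plus aggregated features of its context (e.g. nearby objects)."""
    x = np.concatenate([object_features, context_features])
    h = np.tanh(W1 @ x)                      # tiny one-hidden-layer MLP
    return 1.0 / (1.0 + np.exp(-(W2 @ h)))   # semantic state in (0, 1)

def enrich(object_features, semantic_state):
    """Enrich the perceptual object representation with the predicted
    semantic state by simple concatenation."""
    return np.concatenate([object_features, semantic_state])

# Toy dimensions, purely illustrative.
obj = rng.normal(size=8)          # perceptual features of one object
ctx = rng.normal(size=8)          # aggregated context features
W1 = rng.normal(size=(16, 16))    # hidden layer weights
W2 = rng.normal(size=(4, 16))     # output layer weights (4-dim semantic state)

sem = semantic_module(obj, ctx, W1, W2)
rich = enrich(obj, sem)           # enriched representation, 8 + 4 = 12 dims
```

The enriched representation `rich` would then be the input to a downstream dynamics or policy model; the three integration variants compared in the paper differ in how explicitly this semantic state is separated from the rest of the pipeline.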
