Title

Deep active inference agents using Monte-Carlo methods

Authors

Zafeirios Fountas, Noor Sajid, Pedro A. M. Mediano, Karl Friston

Abstract

Active inference is a Bayesian framework for understanding biological intelligence. The underlying theory brings together perception and action under a single imperative: minimizing free energy. However, despite its theoretical utility in explaining intelligence, computational implementations have been restricted to low-dimensional and idealized situations. In this paper, we present a neural architecture for building deep active inference agents operating in complex, continuous state-spaces using multiple forms of Monte-Carlo (MC) sampling. To this end, we introduce several techniques novel to active inference. These include: i) selecting free-energy-optimal policies via MC tree search, ii) approximating this optimal policy distribution via a feed-forward 'habitual' network, iii) predicting future parameter belief updates using MC dropout and, finally, iv) optimizing state-transition precision (a high-end form of attention). Our approach enables agents to learn environmental dynamics efficiently, while maintaining task performance relative to reward-based counterparts. We illustrate this in a new toy environment, based on the dSprites dataset, and demonstrate that active inference agents automatically create disentangled representations that are apt for modeling state transitions. In a more complex Animal-AI environment, our agents (using the same neural architecture) are able to simulate future state transitions and actions (i.e., plan) to evince reward-directed navigation, despite temporary suspension of visual input. These results show that deep active inference, equipped with MC methods, provides a flexible framework to develop biologically-inspired intelligent agents, with applications in both machine learning and cognitive science.
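One of the techniques listed above, MC dropout (iii), estimates a network's parameter (epistemic) uncertainty by keeping dropout active at prediction time and aggregating many stochastic forward passes. The following is a minimal NumPy sketch of that general idea, not the paper's actual architecture; the toy network, its sizes, and the dropout rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny network: 4 inputs -> 16 hidden units -> 1 output.
# The weights stand in for a trained model; here they are just random.
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 1))

def mc_forward(x, p_drop=0.5):
    """One stochastic forward pass with dropout kept ON (MC dropout)."""
    h = np.maximum(x @ W1, 0.0)              # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop      # fresh dropout mask per pass
    h = h * mask / (1.0 - p_drop)            # inverted-dropout scaling
    return h @ W2

x = rng.normal(size=(1, 4))

# Draw many stochastic predictions for the same input.
samples = np.stack([mc_forward(x) for _ in range(100)])

# The sample mean approximates the predictive mean; the sample variance
# approximates the model's epistemic uncertainty about this input.
pred_mean = samples.mean(axis=0)
pred_var = samples.var(axis=0)
```

In the paper's setting, this kind of sampled spread over network outputs is what lets the agent anticipate how its parameter beliefs would change under future observations, rather than committing to a single point prediction.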
