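Combined Model for Partially-Observable and Non-Observable Task Switching: Solving Hierarchical Reinforcement Learning Problems Statically and Dynamically with Transfer Learning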

Paper title

Combined Model for Partially-Observable and Non-Observable Task Switching: Solving Hierarchical Reinforcement Learning Problems Statically and Dynamically with Transfer Learning

Authors

Khan, Nibraas; Phillips, Joshua

Abstract

An integral function of fully autonomous robots and humans is the ability to focus attention on a few relevant percepts to reach a certain goal while disregarding irrelevant percepts. Humans and animals rely on the interactions between the Prefrontal Cortex (PFC) and the Basal Ganglia (BG) to achieve this focus, which is called Working Memory (WM). The Working Memory Toolkit (WMtk) was developed based on a computational neuroscience model of this phenomenon with Temporal Difference (TD) Learning for autonomous systems. Recent adaptations of the toolkit either use Abstract Task Representations (ATRs) to solve Non-Observable (NO) tasks or store past input features to solve Partially-Observable (PO) tasks, but not both. We propose a new model, PONOWMtk, which combines both approaches, ATRs and input storage, with a static or dynamic number of ATRs. The results of our experiments show that PONOWMtk performs effectively on tasks that exhibit PO, NO, or both properties.
