Paper Title

Selective Particle Attention: Visual Feature-Based Attention in Deep Reinforcement Learning

Authors

Sam Blakeman, Denis Mareschal

Abstract

The human brain uses selective attention to filter perceptual input so that only the components that are useful for behaviour are processed using its limited computational resources. We focus on one particular form of visual attention known as feature-based attention, which is concerned with identifying features of the visual input that are important for the current task regardless of their spatial location. Visual feature-based attention has been proposed to improve the efficiency of Reinforcement Learning (RL) by reducing the dimensionality of state representations and guiding learning towards relevant features. Despite achieving human level performance in complex perceptual-motor tasks, Deep RL algorithms have been consistently criticised for their poor efficiency and lack of flexibility. Visual feature-based attention therefore represents one option for addressing these criticisms. Nevertheless, it is still an open question how the brain is able to learn which features to attend to during RL. To help answer this question we propose a novel algorithm, termed Selective Particle Attention (SPA), which imbues a Deep RL agent with the ability to perform selective feature-based attention. SPA learns which combinations of features to attend to based on their bottom-up saliency and how accurately they predict future reward. We evaluate SPA on a multiple choice task and a 2D video game that both involve raw pixel input and dynamic changes to the task structure. We show various benefits of SPA over approaches that naively attend to either all or random subsets of features. Our results demonstrate (1) how visual feature-based attention in Deep RL models can improve their learning efficiency and ability to deal with sudden changes in task structure and (2) that particle filters may represent a viable computational account of how visual feature-based attention occurs in the brain.
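
The abstract describes SPA only at a high level, so the following is a rough illustrative sketch of the general idea (a particle filter over feature subsets), not the authors' implementation. Everything specific here is an assumption made for illustration: particles are binary masks over features, saliency values are treated directly as sampling probabilities, the likelihood is an exponential of the reward-prediction error with a hypothetical sharpness parameter `beta`, and the function names are invented.

```python
import numpy as np

def propose_particles(saliency, n_particles, rng):
    """Sample binary feature masks; more salient features are attended more often.
    Assumption: saliency values lie in [0, 1] and act as inclusion probabilities."""
    p = np.clip(np.asarray(saliency, dtype=float), 0.0, 1.0)
    return (rng.random((n_particles, p.size)) < p).astype(float)

def resample_particles(particles, errors, rng, beta=5.0):
    """Reweight masks by predictive accuracy, then resample with replacement.
    Assumption: likelihood is proportional to exp(-beta * prediction_error)."""
    log_w = -beta * np.asarray(errors, dtype=float)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

def attention_vector(particles):
    """Average the masks and scale so the most attended feature has weight 1."""
    a = particles.mean(axis=0)
    return a / a.max() if a.max() > 0 else a

# Toy usage: four features, and only feature 2 predicts reward.
rng = np.random.default_rng(0)
saliency = np.full(4, 0.5)  # uniform bottom-up saliency
particles = propose_particles(saliency, n_particles=100, rng=rng)
# Hypothetical error signal: masks that ignore feature 2 predict reward poorly.
errors = 1.0 - particles[:, 2]
particles = resample_particles(particles, errors, rng, beta=50.0)
attention = attention_vector(particles)
```

In this toy setting the surviving particles all include the reward-predictive feature, so it receives the maximum attention weight, while the other features keep whatever weight they had by chance among the survivors. In a Deep RL agent, a vector like `attention` would presumably be used to scale feature maps before the value function is computed; that wiring is not specified in the abstract and is left out here.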
