Paper Title

Sparse Graphical Memory for Robust Planning

Paper Authors

Scott Emmons, Ajay Jain, Michael Laskin, Thanard Kurutach, Pieter Abbeel, Deepak Pathak

Paper Abstract

To operate effectively in the real world, agents should be able to act from high-dimensional raw sensory input such as images and achieve diverse goals across long time-horizons. Current deep reinforcement and imitation learning methods can learn directly from high-dimensional inputs but do not scale well to long-horizon tasks. In contrast, classical graphical methods like A* search are able to solve long-horizon tasks, but assume that the state space is abstracted away from raw sensory input. Recent works have attempted to combine the strengths of deep learning and classical planning; however, dominant methods in this domain are still quite brittle and scale poorly with the size of the environment. We introduce Sparse Graphical Memory (SGM), a new data structure that stores states and feasible transitions in a sparse memory. SGM aggregates states according to a novel two-way consistency objective, adapting classic state aggregation criteria to goal-conditioned RL: two states are redundant when they are interchangeable both as goals and as starting states. Theoretically, we prove that merging nodes according to two-way consistency leads to an increase in shortest path lengths that scales only linearly with the merging threshold. Experimentally, we show that SGM significantly outperforms current state-of-the-art methods on long-horizon, sparse-reward visual navigation tasks. Project video and code are available at https://mishalaskin.github.io/sgm/
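The abstract's description of state aggregation via two-way consistency can be made concrete with a short sketch. The snippet below is a minimal, illustrative implementation assuming a learned goal-conditioned distance estimate `dist(start, goal)` and a merge threshold `tau`; the class and method names are hypothetical and do not reflect the authors' actual code or API.

```python
# Minimal sketch of two-way consistency-based state aggregation, assuming a
# learned goal-conditioned distance estimate dist(start, goal) and a merge
# threshold tau. Names are illustrative, not the authors' implementation.

class SparseGraphicalMemory:
    def __init__(self, dist, tau):
        self.dist = dist    # assumed: dist(s, g) estimates cost from s to goal g
        self.tau = tau      # merging threshold
        self.nodes = []     # stored (aggregated) states

    def _two_way_consistent(self, s1, s2):
        """s1 and s2 are redundant if, with respect to every stored node w,
        they are interchangeable both as starting states (outgoing distances
        agree within tau) and as goals (incoming distances agree within tau)."""
        for w in self.nodes:
            if abs(self.dist(s1, w) - self.dist(s2, w)) > self.tau:
                return False  # not interchangeable as starting states
            if abs(self.dist(w, s1) - self.dist(w, s2)) > self.tau:
                return False  # not interchangeable as goals
        return True

    def insert(self, state):
        """Add a state only if no stored node is already redundant with it."""
        if any(self._two_way_consistent(state, w) for w in self.nodes):
            return  # redundant: keep the memory sparse
        self.nodes.append(state)


if __name__ == "__main__":
    # Toy usage with 1-D states and absolute difference as the "distance".
    memory = SparseGraphicalMemory(dist=lambda s, g: abs(s - g), tau=0.5)
    for s in [0.0, 0.1, 1.0, 1.05, 2.0]:
        memory.insert(s)
    print(memory.nodes)  # nearby states are merged; distant ones are kept
```

Under this rule, a candidate state is dropped when some stored node is interchangeable with it both as a starting state and as a goal, which keeps the memory sparse while, per the abstract's theoretical claim, distorting shortest path lengths only linearly in the merging threshold.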
