Paper title

Progressive growing of self-organized hierarchical representations for exploration

Authors

Mayalen Etcheverry, Pierre-Yves Oudeyer, Chris Reinke

Abstract

Designing agents that can autonomously discover and learn a diversity of structures and skills in unknown, changing environments is key for lifelong machine learning. A central challenge is how to learn representations incrementally, so that the agent can progressively build a map of the discovered structures and reuse it to explore further. To address this challenge, we identify and target several key functionalities. First, we aim to build lasting representations and avoid catastrophic forgetting throughout the exploration process. Second, we aim to learn a diversity of representations that allows the discovery of a "diversity of diversity" of structures (and associated skills) in complex high-dimensional environments. Third, we target representations that can structure the agent's discoveries in a coarse-to-fine manner. Finally, we target the reuse of such representations to drive exploration toward an "interesting" type of diversity, for instance by leveraging human guidance. Current approaches in state representation learning generally rely on monolithic architectures, which do not enable all of these functionalities. We therefore present a novel technique that progressively constructs a Hierarchy of Observation Latent Models for Exploration Stratification, called HOLMES. This technique couples a dynamic modular model architecture for representation learning with intrinsically motivated goal exploration processes (IMGEPs). The paper shows results in the domain of automated discovery of diverse self-organized patterns, using the experimental framework of Reinke et al. (2019) as a testbed.
