Paper Title
Exploration with Intrinsic Motivation using Object-Action-Outcome Latent Space
Paper Authors
Paper Abstract
One effective approach for equipping artificial agents with sensorimotor skills is self-exploration. Doing this efficiently is critical, as time and data collection are costly. In this study, we propose an exploration mechanism that blends action, object, and action-outcome representations into a single latent space, where local regions are formed to host forward model learning. The agent uses intrinsic motivation to select, at each exploration step, the forward model with the highest learning progress. This parallels how infants learn, as high learning progress indicates that the learning problem in the selected region is neither too easy nor too difficult. The proposed approach is validated with a simulated robot in a table-top environment. The simulation scene comprises a robot and various objects; the robot interacts with one object at a time using a set of parameterized actions and learns the outcomes of these interactions. With the proposed approach, the robot organizes its learning curriculum as in existing intrinsic motivation approaches and outperforms them in learning speed. Moreover, the learning regime demonstrates features that partially match infant development; in particular, the proposed system learns to predict the outcomes of different skills in a staged manner.
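
To make the selection mechanism concrete, the sketch below illustrates learning-progress-based region selection as described in the abstract. It is a minimal, illustrative reading under simple assumptions, not the authors' implementation: the names `ForwardModel`, `Region`, `exploration_step`, `sample_interaction`, and `observe_outcome` are hypothetical, and a toy running-mean predictor stands in for the paper's learned forward models.

```python
import numpy as np

class ForwardModel:
    """Toy stand-in for a learned forward model: predicts the outcome as a
    running mean of outcomes observed so far."""
    def __init__(self, outcome_dim):
        self.mean = np.zeros(outcome_dim)
        self.count = 0

    def predict(self, action, obj_state):
        return self.mean

    def fit(self, action, obj_state, outcome):
        self.count += 1
        self.mean += (outcome - self.mean) / self.count


class Region:
    """A local region of the latent space, hosting its own forward model
    and a sliding window of its recent prediction errors."""
    def __init__(self, model, window=20):
        self.model = model
        self.errors = []
        self.window = window

    def learning_progress(self):
        # Learning progress = drop in mean prediction error between the older
        # and newer halves of the sliding window (higher = learning faster).
        if len(self.errors) < self.window:
            return float("inf")  # prioritize barely-sampled regions
        recent = self.errors[-self.window:]
        half = self.window // 2
        return float(np.mean(recent[:half]) - np.mean(recent[half:]))

    def update(self, action, obj_state, outcome):
        pred = self.model.predict(action, obj_state)
        self.errors.append(float(np.linalg.norm(pred - outcome)))
        self.model.fit(action, obj_state, outcome)


def exploration_step(regions, sample_interaction, observe_outcome):
    """One intrinsically motivated exploration step: pick the region whose
    forward model currently shows the highest learning progress, interact
    there, and update that region's model with the observed outcome."""
    region = max(regions, key=lambda r: r.learning_progress())
    action, obj_state = sample_interaction(region)
    outcome = observe_outcome(action, obj_state)
    region.update(action, obj_state, outcome)
    return region
```

In this reading, regions whose errors are still dropping quickly (neither already mastered nor too hard to improve) are sampled most often, which is one plausible way the staged, curriculum-like progression described in the abstract could emerge.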