Paper Title

Auto-Encoding Adversarial Imitation Learning

Paper Authors

Kaifeng Zhang, Rui Zhao, Ziming Zhang, Yang Gao

Paper Abstract

Reinforcement learning (RL) provides a powerful framework for decision-making, but its application in practice often requires a carefully designed reward function. Adversarial Imitation Learning (AIL) enables automatic policy acquisition without access to a reward signal from the environment. In this work, we propose Auto-Encoding Adversarial Imitation Learning (AEAIL), a robust and scalable AIL framework. To induce expert policies from demonstrations, AEAIL uses the reconstruction error of an auto-encoder as the reward signal, which provides more information for optimizing policies than prior discriminator-based rewards. We then use the derived objective functions to train the auto-encoder and the agent policy. Experiments show that AEAIL outperforms state-of-the-art methods on both state-based and image-based environments. More importantly, AEAIL exhibits much better robustness when the expert demonstrations are noisy.
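
To make the core mechanism concrete, below is a minimal PyTorch sketch of the idea described in the abstract: an auto-encoder's reconstruction error serves both as the adversarial training signal and as the policy reward. This is an illustrative sketch, not the authors' implementation; the network sizes, the squared-error metric, and the reward sign convention are all assumptions.

```python
import torch
import torch.nn as nn


class AutoEncoder(nn.Module):
    """MLP auto-encoder over (state, action) vectors -- illustrative only."""

    def __init__(self, input_dim: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def reconstruction_error(ae: AutoEncoder, x: torch.Tensor) -> torch.Tensor:
    """Per-sample squared reconstruction error (assumed error metric)."""
    return ((ae(x) - x) ** 2).mean(dim=-1)


def reward(ae: AutoEncoder, x: torch.Tensor) -> torch.Tensor:
    """Reward for the policy: agent samples the auto-encoder reconstructs
    well (i.e., that look expert-like) score higher. The negation is an
    assumed sign convention, not taken from the paper."""
    with torch.no_grad():
        return -reconstruction_error(ae, x)


def auto_encoder_loss(ae: AutoEncoder,
                      expert_batch: torch.Tensor,
                      agent_batch: torch.Tensor) -> torch.Tensor:
    """Adversarial objective for the auto-encoder (discriminator role):
    drive reconstruction error down on expert data and up on agent data."""
    return (reconstruction_error(ae, expert_batch).mean()
            - reconstruction_error(ae, agent_batch).mean())
```

In a full training loop, the auto-encoder update would alternate with a standard RL update (e.g., PPO) that maximizes this reward on agent rollouts, mirroring the discriminator/policy alternation in GAIL-style methods.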
