ADAIL: Adaptive Adversarial Imitation Learning

Paper title

ADAIL: Adaptive Adversarial Imitation Learning

Authors

Yiren Lu, Jonathan Tompson

Abstract

We present the ADaptive Adversarial Imitation Learning (ADAIL) algorithm for learning adaptive policies that can be transferred between environments of varying dynamics, by imitating a small number of demonstrations collected from a single source domain. This is an important problem in robotic learning because in real-world scenarios 1) reward functions are hard to obtain, 2) policies learned in one domain are difficult to deploy in another due to varying source-to-target domain statistics, and 3) collecting expert demonstrations in multiple environments where the dynamics are known and controlled is often infeasible. We address these constraints by building upon recent advances in adversarial imitation learning: we condition our policy on a learned dynamics embedding, and we employ a domain-adversarial loss to learn a dynamics-invariant discriminator. The effectiveness of our method is demonstrated on simulated control tasks with varying environment dynamics, and the learned adaptive agent outperforms several recent baselines.
