Learning Navigation Costs from Demonstration with Semantic Observations

Paper Title

Learning Navigation Costs from Demonstration with Semantic Observations

Authors

Tianyu Wang, Vikas Dhiman, Nikolay Atanasov

Abstract

This paper focuses on inverse reinforcement learning (IRL) for autonomous robot navigation using semantic observations. The objective is to infer a cost function that explains demonstrated behavior while relying only on the expert's observations and state-control trajectory. We develop a map encoder, which infers semantic class probabilities from the observation sequence, and a cost encoder, defined as a deep neural network over the semantic features. Since the expert cost is not directly observable, the representation parameters can only be optimized by differentiating the error between demonstrated controls and a control policy computed from the cost estimate. The error is optimized using a closed-form subgradient computed only over a subset of promising states via a motion planning algorithm. We show that our approach learns to follow traffic rules in the CARLA autonomous driving simulator by relying on semantic observations of cars, sidewalks, and road lanes.
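The pipeline described in the abstract (semantic features, a cost estimate, a policy computed from that cost, and an error against the demonstrated control) can be illustrated with a small toy example. The sketch below is not the authors' implementation. It assumes a 4-connected grid, hand-specified semantic class probabilities standing in for the map encoder output, a softplus of a linear function standing in for the deep cost encoder, Dijkstra's algorithm as the motion planner, and a softmax policy over negative cost-to-go values. All function names and parameters are hypothetical.

```python
import heapq
import numpy as np

# Four grid controls: up, down, left, right (an illustrative choice).
CONTROLS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def encode_cost(class_probs, theta):
    """Toy cost encoder: softplus of a linear map over semantic class probabilities.
    The paper uses a deep neural network; this linear stand-in is an assumption."""
    return np.log1p(np.exp(class_probs @ theta))  # strictly positive cell costs

def cost_to_go(cost, goal):
    """Dijkstra from the goal. V[x] is the minimum summed cost of the cells
    entered on the way from x to the goal."""
    rows, cols = cost.shape
    V = np.full(cost.shape, np.inf)
    V[goal] = 0.0
    heap = [(0.0, goal)]
    while heap:
        v, (r, c) = heapq.heappop(heap)
        if v > V[r, c]:
            continue  # stale heap entry
        for dr, dc in CONTROLS:
            pr, pc = r + dr, c + dc  # neighbor that can step into (r, c)
            if 0 <= pr < rows and 0 <= pc < cols:
                cand = v + cost[r, c]  # entering (r, c) costs cost[r, c]
                if cand < V[pr, pc]:
                    V[pr, pc] = cand
                    heapq.heappush(heap, (cand, (pr, pc)))
    return V

def control_policy(cost, goal, state):
    """Softmax policy over controls: lower cost-to-go means higher probability."""
    V = cost_to_go(cost, goal)
    rows, cols = cost.shape
    q = np.full(len(CONTROLS), np.inf)  # controls leaving the grid keep q = inf
    for k, (dr, dc) in enumerate(CONTROLS):
        nr, nc = state[0] + dr, state[1] + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            q[k] = cost[nr, nc] + V[nr, nc]
    p = np.exp(-(q - q.min()))  # shift by the minimum for numerical stability
    return p / p.sum()

def demo_loss(cost, goal, state, demo_control):
    """Negative log-likelihood of the demonstrated control under the policy."""
    return -np.log(control_policy(cost, goal, state)[demo_control])

# Toy map: class 0 = road (middle row), class 1 = sidewalk (top and bottom rows).
class_probs = np.zeros((3, 5, 2))
class_probs[1, :, 0] = 1.0
class_probs[[0, 2], :, 1] = 1.0

start, goal, demo = (1, 0), (1, 4), 3  # the expert drives right along the road

theta_good = np.array([0.0, 5.0])  # sidewalk is expensive
theta_bad = np.array([5.0, 0.0])   # road is expensive
probs_good = control_policy(encode_cost(class_probs, theta_good), goal, start)
loss_good = demo_loss(encode_cost(class_probs, theta_good), goal, start, demo)
loss_bad = demo_loss(encode_cost(class_probs, theta_bad), goal, start, demo)
```

In this toy setting, cost parameters that penalize the sidewalk explain the expert's choice to stay on the road better than parameters that penalize the road, so they yield a lower loss. A learning procedure would adjust `theta` to reduce this loss. The abstract's closed-form subgradient over a subset of promising states is not reproduced here.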
