APPLI: Adaptive Planner Parameter Learning From Interventions

Paper Title

APPLI: Adaptive Planner Parameter Learning From Interventions

Authors

Zizhao Wang, Xuesu Xiao, Bo Liu, Garrett Warnell, Peter Stone

Abstract

While classical autonomous navigation systems can typically move robots from one point to another safely and in a collision-free manner, these systems may fail or produce suboptimal behavior in certain scenarios. The current practice in such scenarios is to manually re-tune the system's parameters, e.g. max speed, sampling rate, inflation radius, to optimize performance. This practice requires expert knowledge and may jeopardize performance in the originally good scenarios. Meanwhile, it is relatively easy for a human to identify those failure or suboptimal cases and provide a teleoperated intervention to correct the failure or suboptimal behavior. In this work, we seek to learn from those human interventions to improve navigation performance. In particular, we propose Adaptive Planner Parameter Learning from Interventions (APPLI), in which multiple sets of navigation parameters are learned during training and applied based on a confidence measure to the underlying navigation system during deployment. In our physical experiments, the robot achieves better performance compared to the planner with static default parameters, and even dynamic parameters learned from a full human demonstration. We also show APPLI's generalizability in another unseen physical test course, and a suite of 300 simulated navigation environments.
