Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation

Paper Title

Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation

Authors

Xuesu Xiao, Tingnan Zhang, Krzysztof Choromanski, Edward Lee, Anthony Francis, Jake Varley, Stephen Tu, Sumeet Singh, Peng Xu, Fei Xia, Sven Mikael Persson, Dmitry Kalashnikov, Leila Takayama, Roy Frostig, Jie Tan, Carolina Parada, Vikas Sindhwani

Abstract

Despite decades of research, existing navigation systems still face real-world challenges when deployed in the wild, e.g., in cluttered home environments or in human-occupied public spaces. To address this, we present a new class of implicit control policies combining the benefits of imitation learning with the robust handling of system constraints from Model Predictive Control (MPC). Our approach, called Performer-MPC, uses a learned cost function parameterized by vision context embeddings provided by Performers -- a low-rank implicit-attention Transformer. We jointly train the cost function and construct the controller relying on it, effectively solving end-to-end the corresponding bi-level optimization problem. We show that the resulting policy improves standard MPC performance by leveraging a few expert demonstrations of the desired navigation behavior in different challenging real-world scenarios. Compared with a standard MPC policy, Performer-MPC achieves >40% better goal reached in cluttered environments and >65% better on social metrics when navigating around humans.
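
The bi-level optimization problem mentioned in the abstract can be written down generically as follows. This is only a sketch in notation assumed here, not taken from the paper: o_i is a visual observation, z_i its context embedding produced by an encoder g_theta (standing in for the Performer), C_theta the learned cost, f the robot dynamics, U the set of admissible controls, tau_i^exp an expert demonstration, and ell an imitation loss. The outer problem fits the learnable parameters theta to the demonstrations, while the inner problem is the MPC trajectory optimization under the current cost.

```latex
\begin{aligned}
\min_{\theta} \quad & \sum_{i=1}^{N} \ell\big(\tau^{*}_{\theta}(o_i),\ \tau^{\mathrm{exp}}_{i}\big) \\
\text{s.t.} \quad & \tau^{*}_{\theta}(o_i) \in \arg\min_{x_{0:T},\, u_{0:T-1}} \ C_{\theta}\big(x_{0:T},\, u_{0:T-1};\ z_i\big), \qquad z_i = g_{\theta}(o_i), \\
& x_{t+1} = f(x_t, u_t), \quad u_t \in \mathcal{U}, \quad t = 0, \dots, T-1 .
\end{aligned}
```

Under this reading, training end to end means the gradient of the outer loss with respect to theta has to pass through the inner minimizer, for example by differentiating through the solver's iterations. Which mechanism the authors actually use is not stated in the abstract.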
