Paper Title
Hierarchical Control for Head-to-Head Autonomous Racing
Paper Authors
Paper Abstract
We develop a hierarchical controller for head-to-head autonomous racing. We first introduce a formulation of a racing game with realistic safety and fairness rules. A high-level planner approximates the original formulation as a discrete game with simplified state, control, and dynamics, which makes the complex safety and fairness rules easy to encode, and computes a series of target waypoints. The low-level controller takes the resulting waypoints as a reference trajectory and computes high-resolution control inputs by solving an alternative approximation of the original formulation with simplified objectives and constraints. We consider two approaches for the low-level controller, yielding two hierarchical controllers: one uses multi-agent reinforcement learning (MARL), and the other solves a linear-quadratic Nash game (LQNG) to produce control inputs. The controllers are compared against three baselines: an end-to-end MARL controller, a MARL controller tracking a fixed racing line, and an LQNG controller tracking a fixed racing line. Quantitative results show that the proposed hierarchical methods outperform their respective baselines in terms of head-to-head race wins and rule adherence. The hierarchical controller using MARL for low-level control outperformed all other methods, winning over 90% of head-to-head races and adhering to the complex racing rules more consistently. Qualitatively, we observe the proposed controllers mimicking maneuvers performed by expert human drivers, such as shielding/blocking, overtaking, and long-term planning for delayed advantages. We show that hierarchical planning for game-theoretic reasoning produces competitive behavior even when challenged with complex rules and constraints.
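To make the two-level structure described in the abstract concrete, the following is a minimal, illustrative sketch of such a control loop. All names (HighLevelPlanner, LowLevelController, step) and the placeholder planning and tracking logic are assumptions for illustration only, not the authors' actual implementation or the paper's game formulation.

```python
# Hypothetical sketch of a hierarchical racing controller:
# a high-level planner produces rule-aware waypoints, and a
# low-level controller (e.g. a MARL policy or an LQNG solver)
# tracks them with high-resolution control inputs.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class CarState:
    x: float        # longitudinal position along the track
    y: float        # lateral position
    v: float        # speed
    heading: float  # heading angle

class HighLevelPlanner:
    """Approximates the racing game as a discrete game over simplified
    states, controls, and dynamics, and returns target waypoints that
    respect the safety and fairness rules (placeholder logic here)."""
    def plan(self, ego: CarState, opponent: CarState) -> List[Tuple[float, float]]:
        # Placeholder: a real planner would search the discrete game here.
        return [(ego.x + 5.0 * (k + 1), ego.y) for k in range(5)]

class LowLevelController:
    """Tracks the waypoints with high-resolution inputs; in the paper this
    role is played by a MARL policy or an LQNG solver."""
    def control(self, ego: CarState, waypoints: List[Tuple[float, float]]) -> Tuple[float, float]:
        # Placeholder proportional tracking of the first waypoint.
        target_x, target_y = waypoints[0]
        steer = 0.5 * (target_y - ego.y)
        throttle = 0.1 * (target_x - ego.x)
        return steer, throttle

def step(ego: CarState, opponent: CarState) -> Tuple[float, float]:
    """One control cycle: long-horizon, rule-aware planning followed by
    short-horizon, high-resolution tracking."""
    waypoints = HighLevelPlanner().plan(ego, opponent)
    return LowLevelController().control(ego, waypoints)
```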