Paper Title
Bi-Level Optimization Augmented with Conditional Variational Autoencoder for Autonomous Driving in Dense Traffic
Authors
Abstract
Autonomous driving has a natural bi-level structure. The upper behavioural layer provides lane-change, acceleration, and braking decisions that optimize a given driving task. However, this layer can influence driving efficiency only indirectly, through the lower-level trajectory planner, which takes the behavioural inputs and produces motion commands. Existing sampling-based approaches do not fully exploit the strong coupling between the behavioural and planning layers. End-to-end Reinforcement Learning (RL), on the other hand, can learn a behavioural layer while incorporating feedback from the lower-level planner, but purely data-driven approaches often perform poorly on safety metrics in unseen environments. This paper presents a novel alternative: a parameterized bi-level optimization that jointly computes the optimal behavioural decisions and the resulting downstream trajectory. Our approach runs in real time using a custom GPU-accelerated batch optimizer and a warm-start strategy learnt by a Conditional Variational Autoencoder. Extensive simulations show that our approach outperforms state-of-the-art model predictive control and RL approaches in terms of collision rate while remaining competitive in driving efficiency.
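For readers unfamiliar with the term, the generic shape of a bi-level optimization can be sketched as follows. This is the textbook form only, not the paper's formulation: the symbols p (behavioural parameters such as a target lane or desired speed), \xi (the planned trajectory), c_u and c_l (upper- and lower-level costs), and g (constraints such as collision avoidance and kinematic limits) are all illustrative assumptions that do not appear in the abstract.

```latex
% Generic bi-level problem (illustrative notation, not taken from the paper)
\begin{aligned}
\min_{p} \quad & c_u\bigl(\xi^{*}(p)\bigr)
  && \text{(upper level: behavioural decision)} \\
\text{s.t.} \quad & \xi^{*}(p) = \arg\min_{\xi} \; c_l(\xi, p)
  && \text{(lower level: trajectory planner)} \\
& \qquad\qquad\;\; \text{s.t.} \quad g(\xi) \le 0
  && \text{(e.g. collision and kinematic constraints)}
\end{aligned}
```

In this form the upper level never evaluates a behavioural choice p directly. It sees only the cost of the trajectory \xi^{*}(p) that the lower level returns, which is the indirect coupling the abstract describes.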