Adaptive Coordination Offsets for Signalized Arterial Intersections using Deep Reinforcement Learning

Title

Adaptive Coordination Offsets for Signalized Arterial Intersections using Deep Reinforcement Learning

Authors

Diaz, Keith Anshilo; Dailisan, Damian; Sharaf, Umang; Santos, Carissa; Gan, Qijian; Uy, Francis Aldrine; Lim, May T.; Bayen, Alexandre M.

Abstract

Coordinating intersections in arterial networks is critical to the performance of urban transportation systems. Deep reinforcement learning (RL) has gained traction in traffic control research along with data-driven approaches for traffic control systems. To date, proposed deep RL-based traffic schemes control phase activation or duration. Yet, such approaches may bypass low-volume links for several cycles in order to optimize the network-level traffic flow. Here, we propose a deep RL framework that dynamically adjusts offsets based on traffic states and preserves the planned phase timings and order derived from model-based methods. This framework allows us to improve arterial coordination while maintaining phase order and timing predictability. Using a validated and calibrated traffic model, we trained the policy of a deep RL agent that aims to reduce travel delays in the network. We evaluated the resulting policy by comparing its performance against the phase offsets deployed along a segment of Huntington Drive in the city of Arcadia. The resulting policy dynamically readjusts phase offsets in response to changes in traffic demand. Simulation results show that the proposed deep RL agent outperformed the baseline on average, effectively reducing delay time by 13.21% in the AM scenario, 2.42% in the Noon scenario, and 6.2% in the PM scenario when offsets are adjusted in 15-minute intervals. Finally, we also show the robustness of our agent to extreme traffic conditions, such as demand surges in off-peak hours and localized traffic incidents.
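To illustrate what adjusting an offset while preserving phase timings and order means, here is a minimal Python sketch. It is not the authors' implementation: the function name `active_phase`, the three-phase plan, and the 90-second cycle are hypothetical values chosen only for illustration. The sketch assumes a fixed-time plan in which the offset shifts the start of the cycle relative to a shared reference clock.

```python
from itertools import accumulate


def active_phase(t, durations, offset):
    """Return the index of the phase active at time t (seconds).

    durations: fixed phase lengths in their planned order (assumed values).
    offset: shift of this intersection's cycle start relative to a
            shared reference clock, in seconds.
    """
    cycle = sum(durations)
    # Position inside this intersection's own cycle after applying the offset.
    local = (t - offset) % cycle
    for index, phase_end in enumerate(accumulate(durations)):
        if local < phase_end:
            return index
    return len(durations) - 1  # not reached when all durations are positive


# Hypothetical plan: three phases of 40 s, 20 s, and 30 s (90 s cycle).
durations = [40, 20, 30]

# The same clock time maps to a different phase once the offset changes,
# but the phase order and every phase length stay exactly as planned.
phase_without_offset = active_phase(45, durations, offset=0)
phase_with_offset = active_phase(45, durations, offset=10)
```

In this picture, an agent of the kind described in the abstract would choose only the value of `offset` for each intersection at each adjustment interval, leaving `durations` untouched.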
