Paper Title

Human-like Energy Management Based on Deep Reinforcement Learning and Historical Driving Experiences

Authors

Hao Chen, Xiaolin Tang, Guo Hu, Teng Liu

Abstract

The development of hybrid electric vehicles depends on advanced and efficient energy management strategies (EMS). With online and real-time requirements in mind, this article presents a human-like energy management framework for hybrid electric vehicles based on deep reinforcement learning methods and collected historical driving data. The studied hybrid powertrain has a series-parallel topology, and its control-oriented model is established first. Then, a distinctive deep reinforcement learning (DRL) algorithm, the deep deterministic policy gradient (DDPG), is introduced. To enhance the derived power-split controls in the DRL framework, the globally optimal control trajectories obtained from dynamic programming (DP) are regarded as expert knowledge for training the DDPG model. This operation guarantees the optimality of the proposed control architecture. Moreover, historical driving data collected from experienced drivers are employed to replace the DP-based controls, thereby constructing the human-like EMS. Finally, different categories of experiments are executed to evaluate the optimality and adaptability of the proposed human-like EMS. Improvements in fuel economy and convergence rate indicate the effectiveness of the constructed control structure.
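The core idea described in the abstract is to guide DDPG training with expert power-split actions, taken either from DP solutions or from experienced drivers' historical data. The following is a minimal PyTorch sketch of that idea, assuming a simple behavior-cloning term added to the standard actor loss; the state/action definitions, network sizes, loss weighting, and data formats are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch: DDPG update augmented with an imitation (behavior-cloning) term
# toward expert actions from DP results or historical driving data.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 4, 1  # e.g., [SOC, speed, acceleration, power demand] -> power-split command

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())      # actions normalized to [-1, 1]
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def update(batch, expert_batch, gamma=0.99, bc_weight=0.5):
    """One DDPG update with expert guidance.

    batch:        (s, a, r, s2) transitions collected by the agent
    expert_batch: (s_e, a_e) state-action pairs from DP trajectories or
                  experienced drivers' data (assumed already normalized)
    """
    s, a, r, s2 = batch
    s_e, a_e = expert_batch

    # Critic: one-step TD target (target networks omitted for brevity).
    with torch.no_grad():
        q_next = critic(torch.cat([s2, actor(s2)], dim=1))
        target = r + gamma * q_next
    q = critic(torch.cat([s, a], dim=1))
    critic_loss = nn.functional.mse_loss(q, target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: maximize Q while imitating expert power-split actions.
    pg_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    bc_loss = nn.functional.mse_loss(actor(s_e), a_e)
    actor_loss = pg_loss + bc_weight * bc_loss
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```

Swapping the source of `expert_batch` from DP trajectories to drivers' historical records is what would turn the DP-guided EMS into the human-like EMS in this sketch; the rest of the training loop stays unchanged.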
