Title
Inverse Resource Rational Based Stochastic Driver Behavior Model
Authors
Abstract
Human drivers have limited and time-varying cognitive resources when making decisions in real-world traffic scenarios, which often leads to unique and stochastic behaviors that cannot be explained by the perfect rationality assumption, a widely accepted premise in driver behavior modeling that presumes drivers make rational decisions to maximize their own rewards under all circumstances. To explicitly address this shortcoming, this study presents a novel driver behavior model that aims to capture the resource rationality and stochasticity of human driver behaviors in realistic longitudinal driving scenarios. The resource rationality principle provides a theoretical framework for better understanding human cognitive processes by modeling the driver's internal cognitive mechanism as utility maximization subject to cognitive resource limitations, which can be represented as finite and time-varying preview horizons in the context of driving. An inverse resource-rational stochastic inverse reinforcement learning (IRR-SIRL) approach is proposed to learn distributions of the human driver's planning horizon and cost function from a given series of human demonstrations. A nonlinear model predictive control (NMPC) approach with a time-varying horizon then generates driver-specific trajectories using the learned distributions of the driver's planning horizon and cost function. Simulation experiments are carried out using human demonstrations gathered from a driver-in-the-loop driving simulator. The results reveal that the proposed inverse resource-rational stochastic driver model captures the resource rationality and stochasticity of human driving behaviors in a variety of realistic longitudinal driving scenarios.
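The pipeline described in the abstract, sampling a finite preview horizon from a learned distribution and then solving a finite-horizon optimal control problem with a learned driver cost, can be illustrated with a minimal sketch. Everything below (the Gaussian horizon distribution, the quadratic car-following cost, the weight values, and the simplified longitudinal dynamics) is an illustrative assumption, not the paper's actual formulation or code.

```python
# Minimal sketch (not the authors' implementation): one receding-horizon NMPC
# step with a stochastic, resource-limited preview horizon. All distributions,
# weights, and dynamics below are assumed for illustration only.
import numpy as np
from scipy.optimize import minimize

DT = 0.1                              # discretization step [s]
HORIZON_MEAN, HORIZON_STD = 25, 6     # assumed learned horizon distribution (steps)
W_GAP, W_REL_V, W_ACC = 1.0, 0.5, 0.2 # assumed learned cost-function weights
DESIRED_GAP = 20.0                    # desired following gap [m]

def sample_horizon(rng):
    """Draw a finite, time-varying preview horizon (at least 5 steps)."""
    return max(5, int(round(rng.normal(HORIZON_MEAN, HORIZON_STD))))

def rollout(acc, gap0, rel_v0):
    """Propagate simple longitudinal dynamics: gap and gap rate to the lead car."""
    gaps, rel_vs = [gap0], [rel_v0]
    for a in acc:
        rel_vs.append(rel_vs[-1] - a * DT)   # positive ego acceleration closes the gap
        gaps.append(gaps[-1] + rel_vs[-1] * DT)
    return np.array(gaps[1:]), np.array(rel_vs[1:])

def nmpc_step(gap0, rel_v0, horizon):
    """Solve one finite-horizon optimal control problem over ego accelerations."""
    def cost(acc):
        gaps, rel_vs = rollout(acc, gap0, rel_v0)
        return (W_GAP * np.sum((gaps - DESIRED_GAP) ** 2)
                + W_REL_V * np.sum(rel_vs ** 2)
                + W_ACC * np.sum(acc ** 2))
    res = minimize(cost, np.zeros(horizon),
                   bounds=[(-3.0, 2.0)] * horizon, method="L-BFGS-B")
    return res.x[0]   # apply only the first control, receding-horizon style

rng = np.random.default_rng(0)
gap, rel_v = 30.0, 2.0   # current gap [m] and gap rate (v_lead - v_ego) [m/s]
for _ in range(5):
    N = sample_horizon(rng)          # stochastic, resource-limited preview
    a = nmpc_step(gap, rel_v, N)
    rel_v -= a * DT
    gap += rel_v * DT
    print(f"horizon={N:2d} steps, accel={a:+.2f} m/s^2, gap={gap:.1f} m")
```

Because the horizon is resampled at every step, identical traffic states can yield different trajectories, which is the mechanism by which the sketch mimics the stochastic, resource-limited behavior the abstract attributes to human drivers.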