Safe Learning MPC with Limited Model Knowledge and Data

Paper Title

Safe Learning MPC with Limited Model Knowledge and Data

Authors

Aaron Kandel, Scott J. Moura

Abstract

This paper presents an end-to-end framework for safe learning-based control (LbC) using nonlinear stochastic MPC and distributionally robust optimization (DRO). This work is motivated by several open challenges in the LbC literature. In particular, many control-theoretic LbC methods require subject matter expertise in order to translate their own safety guarantees, often manifested as preexisting data of safe trajectories or structural model knowledge. In this paper, we focus on LbC where the controller is applied directly to a system of which it has no or extremely limited direct experience. We target safety during \textit{tabula-rasa} or ``\textit{blank slate}'' model-based learning and control as a challenging case for validation. This explores the boundary of the status quo in control theory relating to requirements for subject matter expertise. We show that, under basic and limited assumptions on the underlying problem, we can translate probabilistic guarantees on feasibility to nonlinear systems using results from the stochastic MPC and DRO literature, whose relevance we formally extend in a mathematical analysis. We also present a coupled and intuitive formulation for persistence of excitation (PoE), and illustrate the connection between PoE and the applicability of the proposed method. Our case studies of vehicle obstacle avoidance and safe extreme fast charging of lithium-ion batteries reveal powerful empirical results supporting the underlying DRO theory. Our method is widely applicable within the LbC domain to, for example, airborne wind energy systems, vehicle obstacle avoidance, and energy storage systems management. It is also applicable to quantifying uncertainty beyond the LbC case.
