Paper Title
Predictive Control with Learning-Based Terminal Costs Using Approximate Value Iteration
Paper Authors
Paper Abstract
Stability under model predictive control (MPC) schemes is frequently ensured by terminal ingredients. Employing a (control) Lyapunov function as the terminal cost constitutes a common choice. Learning-based methods may be used to construct the terminal cost by relating it to, for instance, an infinite-horizon optimal control problem, whose optimal cost is a Lyapunov function. Value iteration, an approximate dynamic programming (ADP) method, is one such cost approximation technique. In this work, we merge results from terminally unconstrained predictive control and approximate value iteration to draw benefits from both fields. We derive a prediction horizon, depending on factors such as approximation-related errors, that renders the closed loop asymptotically stable and further allows a suboptimality estimate relative to the infinite-horizon optimal cost. The result extends recent studies on predictive control with ADP-based terminal costs without requiring an initial local stabilizing controller. We compare the controller in simulation with other terminal cost options and show that the proposed approach leads to a shorter minimal horizon than previous results.
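The abstract's core idea, using a value-iteration approximation of the infinite-horizon optimal cost as an MPC terminal cost, can be illustrated in a simple setting. The sketch below is not the paper's algorithm: it assumes a linear-quadratic problem, where value iteration with a quadratic value function V_k(x) = x' P_k x reduces to the Riccati recursion, and the system matrices and helper names (`value_iteration`, `mpc_step`) are illustrative assumptions.

```python
import numpy as np

# Illustrative LQ setup (assumed, not from the paper):
# dynamics x+ = A x + B u, stage cost x'Qx + u'Ru.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

def value_iteration(P, steps):
    """Value iteration V_{k+1}(x) = min_u [x'Qx + u'Ru + V_k(Ax+Bu)].
    For quadratic V_k(x) = x'P_k x this is the Riccati recursion;
    truncating after `steps` updates gives an approximate
    infinite-horizon cost, usable as an MPC terminal cost."""
    for _ in range(steps):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return P

# Truncated value iteration: P_hat only approximates the true
# infinite-horizon cost matrix.
P_hat = value_iteration(np.zeros((2, 2)), steps=5)

def mpc_step(x, N, P_term):
    """Finite-horizon MPC with terminal cost x_N' P_term x_N.
    In the LQ case the problem is solved exactly by a backward
    dynamic-programming (Riccati) sweep; return the first input."""
    P = P_term
    gains = []
    for _ in range(N):  # backward sweep from stage N-1 down to 0
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains.append(K)
    return -gains[-1] @ x  # gains[-1] is the stage-0 feedback gain

# Closed loop under the MPC law with the learned terminal cost;
# print the final state norm to inspect convergence.
x = np.array([1.0, -0.5])
for t in range(30):
    u = mpc_step(x, N=3, P_term=P_hat)
    x = A @ x + B @ u
print("final state norm:", np.linalg.norm(x))
```

In this toy setting the truncation of value iteration leaves an approximation error in `P_hat`; the paper's result, roughly, is a prediction horizon length that guarantees closed-loop asymptotic stability despite such errors, together with a suboptimality bound.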