Optimal Control of Port-Hamiltonian Systems: A Time-Continuous Learning Approach

Paper title

Optimal Control of Port-Hamiltonian Systems: A Time-Continuous Learning Approach

Authors

Lukas Kölsch, Pol Jané Soneira, Felix Strehle, Sören Hohmann

Abstract

Feedback controllers for port-Hamiltonian systems reveal an intrinsic inverse optimality property, since each passivating state feedback controller is optimal with respect to some specific performance index. Due to the nonlinear port-Hamiltonian system structure, however, explicit (forward) methods for optimal control of port-Hamiltonian systems require the generally intractable analytical solution of the Hamilton-Jacobi-Bellman equation. Adaptive dynamic programming methods provide a means to circumvent this issue. However, the few existing approaches for port-Hamiltonian systems either hinge on very specific subclasses of performance indices or system dynamics, or require non-transparent guessing of stabilizing initial weights. In this paper, we contribute towards closing this gap in a largely unexplored research area by proposing a time-continuous adaptive feedback controller for the optimal control of general time-continuous input-state-output port-Hamiltonian systems with respect to general Lagrangian performance indices. Its control law implements an online learning procedure that uses the Hamiltonian of the system as an initial value function candidate. The time-continuous learning of the value function is achieved by means of a certain Lagrange multiplier that allows the optimality of the current solution to be evaluated. In particular, constructive conditions for stabilizing initial weights are stated, and asymptotic stability of the closed-loop equilibrium is proven. Our work concludes with simulations of exemplary linear and nonlinear optimization problems, which demonstrate asymptotic convergence of the controllers resulting from the proposed online adaptation procedure.
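
For orientation, the objects named in the abstract can be written down in their common textbook form. This is only a sketch under assumed notation: the symbols J, R, G, H, the running cost \ell, and the value function V are not taken from the paper, which may use different symbols or a more general setting.

```latex
% Input-state-output port-Hamiltonian system (standard form, assumed notation)
\dot{x} = \bigl(J(x) - R(x)\bigr)\,\nabla H(x) + G(x)\,u, \qquad
y = G(x)^{\top} \nabla H(x),
\qquad J(x) = -J(x)^{\top}, \quad R(x) = R(x)^{\top} \succeq 0.

% Passivity: the Hamiltonian H acts as a storage function
\dot{H} = -\nabla H(x)^{\top} R(x)\, \nabla H(x) + y^{\top} u \;\le\; y^{\top} u.

% Lagrangian performance index with running cost \ell
\mathcal{J}(x_0, u) = \int_{0}^{\infty} \ell\bigl(x(t), u(t)\bigr)\, \mathrm{d}t.

% Stationary Hamilton-Jacobi-Bellman equation for the value function V
0 = \min_{u} \Bigl[ \ell(x, u)
    + \nabla V(x)^{\top} \bigl( (J(x) - R(x))\,\nabla H(x) + G(x)\,u \bigr) \Bigr].
```

In this notation the passivity inequality follows from the skew-symmetry of J, which is one way to see why the Hamiltonian is a natural initial candidate for the value function V.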
