Paper Title
Data-Driven Control of Unknown Systems: A Linear Programming Approach
Authors
Abstract
We consider the problem of discounted optimal state-feedback regulation for general unknown deterministic discrete-time systems. It is well known that open-loop instability of systems, non-quadratic cost functions and complex nonlinear dynamics, as well as the on-policy behavior of many reinforcement learning (RL) algorithms, make the design of model-free optimal adaptive controllers a challenging task. We depart from commonly used least-squares and neural network approximation methods in conventional model-free control theory, and propose a novel family of data-driven optimization algorithms based on linear programming, off-policy Q-learning and randomized experience replay. We develop both policy iteration (PI) and value iteration (VI) methods to compute an approximate optimal feedback controller with high precision and without the knowledge of a system model and stage cost function. Simulation studies confirm the effectiveness of the proposed methods.
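For orientation, the linear-programming view of the Q-function that the abstract alludes to is commonly written as below. This is a sketch of the standard formulation from the LP approach to approximate dynamic programming, not necessarily the exact program used in the paper. The symbols f (dynamics), \ell (stage cost), \gamma (discount factor) and c (a positive weighting measure) are notation assumed here; none of them appear in the abstract.

```latex
% Assumed notation (not taken from the abstract):
%   x^+ = f(x,u)   unknown deterministic dynamics
%   \ell(x,u)      stage cost
%   \gamma \in (0,1)  discount factor
%   c              positive weighting measure over state-input pairs

% Bellman equation for the optimal Q-function
\[
  Q^\star(x,u) = \ell(x,u) + \gamma \min_{w} Q^\star\bigl(f(x,u), w\bigr)
\]

% Relaxed linear program: the minimum over w becomes one inequality per w
\[
\begin{aligned}
  \max_{Q} \quad & \int Q(x,u)\, c(\mathrm{d}x, \mathrm{d}u) \\
  \text{s.t.} \quad & Q(x,u) \le \ell(x,u) + \gamma\, Q\bigl(f(x,u), w\bigr)
      \quad \forall\, (x,u,w)
\end{aligned}
\]
```

In a data-driven setting, the constraints would typically be imposed only at recorded tuples (x_i, u_i, \ell_i, x_i^+), with \ell_i and x_i^+ replacing \ell(x_i,u_i) and f(x_i,u_i). Neither the model nor the cost function then needs to be known explicitly, and the data may come from any exploratory input sequence, which is what makes the scheme off-policy.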