Data Driven Control with Learned Dynamics: Model-Based versus Model-Free Approach

Paper Title

Data Driven Control with Learned Dynamics: Model-Based versus Model-Free Approach

Authors

Hao, Wenjian; Han, Yiqiang

Abstract

This paper compares two types of data-driven control methods, one model-based and one model-free. The first is a recently proposed method, Deep Koopman Representation for Control (DKRC), which uses a deep neural network to map an unknown nonlinear dynamical system to a high-dimensional linear system, which in turn allows state-of-the-art control strategies to be employed. The second is a classic model-free control method based on an actor-critic architecture, Deep Deterministic Policy Gradient (DDPG), which has been shown to be effective on a variety of dynamical systems. The comparison is carried out in OpenAI Gym, which provides multiple control environments for benchmarking. Two examples are used for the comparison: the classic Inverted Pendulum and Lunar Lander Continuous Control. Based on the experimental results, we compare the two methods in terms of their control strategies and their effectiveness under various initialization conditions. We also compare the dynamic model learned by DKRC with the analytical model derived using the Euler-Lagrange linearization method, which demonstrates the accuracy of the model learned for unknown dynamics by a data-driven, sample-efficient approach.
