Paper Title

Data-driven control of spatiotemporal chaos with reduced-order neural ODE-based models and reinforcement learning

Paper Authors

Zeng, Kevin, Linot, Alec J., Graham, Michael D.

Paper Abstract

Deep reinforcement learning (RL) is a data-driven method capable of discovering complex control strategies for high-dimensional systems, making it promising for flow control applications. In particular, the present work is motivated by the goal of reducing energy dissipation in turbulent flows, and the example considered is the spatiotemporally chaotic dynamics of the Kuramoto-Sivashinsky equation (KSE). A major challenge associated with RL is that substantial training data must be generated by repeatedly interacting with the target system, making it costly when the system is computationally or experimentally expensive. We mitigate this challenge in a data-driven manner by combining dimensionality reduction via an autoencoder with a neural ODE framework to obtain a low-dimensional dynamical model from just a limited data set. We substitute this data-driven reduced-order model (ROM) in place of the true system during RL training to efficiently estimate the optimal policy, which can then be deployed on the true system. For the KSE actuated with localized forcing ("jets") at four locations, we demonstrate that we are able to learn a ROM that accurately captures the actuated dynamics as well as the underlying natural dynamics just from snapshots of the KSE experiencing random actuations. Using this ROM and a control objective of minimizing dissipation and power cost, we extract a control policy from it using deep RL. We show that the ROM-based control strategy translates well to the true KSE and highlight that the RL agent discovers and stabilizes an underlying forced equilibrium solution of the KSE system. We show that this forced equilibrium captured in the ROM and discovered through RL is related to an existing known equilibrium solution of the natural KSE.
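
The core modeling idea in the abstract — compress KSE snapshots with an autoencoder and evolve the resulting latent state with a neural ODE conditioned on the jet actuations — can be illustrated with a minimal sketch. This is not the authors' code: all layer sizes, variable names, the latent dimension, the fixed-step RK4 integrator, and the training data shown here are illustrative assumptions.

```python
# Minimal sketch of an autoencoder + neural ODE reduced-order model (ROM):
# an encoder maps a KSE snapshot u(x, t) to a low-dimensional latent state h,
# and a learned vector field dh/dt = f(h, a) evolves h under actuation a.
import torch
import torch.nn as nn

N_GRID = 64   # assumed number of spatial grid points per KSE snapshot
LATENT = 12   # assumed latent dimension of the ROM
N_JETS = 4    # four localized actuation "jets", as in the paper

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(N_GRID, 128), nn.GELU(), nn.Linear(128, LATENT))
        self.decoder = nn.Sequential(
            nn.Linear(LATENT, 128), nn.GELU(), nn.Linear(128, N_GRID))

    def forward(self, u):
        h = self.encoder(u)
        return self.decoder(h), h

class LatentODE(nn.Module):
    """Neural ODE vector field f(h, a) in the latent space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT + N_JETS, 128), nn.GELU(),
            nn.Linear(128, LATENT))

    def forward(self, h, a):
        return self.net(torch.cat([h, a], dim=-1))

def rk4_step(f, h, a, dt):
    # Classical fixed-step RK4; the actuation a is held constant over the step.
    k1 = f(h, a)
    k2 = f(h + 0.5 * dt * k1, a)
    k3 = f(h + 0.5 * dt * k2, a)
    k4 = f(h + dt * k3, a)
    return h + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# One training step on snapshot pairs (u_t, a_t, u_next): reconstruct u_t,
# and match the decoded, time-integrated latent state to u_next.
ae, ode = Autoencoder(), LatentODE()
opt = torch.optim.Adam(list(ae.parameters()) + list(ode.parameters()), lr=1e-3)

u_t = torch.randn(32, N_GRID)    # placeholder snapshots (random data)
a_t = torch.randn(32, N_JETS)    # placeholder random actuations
u_next = torch.randn(32, N_GRID)
dt = 0.25                        # assumed snapshot sampling interval

u_rec, h = ae(u_t)
h_next = rk4_step(ode, h, a_t, dt)
loss = nn.functional.mse_loss(u_rec, u_t) \
     + nn.functional.mse_loss(ae.decoder(h_next), u_next)
opt.zero_grad()
loss.backward()
opt.step()
```

Once such a ROM is trained, it can stand in for the true KSE as the RL environment, per the abstract: the agent proposes jet amplitudes a, the latent ODE integrates forward, and the reward penalizes dissipation plus actuation power, so no further expensive KSE solves are needed during policy training.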
