Learning a Low-dimensional Representation of a Safe Region for Safe Reinforcement Learning on Dynamical Systems

Paper Title

Learning a Low-dimensional Representation of a Safe Region for Safe Reinforcement Learning on Dynamical Systems

Authors

Zhehua Zhou, Ozgur S. Oguz, Marion Leibold, Martin Buss

Abstract

To apply reinforcement learning algorithms safely on high-dimensional nonlinear dynamical systems, a simplified system model is used to formulate a safe reinforcement learning framework. Based on the simplified system model, a low-dimensional representation of the safe region is identified and used to provide safety estimates for learning algorithms. However, finding a satisfactory simplified system model for complex dynamical systems usually requires a considerable amount of effort. To overcome this limitation, we propose in this work a general data-driven approach that is able to efficiently learn a low-dimensional representation of the safe region. Through an online adaptation method, the low-dimensional representation is updated using feedback data so that more accurate safety estimates are obtained. The performance of the proposed approach for identifying the low-dimensional representation of the safe region is demonstrated with a quadcopter example. The results show that, compared to previous work, a more reliable and representative low-dimensional representation of the safe region is derived, which in turn extends the applicability of the safe reinforcement learning framework.

