Localized active learning of Gaussian process state space models

Paper title

Localized active learning of Gaussian process state space models

Authors

Alexandre Capone, Jonas Umlauft, Thomas Beckers, Armin Lederer, Sandra Hirche

Abstract

The performance of learning-based control techniques crucially depends on how effectively the system is explored. While most exploration techniques aim to achieve a globally accurate model, such approaches are generally unsuited for systems with unbounded state spaces. Furthermore, a globally accurate model is not required to achieve good performance in many common control applications, e.g., local stabilization tasks. In this paper, we propose an active learning strategy for Gaussian process state space models that aims to obtain an accurate model on a bounded subset of the state-action space. Our approach aims to maximize the mutual information of the exploration trajectories with respect to a discretization of the region of interest. By employing model predictive control, the proposed technique integrates information collected during exploration and adaptively improves its exploration strategy. To enable computational tractability, we decouple the choice of most informative data points from the model predictive control optimization step. This yields two optimization problems that can be solved in parallel. We apply the proposed method to explore the state space of various dynamical systems and compare our approach to a commonly used entropy-based exploration strategy. In all experiments, our method yields a better model within the region of interest than the entropy-based method.
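
The abstract does not spell out the objective, so the following is only a minimal sketch of the kind of quantity it describes: the mutual information between noisy observations at candidate points and the latent function values on a discretization of the region of interest. It is not the authors' implementation. The squared-exponential kernel, the hyperparameter values, and the names `rbf_kernel`, `mutual_information`, `X_ref`, `X_near`, and `X_far` are assumptions made for illustration, and the model predictive control step is omitted entirely.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def mutual_information(X, X_ref, noise_var=1e-2, jitter=1e-6):
    """I(y_X ; f_ref) for a zero-mean GP prior with Gaussian observation noise.

    Computed as H(y_X) - H(y_X | f_ref), which reduces to a difference of
    log-determinants of the prior and conditional observation covariances.
    """
    K_xx = rbf_kernel(X, X)
    K_xr = rbf_kernel(X, X_ref)
    K_rr = rbf_kernel(X_ref, X_ref) + jitter * np.eye(len(X_ref))
    # Covariance of f(X) after conditioning on the latent values at X_ref.
    cond = K_xx - K_xr @ np.linalg.solve(K_rr, K_xr.T)
    noise = noise_var * np.eye(len(X))
    _, logdet_prior = np.linalg.slogdet(K_xx + noise)
    _, logdet_cond = np.linalg.slogdet(cond + noise)
    return 0.5 * (logdet_prior - logdet_cond)

# Hypothetical region of interest: a 5 x 5 grid on [-1, 1]^2.
g = np.linspace(-1.0, 1.0, 5)
X_ref = np.array([[a, b] for a in g for b in g])

# Two hypothetical candidate sets: one inside the region, one far away from it.
X_near = np.array([[0.0, 0.0], [0.5, -0.5], [-0.5, 0.5]])
X_far = X_near + 10.0

mi_near = mutual_information(X_near, X_ref)
mi_far = mutual_information(X_far, X_ref)
print(f"MI near region: {mi_near:.4f}, MI far from region: {mi_far:.4f}")
```

Under these assumptions, candidate points inside the region of interest score higher than points far from it, which are essentially uncorrelated with the reference grid. This is the qualitative difference from an entropy-based criterion, which scores points by predictive uncertainty alone and can therefore favor distant, unexplored states.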
