Data-Driven Stochastic Optimal Control Using Kernel Gradients

Paper Title

Data-Driven Stochastic Optimal Control Using Kernel Gradients

Authors

Thorpe, Adam J., Gonzales, Jake A., Oishi, Meeko M. K.

Abstract

We present an empirical, gradient-based method for solving data-driven stochastic optimal control problems using the theory of kernel embeddings of distributions. By embedding the integral operator of a stochastic kernel in a reproducing kernel Hilbert space, we can compute an empirical approximation of stochastic optimal control problems, which can then be solved efficiently using the properties of the RKHS. Existing approaches typically rely upon finite control spaces or optimize over policies with finite support to enable optimization. In contrast, our approach uses kernel-based gradients computed using observed data to approximate the cost surface of the optimal control problem, which can then be optimized using gradient descent. We apply our technique to the area of data-driven stochastic optimal control, and demonstrate our proposed approach on a linear regulation problem for comparison and on a nonlinear target tracking problem.
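The idea of descending a kernel-based estimate of the cost surface can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gaussian kernels on the state and control, a regularized least-squares estimate of the conditional expectation of the cost, and a hypothetical scalar toy system y = x + u + noise with cost y^2. All names (`gauss_kernel`, `approx_cost`, `approx_cost_grad`, `kernel_gradient_descent`) and all parameter values (`sigma`, `lam`, `step`) are illustrative choices.

```python
import numpy as np

def gauss_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)

# Hypothetical toy data (an assumption, not from the paper): y = x + u + noise.
M = 200
X = rng.uniform(-2.0, 2.0, size=(M, 1))   # observed states
U = rng.uniform(-2.0, 2.0, size=(M, 1))   # observed control inputs
Y = X + U + 0.1 * rng.standard_normal((M, 1))  # observed next states

sigma, lam = 0.5, 1e-3                     # assumed bandwidth and regularizer
G = gauss_kernel(X, X, sigma) * gauss_kernel(U, U, sigma)  # product-kernel Gram matrix
cost_samples = (Y ** 2).ravel()            # cost evaluated at each observed next state
# Weights of the regularized estimate of E[cost(Y) | x, u].
beta = np.linalg.solve(G + lam * M * np.eye(M), cost_samples)

def approx_cost(x, u):
    """Empirical estimate of the expected cost at state x and control u."""
    kx = gauss_kernel(X, x[None, :], sigma).ravel()
    ku = gauss_kernel(U, u[None, :], sigma).ravel()
    return beta @ (kx * ku)

def approx_cost_grad(x, u):
    """Gradient of approx_cost with respect to u, in closed form."""
    kx = gauss_kernel(X, x[None, :], sigma).ravel()
    ku = gauss_kernel(U, u[None, :], sigma).ravel()
    # For the Gaussian kernel: d/du k(u_i, u) = k(u_i, u) * (u_i - u) / sigma^2.
    return ((beta * kx * ku)[:, None] * (U - u[None, :])).sum(axis=0) / sigma ** 2

def kernel_gradient_descent(x, u0, step=0.1, iters=200):
    """Plain fixed-step gradient descent on the estimated cost surface."""
    u = u0.copy()
    for _ in range(iters):
        u = u - step * approx_cost_grad(x, u)
    return u

x0 = np.array([1.0])
u_init = np.array([1.0])
u_star = kernel_gradient_descent(x0, u_init)
print("control found by gradient descent:", u_star)
```

Because the gradient is available in closed form from the kernel, the control is optimized over a continuous space rather than over a finite set of candidate inputs, which is the contrast the abstract draws with existing approaches. The estimate is only meaningful near the sampled data; far from it the kernel values and the gradient both vanish.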
