Paper Title

Regret Minimization in Partially Observable Linear Quadratic Control

Authors

Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

Abstract

We study the problem of regret minimization in partially observable linear quadratic control systems when the model dynamics are unknown a priori. We propose ExpCommit, an explore-then-commit algorithm that learns the model Markov parameters and then follows the principle of optimism in the face of uncertainty to design a controller. We propose a novel way to decompose the regret and provide an end-to-end sublinear regret upper bound for partially observable linear quadratic control. Finally, we provide stability guarantees and establish a regret upper bound of $\tilde{\mathcal{O}}(T^{2/3})$ for ExpCommit, where $T$ is the time horizon of the problem.
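
The $T^{2/3}$ rate is what a standard explore-then-commit balance would produce. As a rough sketch only, assume (this is not stated in the abstract) that exploring for $T_{exp}$ steps costs regret linear in $T_{exp}$, that the resulting estimation error is of order $1/\sqrt{T_{exp}}$, and that the per-step regret during the commit phase scales linearly with that error. Here $T_{exp}$ is a hypothetical symbol for the exploration length, and constants and logarithmic factors are omitted:

$$
\mathrm{Regret}(T) \;\lesssim\; T_{exp} + \frac{T - T_{exp}}{\sqrt{T_{exp}}},
\qquad
T_{exp} \asymp T^{2/3} \;\Longrightarrow\; \mathrm{Regret}(T) = \tilde{\mathcal{O}}(T^{2/3}).
$$

With $T_{exp} \asymp T^{2/3}$ both terms are of order $T^{2/3}$, so neither the exploration cost nor the commit-phase cost dominates.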
