Adaptive Regret for Control of Time-Varying Dynamics

Paper title

Adaptive Regret for Control of Time-Varying Dynamics

Authors

Paula Gradu, Elad Hazan, Edgar Minasyan

Abstract

We consider the problem of online control of systems with time-varying linear dynamics. This general formulation is motivated by the use of local linearization in the control of nonlinear dynamical systems. To state meaningful guarantees over changing environments, we introduce the metric of adaptive regret to the field of control. This metric, originally studied in online learning, measures performance in terms of regret against the best policy in hindsight on any interval in time, and thus captures the adaptation of the controller to changing dynamics. Our main contribution is a novel, efficient meta-algorithm: it converts a controller with sublinear regret bounds into one with sublinear adaptive regret bounds in the setting of time-varying linear dynamical systems. The main technical innovation is the first adaptive regret bound for the more general framework of online convex optimization with memory. Furthermore, we give a lower bound showing that our attained adaptive regret bound is nearly tight for this general framework.
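
To make the adaptive regret metric concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a simplified setting in which the per-round costs of the learner and of a finite set of comparator policies are already known and do not depend on state or memory. The function name `adaptive_regret` and the toy cost sequences are hypothetical, chosen only for illustration.

```python
def adaptive_regret(learner_costs, comparator_costs):
    """Worst-case regret over every contiguous interval [r, s].

    learner_costs: per-round costs incurred by the learner.
    comparator_costs: one per-round cost list per comparator policy
    (assumption: a finite comparator set with precomputed, stateless costs).
    """
    horizon = len(learner_costs)
    worst = float("-inf")
    for r in range(horizon):
        learner_sum = 0
        comparator_sums = [0] * len(comparator_costs)
        for s in range(r, horizon):
            # Extend the interval [r, s] by one round.
            learner_sum += learner_costs[s]
            for k, costs in enumerate(comparator_costs):
                comparator_sums[k] += costs[s]
            # Regret on [r, s] against the best comparator on that interval.
            worst = max(worst, learner_sum - min(comparator_sums))
    return worst


# Toy example: the environment changes halfway through.
learner = [0, 0, 1, 1]
comparators = [
    [0, 0, 1, 1],  # policy that is good only in the first half
    [1, 1, 0, 0],  # policy that is good only in the second half
]

full_horizon_regret = sum(learner) - min(sum(c) for c in comparators)
print(full_horizon_regret)                    # 0
print(adaptive_regret(learner, comparators))  # 2
```

In this toy example the learner has zero regret over the full horizon, yet it has regret 2 on the last two rounds, where the second comparator would have incurred no cost. This is the kind of failure to adapt that the interval-based metric is meant to expose.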
