Paper Title
Adaptive Regret for Control of Time-Varying Dynamics
Paper Authors
Abstract
We consider the problem of online control of systems with time-varying linear dynamics. This is a general formulation that is motivated by the use of local linearization in control of nonlinear dynamical systems. To state meaningful guarantees over changing environments, we introduce the metric of {\it adaptive regret} to the field of control. This metric, originally studied in online learning, measures performance in terms of regret against the best policy in hindsight on {\it any interval in time}, and thus captures the adaptation of the controller to changing dynamics. Our main contribution is a novel efficient meta-algorithm: it converts a controller with sublinear regret bounds into one with sublinear {\it adaptive regret} bounds in the setting of time-varying linear dynamical systems. The main technical innovation is the first adaptive regret bound for the more general framework of online convex optimization with memory. Furthermore, we give a lower bound showing that our attained adaptive regret bound is nearly tight for this general framework.
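To make the adaptive regret metric concrete, the following is a minimal sketch that computes it by brute force on a toy loss sequence. It assumes a small finite set of fixed comparators whose per-round losses are known in advance; the names `adaptive_regret`, `static_regret`, and `comparator_losses` are hypothetical and chosen for illustration. It illustrates only the metric, not the meta-algorithm described in the abstract, and it ignores the memory (state) effects present in control.

```python
from itertools import accumulate


def static_regret(alg_losses, comparator_losses):
    """Regret against the best fixed comparator over the whole horizon."""
    return sum(alg_losses) - min(sum(row) for row in comparator_losses)


def adaptive_regret(alg_losses, comparator_losses):
    """Largest regret over every contiguous interval [r, s].

    alg_losses: per-round losses incurred by the learner (length T).
    comparator_losses: one list of per-round losses per fixed comparator
        (an assumed finite comparator class, for illustration only).
    """
    T = len(alg_losses)
    # Prefix sums so that each interval sum is a single subtraction.
    alg_cum = [0.0] + list(accumulate(alg_losses))
    comp_cum = [[0.0] + list(accumulate(row)) for row in comparator_losses]

    worst = float("-inf")
    for r in range(T):
        for s in range(r, T):
            alg_sum = alg_cum[s + 1] - alg_cum[r]
            # Best comparator in hindsight on this interval alone.
            best = min(c[s + 1] - c[r] for c in comp_cum)
            worst = max(worst, alg_sum - best)
    return worst


# Toy environment that changes halfway: comparator 0 is good early,
# comparator 1 is good late. The learner sticks with comparator 0.
comparators = [[0, 0, 1, 1], [1, 1, 0, 0]]
learner = [0, 0, 1, 1]

print(static_regret(learner, comparators))    # regret over the full horizon
print(adaptive_regret(learner, comparators))  # worst regret over any interval
```

In this toy case the learner has zero regret over the full horizon yet incurs a regret of 2 on the last two rounds, which is the kind of failure to adapt that the interval-based metric is meant to expose.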