Paper Title
Reinforcement Learning with Fast Stabilization in Linear Dynamical Systems
Paper Authors
Paper Abstract
In this work, we study model-based reinforcement learning (RL) in unknown stabilizable linear dynamical systems. When learning a dynamical system, one needs to stabilize the unknown dynamics to avoid system blow-ups. We propose an algorithm that certifies fast stabilization of the underlying system by effectively exploring the environment with an improved exploration strategy. We show that the proposed algorithm attains $\tilde{\mathcal{O}}(\sqrt{T})$ regret after $T$ time steps of agent-environment interaction. We also show that the regret of the proposed algorithm has only a polynomial dependence on the problem dimensions, an exponential improvement over prior methods. Our improved exploration method is simple yet efficient: it combines a sophisticated exploration policy in RL with an isotropic exploration strategy to achieve fast stabilization and improved regret. We empirically demonstrate that the proposed algorithm outperforms other popular methods on several adaptive control tasks.
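To make the abstract's exploration idea concrete, below is a minimal sketch of the general pattern it describes: drive a linear system $x_{t+1} = A x_t + B u_t + w_t$ with a feedback policy plus isotropic Gaussian excitation, then estimate the dynamics by least squares. This is not the paper's actual algorithm; the matrices `A` and `B`, the gain `K`, the dimensions, and the noise scales `sigma_u`, `sigma_w` are all illustrative assumptions.

```python
import numpy as np

# Sketch only: isotropic exploration of a linear dynamical system
# x_{t+1} = A x_t + B u_t + w_t, followed by least-squares system
# identification. All quantities below are assumed for illustration,
# not taken from the paper.

rng = np.random.default_rng(0)

n, d, T = 3, 2, 1000          # state dim, input dim, horizon (assumed)
A = 0.9 * np.eye(n)           # true dynamics, unknown to the learner
B = 0.1 * rng.standard_normal((n, d))
K = np.zeros((d, n))          # placeholder feedback gain; the paper's method
                              # would instead compute a stabilizing gain from
                              # model estimates as learning proceeds
sigma_u, sigma_w = 1.0, 0.1   # exploration / process-noise scales (assumed)

x = np.zeros(n)
states, inputs, next_states = [], [], []
for t in range(T):
    # Control input = feedback policy term + isotropic exploration noise.
    u = K @ x + sigma_u * rng.standard_normal(d)
    x_next = A @ x + B @ u + sigma_w * rng.standard_normal(n)
    states.append(x)
    inputs.append(u)
    next_states.append(x_next)
    x = x_next

# Least-squares estimate of [A B] from the excited trajectory. Isotropic
# inputs keep the regressors well-conditioned, which is the kind of data
# a learner needs to certify a stabilizing controller quickly.
Z = np.hstack([np.array(states), np.array(inputs)])      # (T, n + d)
Theta, *_ = np.linalg.lstsq(Z, np.array(next_states), rcond=None)
A_hat, B_hat = Theta[:n].T, Theta[n:].T
print(np.linalg.norm(A_hat - A), np.linalg.norm(B_hat - B))
```

The design point the sketch illustrates is the one the abstract makes: purely policy-driven inputs can leave some state directions unexcited, while adding an isotropic component guarantees informative data in every direction, enabling fast stabilization before the sophisticated exploration policy takes over.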