Paper Title
Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?
Paper Authors
Paper Abstract
We study the task of learning state representations from potentially high-dimensional observations, with the goal of controlling an unknown partially observable system. We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning (e.g., costs) without reconstructing the observations. In particular, we focus on an intuitive cost-driven state representation learning method for solving Linear Quadratic Gaussian (LQG) control, one of the most fundamental partially observable control problems. As our main results, we establish finite-sample guarantees of finding a near-optimal state representation function and a near-optimal controller using the directly learned latent model. To the best of our knowledge, despite various empirical successes, prior to this work it was unclear if such a cost-driven latent model learner enjoys finite-sample guarantees. Our work underscores the value of predicting multi-step costs, an idea that is key to our theory, and notably also an idea that is known to be empirically valuable for learning state representations.
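For readers unfamiliar with the setting, the following is a minimal sketch of the standard LQG problem and of what a cost-driven, multi-step objective could look like. The abstract gives no formulas, so all notation here is an assumption: the symbols $A, B, C, Q, R$, the noise covariances, the history $h_t$, the representation function $\phi$, the cost predictor $\hat{c}_\theta^{(k)}$, and the horizon $k$ are illustrative and are not taken from the paper.

```latex
% Standard partially observable linear-Gaussian system (textbook LQG form).
\begin{align*}
  x_{t+1} &= A x_t + B u_t + w_t, & w_t &\sim \mathcal{N}(0, \Sigma_w), \\
  y_t     &= C x_t + v_t,         & v_t &\sim \mathcal{N}(0, \Sigma_v), \\
  c_t     &= x_t^\top Q x_t + u_t^\top R u_t .
\end{align*}

% Illustrative cost-driven objective (an assumption, not the paper's exact loss):
% a latent state z_t = \phi(h_t) is computed from the history
% h_t = (y_0, u_0, \dots, u_{t-1}, y_t), and the latent model is fit by
% regressing observed k-step cumulative costs on predictions made from z_t,
% with no term that reconstructs the observations y_t.
\begin{equation*}
  \min_{\phi,\,\theta} \; \sum_{t}
  \Bigl( \sum_{\tau=t}^{t+k-1} c_\tau
         \;-\; \hat{c}_\theta^{(k)}\bigl(\phi(h_t),\, u_{t:t+k-1}\bigr) \Bigr)^{2}
\end{equation*}
```

In this sketch, setting $k = 1$ corresponds to predicting only the immediate cost, while $k > 1$ corresponds to the multi-step cost prediction that the abstract highlights as key to the theory.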