Paper Title

Model Embedding Model-Based Reinforcement Learning

Authors

Xiaoyu Tan, Chao Qu, Junwu Xiong, James Zhang

Abstract

Model-based reinforcement learning (MBRL) has shown its advantage in sample efficiency over model-free reinforcement learning (MFRL). Despite the impressive results it achieves, it still faces a trade-off between the ease of data generation and model bias. In this paper, we propose a simple and elegant model-embedding model-based reinforcement learning (MEMB) algorithm in the framework of probabilistic reinforcement learning. To balance sample efficiency and model bias, we exploit both real and imaginary data in training. In particular, we embed the model in the policy update and learn the $Q$ and $V$ functions from the real data set. We provide a theoretical analysis of MEMB under Lipschitz continuity assumptions on the model and the policy. Finally, we evaluate MEMB on several benchmarks and demonstrate that our algorithm can achieve state-of-the-art performance.
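The abstract's core idea — fit a dynamics model on real transitions, then embed that model in the policy update while value estimation stays on real data — can be illustrated with a toy sketch. This is not the authors' implementation: the 1D environment, the proportional policy, and the finite-difference update are all illustrative assumptions, and the learned $Q$/$V$ functions of MEMB are simplified here to a one-step imaginary return.

```python
import random

random.seed(0)

def env_step(s, a):
    """Toy 1D environment (assumption): move toward the origin for reward."""
    s_next = s + a + random.gauss(0, 0.01)
    return s_next, -abs(s_next)

class LearnedModel:
    """Simple learned dynamics model: predicts next state as s + a + bias."""
    def __init__(self):
        self.bias = 0.0
    def fit(self, data, lr=0.5):
        # Fit the model on real transitions only.
        for s, a, s_next, _ in data:
            err = (s + a + self.bias) - s_next
            self.bias -= lr * err
    def step(self, s, a):
        # Imaginary transition; the toy reward function is assumed known.
        s_next = s + a + self.bias
        return s_next, -abs(s_next)

def policy(s, theta):
    return -theta * s  # proportional controller; optimal near theta = 1

theta, real_data = 0.5, []
model = LearnedModel()
for _ in range(50):
    # Collect a real transition with the current policy.
    s = random.uniform(-1, 1)
    a = policy(s, theta)
    s_next, r = env_step(s, a)
    real_data.append((s, a, s_next, r))
    model.fit(real_data[-10:])  # model trained on real data
    # Policy update *through the model*: finite-difference gradient of the
    # imaginary one-step return with respect to the policy parameter.
    eps = 1e-3
    r_plus = model.step(s, policy(s, theta + eps))[1]
    r_minus = model.step(s, policy(s, theta - eps))[1]
    theta += 0.1 * (r_plus - r_minus) / (2 * eps)
```

The sketch shows the separation the abstract describes: the model is updated only from real transitions, while the policy gradient is propagated through imaginary model rollouts, so no model-generated data contaminates the (here elided) value estimates.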
