Paper Title
Transferred Q-learning
Paper Authors
Paper Abstract
We consider $Q$-learning with knowledge transfer, using samples from a target reinforcement learning (RL) task as well as source samples from different but related RL tasks. We propose transfer learning algorithms for both batch and online $Q$-learning with offline source studies. The proposed transferred $Q$-learning algorithm contains a novel re-targeting step that enables vertical information cascading across multiple steps of an RL task, in addition to the usual horizontal information gathering of transfer learning (TL) for supervised learning. We establish the first theoretical justifications of TL in RL tasks by showing a faster rate of convergence of the $Q$-function estimation in the offline RL transfer, and a lower regret bound in the offline-to-online RL transfer, under certain similarity assumptions. Empirical evidence from both synthetic and real datasets is presented to support the proposed algorithm and our theoretical results.
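To make the "pool, then re-target" idea in the abstract concrete, below is a minimal sketch of one backward-induction step of batch $Q$-learning with linear function approximation and transfer. It is an illustration under assumptions, not the paper's exact estimator: the function names (fit_q_weights, transferred_q_step), the ridge penalties, and the simple sample pooling are all hypothetical choices. The pooled fit plays the role of horizontal information gathering across tasks; the target-only correction plays the role of the re-targeting step.

```python
import numpy as np

def fit_q_weights(Phi, y, lam=1e-3):
    """Ridge regression: minimize ||Phi w - y||^2 + lam ||w||^2."""
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

def transferred_q_step(Phi_src, y_src, Phi_tgt, y_tgt, lam=1e-3):
    """One backward-induction step of transferred Q-learning (sketch).

    Phi_*: feature matrices of (state, action) pairs for the source and
    target tasks; y_*: regression targets r + gamma * max_a' Qhat(s', a')
    built from the Q estimate of the step that follows this one.
    """
    # Horizontal transfer: pool source and target samples to get a
    # low-variance pilot estimate of the Q-function weights.
    w_pool = fit_q_weights(np.vstack([Phi_src, Phi_tgt]),
                           np.concatenate([y_src, y_tgt]), lam)
    # Re-targeting: fit a bias correction on target residuals only.
    # A heavier penalty keeps the correction small when the source and
    # target tasks are similar (the similarity assumption above).
    delta = fit_q_weights(Phi_tgt, y_tgt - Phi_tgt @ w_pool, 10 * lam)
    return w_pool + delta
```

Running this routine backward over the steps of a finite-horizon task, with each step's targets y_* constructed from the estimate returned by the step after it, is what lets information cascade vertically: source data improves the $Q$ estimate at the last step, which sharpens the regression targets at the second-to-last step, and so on.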