Paper Title
Generative Inverse Deep Reinforcement Learning for Online Recommendation
Authors
Abstract
Deep reinforcement learning enables an agent to capture users' interests dynamically through interactions with the environment, and it has attracted considerable interest in recommendation research. Deep reinforcement learning uses a reward function to learn users' interests and to control the learning process. However, most reward functions are manually designed; they are either unrealistic or too imprecise to reflect the high variety, dimensionality, and non-linearity of the recommendation problem. This makes it difficult for the agent to learn an optimal policy that generates the most satisfactory recommendations. To address this issue, we propose a novel generative inverse reinforcement learning approach, InvRec, which automatically extracts the reward function from users' behaviors for online recommendation. We conduct experiments on an online platform, VirtualTB, and compare against several state-of-the-art methods to demonstrate the feasibility and effectiveness of the proposed approach.