Paper Title

Learning from Atypical Behavior: Temporary Interest Aware Recommendation Based on Reinforcement Learning

Authors

Ziwen Du, Ning Yang, Zhonghua Yu, Philip S. Yu

Abstract

Traditional robust recommendation methods treat atypical user-item interactions as noise and aim to reduce their impact through noise filtering, an approach that faces two challenges. First, in the real world, atypical interactions may signal a temporary interest that differs from the user's general preference. Simply filtering out atypical interactions as noise may therefore be inappropriate and may degrade the personalization of recommendations. Second, temporary interest is hard to capture because there are no explicit supervision signals indicating whether an interaction is atypical. To address these challenges, we propose a novel model called Temporary Interest Aware Recommendation (TIARec), which distinguishes atypical interactions from normal ones without supervision and captures users' temporary interest as well as their general preference. In particular, we propose a reinforcement learning framework containing a recommender agent and an auxiliary classifier agent, which are jointly trained with the objective of maximizing the cumulative return of the recommendations made by the recommender agent. During joint training, the classifier agent judges whether the interaction with an item recommended by the recommender agent is atypical, and the knowledge about learning temporary interest from atypical interactions is transferred to the recommender agent. As a result, the recommender agent can, on its own, make recommendations that balance the general preference and temporary interest of users. Finally, experiments conducted on real-world datasets verify the effectiveness of TIARec.
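The training objective named in the abstract, the cumulative return of the recommender agent's recommendations, can be illustrated with a minimal sketch. The abstract does not state how rewards are defined or whether they are discounted, so the 0/1 rewards, the discount factor `gamma`, and the function name `cumulative_return` below are assumptions for illustration only, not the paper's formulation.

```python
from typing import Sequence


def cumulative_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Discounted sum of per-step rewards: r_0 + gamma*r_1 + gamma^2*r_2 + ...

    The discount factor is an assumption; the abstract does not specify one.
    """
    total = 0.0
    # Work backwards so each step needs only one multiply and one add.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical episode: 1.0 means the recommended item was accepted, 0.0 means it was not.
episode_rewards = [1.0, 0.0, 1.0]
episode_return = cumulative_return(episode_rewards, gamma=0.5)
print(episode_return)
```

In a setup like the one the abstract describes, both agents would be updated to increase this quantity, which is what ties the classifier agent's atypical/normal judgments to recommendation quality rather than to labeled data.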
