Paper Title
RNE: A Scalable Network Embedding for Billion-scale Recommendation
Paper Authors
Abstract
Designing a real-world recommendation system is a critical problem for both academia and industry. However, because of the huge number of users and items and the diverse, dynamic nature of user interests, designing a scalable recommendation system that efficiently produces effective and diverse recommendations at billion scale remains a challenging open problem for existing methods. In this paper, given the user-item interaction graph, we propose RNE, a data-efficient Recommendation-based Network Embedding method, to recommend personalized and diverse items to users. Specifically, we propose a diversity- and dynamics-aware neighbor sampling method for network embedding. On the one hand, the method preserves the local structure between users and items while modeling the diversity and dynamics of user interests to improve recommendation quality. On the other hand, the sampling method theoretically reduces the complexity of the whole method, making billion-scale recommendation feasible. We also implement the algorithm in a distributed way to further improve its scalability. Experimentally, we deploy RNE in a recommendation scenario of Taobao, the largest e-commerce platform in China, and train it on a billion-scale user-item graph. As shown by several online metrics in A/B testing, RNE achieves both high-quality and diverse results compared with CF-based methods. We also conduct offline experiments on the Pinterest dataset, comparing against several state-of-the-art recommendation methods and network embedding methods. The results demonstrate that our method produces good results while running much faster than the baseline methods.