Paper Title

Continual Transfer Learning for Cross-Domain Click-Through Rate Prediction at Taobao

Authors

Liu, Lixin; Wang, Yanling; Wang, Tianming; Guan, Dong; Wu, Jiawei; Chen, Jingxu; Xiao, Rong; Zhu, Wenxiang; Fang, Fei

Abstract

As one of the largest e-commerce platforms in the world, Taobao runs recommender systems (RSs) that serve the shopping needs of hundreds of millions of customers. Click-through rate (CTR) prediction is a core component of these systems. A defining characteristic of CTR prediction at Taobao is that there are multiple recommendation domains whose scales vary significantly. It is therefore crucial to perform cross-domain CTR prediction, transferring knowledge from large domains to small domains to alleviate data sparsity. However, existing cross-domain CTR prediction methods are designed for static knowledge transfer and ignore the fact that all domains in real-world RSs evolve continually over time. In light of this, we present a necessary but novel task named Continual Transfer Learning (CTL), which transfers knowledge from a time-evolving source domain to a time-evolving target domain. In this work, we propose a simple and effective CTL model called CTNet to solve the problem of continual cross-domain CTR prediction at Taobao; CTNet can also be trained efficiently. In particular, CTNet takes into account an important characteristic of industrial systems: models have been continually trained for a very long time. CTNet therefore aims to fully utilize all the well-trained model parameters in both the source domain and the target domain to avoid losing historically acquired knowledge, and it needs only incremental target-domain data for training, which guarantees efficiency. Extensive offline experiments and online A/B testing at Taobao demonstrate the efficiency and effectiveness of CTNet. CTNet is now deployed online in the recommender systems of Taobao, serving the main traffic of hundreds of millions of active users.
