Paper Title

Rethinking Collaborative Metric Learning: Toward an Efficient Alternative without Negative Sampling

Paper Authors

Shilong Bao, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

Paper Abstract

The recently proposed Collaborative Metric Learning (CML) paradigm has attracted wide interest in the area of recommendation systems (RS) owing to its simplicity and effectiveness. Typically, the existing CML literature depends largely on the negative sampling strategy to alleviate the time-consuming burden of pairwise computation. However, in this work, through a theoretical analysis, we find that negative sampling leads to a biased estimate of the generalization error. Specifically, we show that sampling-based CML introduces a bias term into the generalization bound, quantified by the per-user Total Variation (TV) distance between the distribution induced by negative sampling and the ground-truth distribution. This suggests that optimizing the sampling-based CML loss function does not ensure a small generalization error, even with sufficiently large training data. Moreover, we show that the bias term vanishes without the negative sampling strategy. Motivated by this, we propose an efficient alternative to CML that requires no negative sampling, named Sampling-Free Collaborative Metric Learning (SFCML), to get rid of the sampling bias in a practical sense. Finally, comprehensive experiments over seven benchmark datasets demonstrate the superiority of the proposed algorithm.
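To make the contrast in the abstract concrete, below is a minimal NumPy sketch (our own illustration, not the authors' released code) of the pairwise hinge loss commonly used in CML, estimated two ways: once over a few randomly sampled negatives per positive pair, and once over all non-interacted items, which corresponds to the sampling-free estimate the paper advocates. All names (`d2`, `hinge`, `margin`, the toy data) are hypothetical. For reference, the Total Variation distance that quantifies the bias term between two discrete distributions P and Q is TV(P, Q) = ½ Σ_j |P(j) − Q(j)|; it is zero exactly when the sampling distribution matches the ground truth.

```python
# A toy sketch (assumed names, not the paper's API) contrasting the
# sampling-based and sampling-free estimates of the CML pairwise loss
# for a single user.
import numpy as np

rng = np.random.default_rng(0)
n_items, dim, margin = 100, 8, 1.0

U = rng.normal(size=dim)                 # one user's embedding
V = rng.normal(size=(n_items, dim))      # item embeddings
pos = {3, 17, 42}                        # items the user interacted with
neg = np.array([j for j in range(n_items) if j not in pos])

def d2(u, v):
    """Squared Euclidean distance, the metric used by CML-style models."""
    return np.sum((u - v) ** 2, axis=-1)

def hinge(i, js):
    """Pairwise hinge: negatives js should sit at least `margin` farther
    from the user than positive item i; averaged over the negatives."""
    return np.maximum(0.0, margin + d2(U, V[i]) - d2(U, V[js])).mean()

# Sampling-based estimate: a handful of random negatives per positive pair.
sampled = np.mean([hinge(i, rng.choice(neg, size=5)) for i in pos])

# Sampling-free estimate: average over *all* non-interacted items, which is
# the quantity the paper argues is free of the sampling-induced bias term.
full = np.mean([hinge(i, neg) for i in pos])

print(f"sampled estimate: {sampled:.4f}   full (sampling-free): {full:.4f}")
```

The sketch only illustrates the two estimators; the paper's actual contribution is an acceleration scheme that makes the full, all-negatives loss tractable, not the naive full enumeration shown here.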
