Memory-efficient Embedding for Recommendations

Paper title

Memory-efficient Embedding for Recommendations

Authors

Xiangyu Zhao, Haochen Liu, Hui Liu, Jiliang Tang, Weiwei Guo, Jun Shi, Sida Wang, Huiji Gao, Bo Long

Abstract

Practical large-scale recommender systems usually contain thousands of feature fields from users, items, contextual information, and their interactions. Most of them empirically allocate a unified dimension to all feature fields, which is memory inefficient. It is therefore highly desirable to assign different embedding dimensions to different feature fields according to their importance and predictability. Because of the large number of feature fields and the nuanced relationships between embedding dimensions, feature distributions, and neural network architectures, manually allocating embedding dimensions in practical recommender systems can be very difficult. To this end, we propose an AutoML-based framework (AutoDim) in this paper, which can automatically select dimensions for different feature fields in a data-driven fashion. Specifically, we first propose an end-to-end differentiable framework that can calculate the weights over various dimensions for feature fields in a soft and continuous manner with an AutoML-based optimization algorithm; then we derive a hard and discrete embedding component architecture according to the maximal weights and retrain the whole recommender framework. We conduct extensive experiments on benchmark datasets to validate the effectiveness of the AutoDim framework.
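
The two-stage idea in the abstract (soft weights over candidate dimensions, then a hard choice by maximal weight) can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate sizes in `CANDIDATE_DIMS`, the field names, the logit values, and the use of a plain softmax are all assumptions made for illustration, and the search stage that would actually learn the logits is omitted.

```python
import math

# Hypothetical candidate embedding sizes; the abstract does not list the real ones.
CANDIDATE_DIMS = [2, 8, 16]


def softmax(logits):
    """Turn raw scores into soft, continuous weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_dimensions(field_logits):
    """For each feature field, keep the candidate dimension with the largest weight."""
    chosen = {}
    for field, logits in field_logits.items():
        weights = softmax(logits)
        best = max(range(len(weights)), key=weights.__getitem__)
        chosen[field] = CANDIDATE_DIMS[best]
    return chosen


# Made-up logits standing in for values that a search stage would learn.
field_logits = {
    "user_id": [0.1, 0.4, 1.2],
    "item_category": [0.9, 0.3, -0.5],
}

selected = select_dimensions(field_logits)
print(selected)
```

In this toy setup each field ends up with its own embedding size, which is the hard, discrete architecture that the abstract says is then retrained from scratch.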
