Item Tagging for Information Retrieval: A Tripartite Graph Neural Network based Approach

Paper Title

Item Tagging for Information Retrieval: A Tripartite Graph Neural Network based Approach

Authors

Mao, Kelong; Xiao, Xi; Zhu, Jieming; Lu, Biao; Tang, Ruiming; He, Xiuqiang

Abstract

Tagging has been recognized as a successful practice to boost relevance matching for information retrieval (IR), especially when items lack rich textual descriptions. Much research has addressed multi-label text categorization and image annotation. However, there is a lack of published work that targets item tagging specifically for IR. Directly applying a traditional multi-label classification model to item tagging is sub-optimal because it ignores characteristics unique to IR. In this work, we propose to formulate item tagging as a link prediction problem between item nodes and tag nodes. To enrich the representation of items, we leverage the query logs available in IR tasks and construct a query-item-tag tripartite graph. This formulation results in TagGNN, a model that uses heterogeneous graph neural networks with multiple types of nodes and edges. Unlike previous research, we also optimize both full tag prediction and partial tag completion in a unified framework via a primary-dual loss mechanism. Experimental results on both open and industrial datasets show that TagGNN outperforms state-of-the-art multi-label classification approaches.
