Complementing Lexical Retrieval with Semantic Residual Embedding

Paper Title

Complementing Lexical Retrieval with Semantic Residual Embedding

Authors

Luyu Gao, Zhuyun Dai, Tongfei Chen, Zhen Fan, Benjamin Van Durme, Jamie Callan

Abstract

This paper presents CLEAR, a retrieval model that seeks to complement classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model. CLEAR explicitly trains the neural embedding to encode language structures and semantics that lexical retrieval fails to capture with a novel residual-based embedding learning method. Empirical evaluations demonstrate the advantages of CLEAR over state-of-the-art retrieval models, and that it can substantially improve the end-to-end accuracy and efficiency of reranking pipelines.
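The abstract does not spell out the residual-based objective, so the following is only a minimal sketch of one way such an idea could be written down. It assumes a pairwise hinge loss whose margin shrinks when the lexical model (e.g. BM25) already separates the relevant document from the non-relevant one, and a final score that is a weighted sum of the lexical and embedding scores. The function names, the `base_margin` and `lam` parameters, and their default values are all illustrative assumptions, not details taken from the abstract.

```python
def residual_margin(lex_pos, lex_neg, base_margin=1.0, lam=0.1):
    """Margin for the embedding model (assumed form).

    The larger the lexical score gap in favor of the relevant document,
    the smaller the margin the embedding model is asked to enforce.
    """
    return base_margin - lam * (lex_pos - lex_neg)


def residual_hinge_loss(emb_pos, emb_neg, lex_pos, lex_neg,
                        base_margin=1.0, lam=0.1):
    """Pairwise hinge loss on embedding scores with a lexical-aware margin."""
    margin = residual_margin(lex_pos, lex_neg, base_margin, lam)
    return max(0.0, margin - emb_pos + emb_neg)


def combined_score(lex_score, emb_score, lam=0.1):
    """Final retrieval score: weighted lexical score plus embedding score."""
    return lam * lex_score + emb_score


# Lexical model already ranks the pair correctly: no loss on the embedding.
easy = residual_hinge_loss(emb_pos=0.5, emb_neg=0.4, lex_pos=12.0, lex_neg=2.0)

# Lexical model ranks the pair wrongly: the embedding must make up the gap.
hard = residual_hinge_loss(emb_pos=0.5, emb_neg=0.4, lex_pos=2.0, lex_neg=12.0)
```

Under these assumptions the embedding model receives gradient mainly on query-document pairs that lexical matching gets wrong, which is one reading of "encode language structures and semantics that lexical retrieval fails to capture".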

Paper Title