Paper Title

Sparse, Dense, and Attentional Representations for Text Retrieval

Paper Authors

Yi Luan, Jacob Eisenstein, Kristina Toutanova, Michael Collins

Paper Abstract

Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors, scoring each document by its inner product with the query. We investigate the capacity of this architecture relative to sparse bag-of-words models and attentional neural networks. Using both theoretical and empirical analysis, we establish connections between the encoding dimension, the margin between gold and lower-ranked documents, and the document length, suggesting limitations in the capacity of fixed-length encodings to support precise retrieval of long documents. Building on these insights, we propose a simple neural model that combines the efficiency of dual encoders with some of the expressiveness of more costly attentional architectures, and explore sparse-dense hybrids to capitalize on the precision of sparse retrieval. These models outperform strong alternatives in large-scale retrieval.
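To make the dual-encoder scoring concrete, here is a minimal sketch of inner-product retrieval over precomputed document vectors, plus a simple sparse-dense hybrid formed as a weighted sum of the dense score and a precomputed bag-of-words score. The encoder outputs, the dimension `d`, the mixing weight `lam`, and the stand-in sparse scores are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def dense_scores(query_vec: np.ndarray, doc_matrix: np.ndarray) -> np.ndarray:
    """Dual-encoder scoring: inner product between a query vector of shape (d,)
    and a matrix of precomputed document vectors of shape (num_docs, d)."""
    return doc_matrix @ query_vec

def hybrid_scores(query_vec, doc_matrix, sparse_scores, lam=0.5):
    """Sparse-dense hybrid (illustrative): weighted sum of the dense
    inner-product score and a precomputed sparse (e.g., BM25-style) score."""
    return lam * dense_scores(query_vec, doc_matrix) + (1.0 - lam) * sparse_scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, num_docs = 128, 1000                 # encoding dimension and corpus size (assumed)
    docs = rng.standard_normal((num_docs, d))   # stand-in for encoded documents
    query = rng.standard_normal(d)              # stand-in for the encoded query
    sparse = rng.random(num_docs)               # stand-in for bag-of-words scores
    top10 = np.argsort(-hybrid_scores(query, docs, sparse))[:10]
    print(top10)
```

In practice the document vectors would come from the trained document encoder and the sparse scores from an inverted-index retriever; the weighted-sum combination shown here is one simple way to realize the sparse-dense hybrid described in the abstract.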
