Paper Title

Improving Passage Retrieval with Zero-Shot Question Generation

Paper Authors

Devendra Singh Sachan, Mike Lewis, Mandar Joshi, Armen Aghajanyan, Wen-tau Yih, Joelle Pineau, Luke Zettlemoyer

Paper Abstract

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.
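
To make the scoring step concrete, below is a minimal sketch in Python of the re-ranking idea using HuggingFace Transformers: each candidate passage is scored by the average log-likelihood of the question tokens conditioned on that passage under an off-the-shelf pretrained seq2seq language model, and passages are then sorted by this score. The model name (google/t5-large-lm-adapt), the prompt wording, and the rerank helper are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch: re-score retrieved passages by log P(question | passage)
# under a pretrained seq2seq LM, with no task-specific training.
# Model choice and prompt wording below are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/t5-large-lm-adapt"  # any pretrained LM could be substituted
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).eval()


@torch.no_grad()
def rerank(question: str, passages: list[str]) -> list[tuple[str, float]]:
    """Sort passages by the mean log-likelihood of the question given each passage."""
    scored = []
    for passage in passages:
        prompt = f"Passage: {passage} Please write a question based on this passage."
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
        labels = tokenizer(question, return_tensors="pt").input_ids
        # The seq2seq loss is the mean token-level negative log-likelihood of the
        # question conditioned on the passage prompt; negate it to get a score.
        loss = model(**inputs, labels=labels).loss
        scored.append((passage, -loss.item()))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Usage: re-rank candidates returned by any first-stage retriever (BM25, DPR, ...).
candidates = ["Paris is the capital of France.", "The Nile is a river in Africa."]
print(rerank("What is the capital of France?", candidates))
```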
