Paper Title

Look at the First Sentence: Position Bias in Question Answering

Authors

Miyoung Ko, Jinhyuk Lee, Hyunjae Kim, Gangwoo Kim, Jaewoo Kang

Abstract

Many extractive question answering models are trained to predict start and end positions of answers. The choice of predicting answers as positions is mainly due to its simplicity and effectiveness. In this study, we hypothesize that when the distribution of the answer positions is highly skewed in the training set (e.g., answers lie only in the k-th sentence of each passage), QA models predicting answers as positions can learn spurious positional cues and fail to give answers in different positions. We first illustrate this position bias in popular extractive QA models such as BiDAF and BERT and thoroughly examine how position bias propagates through each layer of BERT. To safely deliver position information without position bias, we train models with various de-biasing methods including entropy regularization and bias ensembling. Among them, we found that using the prior distribution of answer positions as a bias model is very effective at reducing position bias, recovering the performance of BERT from 37.48% to 81.64% when trained on a biased SQuAD dataset.
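The de-biasing approach the abstract highlights — using the prior distribution of answer positions as a bias model in an ensemble — can be illustrated with a small product-of-experts sketch. The idea: during training, the QA model's logits are combined with the (fixed) log-prior over answer positions, so the gradient no longer rewards the model for re-learning the positional skew; at test time the model is used alone. The function names and toy numbers below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def bias_product_logits(model_logits, bias_log_prior):
    # Product-of-experts ensemble: adding the log prior over answer
    # positions is equivalent to multiplying the two distributions.
    return model_logits + bias_log_prior

def debiased_loss(model_logits, bias_log_prior, gold_pos):
    # Cross-entropy on the ensembled distribution; the fixed bias term
    # absorbs the positional skew, so the model is not pushed to learn it.
    p = softmax(bias_product_logits(model_logits, bias_log_prior))
    return -np.log(p[gold_pos])

# Toy example: 3 candidate positions, a prior heavily skewed toward
# position 0 (as when answers lie mostly in the first sentence).
model_logits = np.array([2.0, 0.0, -1.0])
bias_log_prior = np.log(np.array([0.7, 0.2, 0.1]))
loss = debiased_loss(model_logits, bias_log_prior, gold_pos=0)
```

At inference, only `softmax(model_logits)` would be used, dropping the bias term — this is the standard way such ensembles are evaluated, and matches the abstract's goal of delivering position information without position bias.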
