Paper title
Investigating the use of Paraphrase Generation for Question Reformulation in the FRANK QA system
Paper authors
Paper abstract
We present a study of whether paraphrase generation methods can increase the variety of natural language questions that the FRANK Question Answering system can answer. We first evaluate paraphrase generation methods on the LC-QuAD 2.0 dataset using both automatic metrics and human judgement, and discuss how the two correlate. We also perform error analysis on the dataset using both automatic and manual approaches, and discuss how paraphrase generation and evaluation are affected by data points that contain errors. We then simulate integrating the best-performing paraphrase generation method (English-French backtranslation) into FRANK to test our original hypothesis on a small challenge dataset. Our two main conclusions are that LC-QuAD 2.0 requires cleaning, as the errors present can affect evaluation, and that, due to limitations of FRANK's parser, paraphrase generation is not a method we can rely on to improve the variety of natural language questions that FRANK can answer.