Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference

Paper Title

Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference

Authors

Ondřej Dušek, Zdeněk Kasner

Abstract

A major challenge in evaluating data-to-text (D2T) generation is measuring the semantic accuracy of the generated text, i.e. checking if the output text contains all and only facts supported by the input data. We propose a new metric for evaluating the semantic accuracy of D2T generation based on a neural model pretrained for natural language inference (NLI). We use the NLI model to check textual entailment between the input data and the output text in both directions, allowing us to reveal omissions or hallucinations. Input data are converted to text for NLI using trivial templates. Our experiments on two recent D2T datasets show that our metric can achieve high accuracy in identifying erroneous system outputs.
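
The two-direction check described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The names `triple_to_sentence`, `check_semantic_accuracy`, and `toy_entails` are hypothetical, the single template is an assumption, and `toy_entails` is a crude word-overlap stand-in for a pretrained NLI model so that the example runs without any model download. The sketch also assumes that omissions are checked fact by fact and hallucinations against all facts joined together, which the abstract does not spell out.

```python
import re
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
# Hypothetical interface: entails(premise, hypothesis) -> True if the
# premise entails the hypothesis. A real setup would wrap an NLI model here.
EntailFn = Callable[[str, str], bool]


def triple_to_sentence(triple: Triple) -> str:
    """Trivial template (assumed): 'subject predicate object.'"""
    subj, pred, obj = triple
    return f"{subj} {pred} {obj}."


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Stand-in for an NLI model: every word of the hypothesis must occur
    in the premise. Only for illustration; this is not real inference."""
    premise_words = set(re.findall(r"\w+", premise.lower()))
    hypothesis_words = set(re.findall(r"\w+", hypothesis.lower()))
    return hypothesis_words <= premise_words


def check_semantic_accuracy(
    triples: List[Triple], output_text: str, entails: EntailFn
) -> Dict[str, object]:
    facts = [triple_to_sentence(t) for t in triples]
    # Direction 1 (text -> data): a fact not entailed by the text is an omission.
    omitted = [fact for fact in facts if not entails(output_text, fact)]
    # Direction 2 (data -> text): text not entailed by the facts suggests
    # a hallucination.
    hallucination = not entails(" ".join(facts), output_text)
    return {
        "ok": not omitted and not hallucination,
        "omitted": omitted,
        "hallucination": hallucination,
    }


if __name__ == "__main__":
    data = [
        ("Blue Spice", "serves", "Italian food"),
        ("Blue Spice", "is located in", "the riverside"),
    ]
    print(check_semantic_accuracy(data, "Blue Spice serves Italian food.", toy_entails))
```

Passing the entailment check in as a function keeps the sketch independent of any particular NLI model. Replacing `toy_entails` with a wrapper around a pretrained classifier would leave the rest unchanged.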

Paper Title