e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language Explanations

Paper Title

e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language Explanations

Authors

Virginie Do, Oana-Maria Camburu, Zeynep Akata, Thomas Lukasiewicz

Abstract

The recently proposed SNLI-VE corpus for recognising visual-textual entailment is a large, real-world dataset for fine-grained multimodal reasoning. However, the automatic way in which SNLI-VE has been assembled (via combining parts of two related datasets) gives rise to a large number of errors in the labels of this corpus. In this paper, we first present a data collection effort to correct the class with the highest error rate in SNLI-VE. Secondly, we re-evaluate an existing model on the corrected corpus, which we call SNLI-VE-2.0, and provide a quantitative comparison with its performance on the non-corrected corpus. Thirdly, we introduce e-SNLI-VE, which appends human-written natural language explanations to SNLI-VE-2.0. Finally, we train models that learn from these explanations at training time, and output such explanations at testing time.
