Paper Title

Narrative Incoherence Detection

Paper Authors

Deng Cai, Yizhe Zhang, Yichen Huang, Wai Lam, Bill Dolan

Abstract

We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding: given a multi-sentence narrative, decide whether there exist any semantic discrepancies in the narrative flow. Specifically, we focus on missing sentence and discordant sentence detection. Despite its simple setup, this task is challenging, as the model needs to understand and analyze a multi-sentence narrative and predict incoherence at the sentence level. As an initial step towards this task, we implement several baselines that either directly analyze the raw text (token-level) or analyze learned sentence representations (sentence-level). We observe that while token-level modeling performs better when the input contains fewer sentences, sentence-level modeling performs better on longer narratives and possesses an advantage in efficiency and flexibility. Pre-training on large-scale data and an auxiliary sentence prediction training objective further boost the detection performance of the sentence-level model.
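To make the sentence-level setup concrete, here is a toy sketch of discordant sentence detection. It is not the paper's model: instead of learned sentence representations, it uses a hypothetical bag-of-words encoder and flags the sentence least similar to the rest of the narrative.

```python
from collections import Counter
import math

def embed(sentence):
    # Toy stand-in for a learned sentence encoder: bag-of-words counts.
    return Counter(sentence.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def discordant_index(sentences):
    # Score each sentence against the aggregate of the remaining sentences;
    # the least similar one is the discordance candidate.
    embs = [embed(s) for s in sentences]
    scores = []
    for i, emb in enumerate(embs):
        context = Counter()
        for j, other in enumerate(embs):
            if j != i:
                context.update(other)
        scores.append(cosine(emb, context))
    return min(range(len(sentences)), key=scores.__getitem__)

narrative = [
    "the knight rode into the forest",
    "the knight found a hidden castle in the forest",
    "quarterly revenue grew by ten percent",
    "inside the castle the knight met a dragon",
]
print(discordant_index(narrative))  # flags the off-topic sentence: 2
```

A real sentence-level baseline would replace the bag-of-words encoder with learned sentence embeddings and the similarity heuristic with a trained per-sentence classifier, but the pipeline shape (encode each sentence, then predict incoherence per position) is the same.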
