Paper Title
SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
Paper Authors
Paper Abstract
Most prior work in the sequence-to-sequence paradigm focused on datasets with input sequence lengths in the hundreds of tokens due to the computational constraints of common RNN and Transformer architectures. In this paper, we study long-form abstractive text summarization, a sequence-to-sequence setting with input sequence lengths up to 100,000 tokens and output sequence lengths up to 768 tokens. We propose SEAL, a Transformer-based model, featuring a new encoder-decoder attention that dynamically extracts/selects input snippets to sparsely attend to for each output segment. Using only the original documents and summaries, we derive proxy labels that provide weak supervision for extractive layers simultaneously with regular supervision from abstractive summaries. The SEAL model achieves state-of-the-art results on existing long-form summarization tasks, and outperforms strong baseline models on a new dataset/task we introduce, Search2Wiki, with much longer input text. Since content selection is explicit in the SEAL model, a desirable side effect is that the selection can be inspected for enhanced interpretability.
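As an informal illustration of the selection mechanism described in the abstract (not the authors' implementation), the sketch below shows how a decoder might score encoded input segments and sparsely cross-attend only to the top-scoring ones while generating each output segment. All names, shapes, and hyperparameters here (e.g. `SegmentSelector`, `top_k`, `d_model`) are assumptions for illustration.

```python
# Illustrative sketch only: segment-wise extract-then-attend, loosely
# following the abstract's description. Names and dimensions are
# hypothetical, not taken from the SEAL paper's code.
import torch
import torch.nn as nn


class SegmentSelector(nn.Module):
    """Scores encoded input segments and keeps the top-k for the decoder
    to attend to while generating one output segment."""

    def __init__(self, d_model: int, top_k: int = 8):
        super().__init__()
        # Scores (segment representation, decoder state) pairs.
        self.scorer = nn.Linear(2 * d_model, 1)
        self.top_k = top_k

    def forward(self, segment_reprs: torch.Tensor, decoder_state: torch.Tensor):
        # segment_reprs: (num_segments, d_model) pooled encodings of input snippets
        # decoder_state: (d_model,) representation of the output segment so far
        expanded = decoder_state.unsqueeze(0).expand_as(segment_reprs)
        scores = self.scorer(torch.cat([segment_reprs, expanded], dim=-1)).squeeze(-1)
        top_scores, top_idx = scores.topk(min(self.top_k, scores.size(0)))
        # The decoder would then cross-attend only to the selected segments,
        # keeping attention cost independent of the full input length.
        return top_idx, torch.softmax(top_scores, dim=-1)


# Usage: select segments before decoding each output chunk.
selector = SegmentSelector(d_model=512, top_k=4)
segments = torch.randn(200, 512)   # e.g. 200 encoded input snippets
dec_state = torch.randn(512)       # current output-segment representation
idx, weights = selector(segments, dec_state)
print(idx.shape, weights.shape)    # torch.Size([4]) torch.Size([4])
```

In this reading, the proxy labels mentioned in the abstract would supply a weak supervision signal for the selector's scores, trained jointly with the usual cross-entropy loss on the abstractive summary.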