Paper title
Abstractive and mixed summarization for long-single documents
Paper authors
Paper abstract
The lack of diversity in the datasets available for automatic document summarization means that the vast majority of neural summarization models have been trained on news articles. These datasets are relatively small, with an average document size of about 600 words, so models trained on them see their performance limited to short documents. To overcome this problem, this paper uses scientific papers as the dataset on which different models are trained. The models were chosen based on their performance on the CNN/Daily Mail dataset, selecting the highest-ranked model of each architectural variant. In this work, six models are compared: two with an RNN architecture, one with a CNN architecture, two with a Transformer architecture, and one with a Transformer architecture combined with reinforcement learning. The results of this work show that the models that use a hierarchical encoder to model the structure of the document perform better than the rest.