Paper Title

Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation

Authors

Liu, Dayiheng; Gong, Yeyun; Fu, Jie; Liu, Wei; Yan, Yu; Shao, Bo; Jiang, Daxin; Lv, Jiancheng; Duan, Nan

Abstract

News headline generation aims to produce a short sentence that attracts readers to read the news. A news article often contains multiple keyphrases that are of interest to different users, so it can naturally have multiple reasonable headlines. However, most existing methods focus on single-headline generation. In this paper, we propose generating multiple headlines with keyphrases of user interest: the main idea is to first generate multiple keyphrases of interest to users for the news article, and then generate multiple keyphrase-relevant headlines. We propose a multi-source Transformer decoder, which takes three sources as inputs: (a) the keyphrase, (b) the keyphrase-filtered article, and (c) the original article, to generate keyphrase-relevant, high-quality, and diverse headlines. Furthermore, we propose a simple and effective method to mine the keyphrases of interest in the news article and build the first large-scale keyphrase-aware news headline corpus, which contains over 180K aligned triples of $\langle$news article, headline, keyphrase$\rangle$. Extensive experimental comparisons on the real-world dataset show that the proposed method achieves state-of-the-art results in terms of quality and diversity.
