Paper Title

Attend to the beginning: A study on using bidirectional attention for extractive summarization

Paper Authors

Ahmed Magooda, Cezary Marcjan

Paper Abstract

Forum discussion data differ in both structure and properties from generic forms of textual data such as news. Hence, summarization techniques should, in turn, make use of such differences, and models should be crafted to benefit from the structural nature of discussion data. In this work, we propose attending to the beginning of a document to improve the performance of extractive summarization models when applied to forum discussion data. Evaluations demonstrate that, with the help of a bidirectional attention mechanism, attending to the beginning of a document (the initial comment/post) in a discussion thread introduces a consistent boost in ROUGE scores, as well as new state-of-the-art (SOTA) ROUGE scores on the forum discussions dataset. Additionally, we explore whether this hypothesis extends to other generic forms of textual data. We make use of the tendency of texts to introduce important information early by attending to the first few sentences of generic textual data. Evaluations demonstrate that attending to introductory sentences using bidirectional attention improves the performance of extractive summarization models even when applied to more generic forms of textual data.
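The abstract does not spell out the mechanism, so the following is only a minimal NumPy sketch of the general idea: each candidate sentence exchanges scaled dot-product attention with the document's opening sentences (the initial post) in both directions, and the resulting context vectors are concatenated onto the sentence representation before extractive scoring. The function name `attend_to_beginning`, the parameter `n_begin`, and the concatenation scheme are all illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend_to_beginning(sents, n_begin=3):
    """Augment sentence vectors with bidirectional attention over the
    first `n_begin` sentences (the document beginning / initial post).

    sents: (n, d) array of sentence embeddings, in document order.
    Returns an (n, 3d) array: [sentence; sentence->beginning context;
    beginning->sentence context]. Hypothetical sketch, not the paper's model.
    """
    begin = sents[:n_begin]                      # (k, d) beginning sentences
    scale = np.sqrt(sents.shape[1])              # scaled dot-product attention
    scores = sents @ begin.T / scale             # (n, k) similarity scores

    # Forward: each sentence attends over the beginning sentences.
    ctx_fwd = softmax(scores, axis=1) @ begin    # (n, d)

    # Backward: each beginning sentence attends over all sentences; each
    # sentence then receives the beginning vectors weighted by how strongly
    # they attended to it.
    a_bwd = softmax(scores.T, axis=1)            # (k, n)
    ctx_bwd = a_bwd.T @ begin                    # (n, d)

    return np.concatenate([sents, ctx_fwd, ctx_bwd], axis=1)
```

In an actual extractive model these augmented vectors would feed a per-sentence classifier; setting `n_begin` to the length of the initial comment recovers the forum-thread case, while a small fixed `n_begin` mirrors the paper's extension to generic text.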
