Paper title
From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information
Paper authors
Paper abstract
Text summarization is the research area that aims to create a short, condensed version of an original document that conveys the document's main idea in a few words. The topic has begun to attract the attention of a large community of researchers and is now counted among the most promising research areas. In general, text summarization algorithms take a plain-text document as input and output a summary. In real-world applications, however, most data is not in plain-text format. Instead, there is a wide variety of manifold information to be summarized, such as a web page summarized with respect to a search-engine query, extremely long documents (e.g., academic papers), dialog histories, and so on. In this paper, we survey these new summarization tasks and approaches in real-world applications.