Paper Title

Informed Sampling for Diversity in Concept-to-Text NLG

Paper Authors

Giulio Zhou, Gerasimos Lampouras

Paper Abstract

Deep-learning models for language generation tasks tend to produce repetitive output. Various methods have been proposed to encourage lexical diversity during decoding, but this often comes at a cost to the perceived fluency and adequacy of the output. In this work, we propose to ameliorate this cost by using an Imitation Learning approach to explore the level of diversity that a language generation model can reliably produce. Specifically, we augment the decoding process with a meta-classifier trained to distinguish which words at any given timestep will lead to high-quality output. We focus our experiments on concept-to-text generation where models are sensitive to the inclusion of irrelevant words due to the strict relation between input and output. Our analysis shows that previous methods for diversity underperform in this setting, while human evaluation suggests that our proposed method achieves a high level of diversity with minimal effect to the output's fluency and adequacy.
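The core idea of augmenting decoding with a meta-classifier can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `informed_sample`, the `meta_scores` array (per-token quality predictions from a hypothetical classifier), and the acceptance `threshold` are all assumptions introduced for the example. At each timestep, sampling is restricted to tokens the classifier predicts will lead to high-quality output; if no token passes, the sketch falls back to greedy decoding.

```python
import numpy as np

def informed_sample(logits, meta_scores, threshold=0.5, rng=None):
    """Sample the next token, restricted to tokens whose meta-classifier
    score is at least `threshold`. Falls back to greedy decoding when no
    token passes the filter.

    logits      : unnormalized model scores, shape (vocab_size,)
    meta_scores : per-token quality predictions in [0, 1], shape (vocab_size,)
    """
    if rng is None:
        rng = np.random.default_rng(0)
    mask = meta_scores >= threshold
    if not mask.any():
        # Nothing passes the filter: fall back to the most likely token.
        return int(np.argmax(logits))
    # Exclude rejected tokens, then renormalize and sample.
    masked = np.where(mask, logits, -np.inf)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))
```

The renormalized sampling over accepted tokens is what allows diversity (multiple candidates remain stochastic) while the classifier filter screens out words likely to harm fluency or adequacy.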
