Investigating Label Bias in Beam Search for Open-ended Text Generation

Paper Title

Investigating Label Bias in Beam Search for Open-ended Text Generation

Authors

Wang, Liang; Liu, Jinlong; Liu, Jingming

Abstract

Beam search is an effective and widely used decoding algorithm in many sequence-to-sequence (seq2seq) text generation tasks. In open-ended text generation, however, beam search is often found to produce repetitive and generic text, and sampling-based decoding algorithms such as top-k sampling and nucleus sampling are generally preferred. Standard seq2seq models suffer from label bias because of their locally normalized probability formulation. This paper provides empirical evidence that label bias is a major reason for these degenerate behaviors of beam search. By combining locally normalized maximum likelihood estimation with globally normalized sequence-level training, label bias can be reduced with almost no sacrifice in perplexity. To measure label bias quantitatively, we test the model's ability to discriminate between the ground-truth text and a set of context-agnostic distractors. We conduct experiments on large-scale response generation datasets. Results show that, with our approach, beam search produces more diverse and meaningful text according to both automatic and human evaluation metrics. Our analysis also suggests several directions for future work toward the grand challenge of open-ended text generation.
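The distractor test mentioned in the abstract can be pictured with a small sketch. The abstract does not give the exact metric or how distractors are chosen, so everything below is an assumption made for illustration: the function name `distractor_accuracy`, the two toy scorers, and the example data are all hypothetical. The idea shown is only that a scorer which ignores the context tends to rank generic distractors above the ground-truth response.

```python
from typing import Callable, Sequence, Tuple


def distractor_accuracy(
    score: Callable[[str, str], float],
    examples: Sequence[Tuple[str, str]],
    distractors: Sequence[str],
) -> float:
    """Fraction of (context, ground truth) pairs where the ground truth
    scores strictly higher than every context-agnostic distractor.

    Hypothetical metric for illustration; not taken from the paper.
    """
    if not examples:
        raise ValueError("examples must not be empty")
    wins = 0
    for context, truth in examples:
        truth_score = score(context, truth)
        if all(truth_score > score(context, d) for d in distractors):
            wins += 1
    return wins / len(examples)


def length_biased_score(context: str, response: str) -> float:
    # Toy stand-in for a biased scorer: ignores the context entirely
    # and simply prefers shorter responses.
    return -float(len(response.split()))


def overlap_score(context: str, response: str) -> float:
    # Toy stand-in for a context-aware scorer: counts shared words.
    return float(len(set(context.split()) & set(response.split())))


examples = [
    ("do you like jazz music", "yes jazz music is great"),
    ("where is the train station", "the train station is downtown"),
]
distractors = ["i do not know", "ok"]

biased_acc = distractor_accuracy(length_biased_score, examples, distractors)
aware_acc = distractor_accuracy(overlap_score, examples, distractors)
print(biased_acc, aware_acc)
```

In a real setting, `score` would be replaced by the model's sequence score for a response given its context; the toy scorers here only make the contrast visible.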
