Paper Title
A Better Variant of Self-Critical Sequence Training
Paper Authors
Abstract
In this work, we present a simple yet better variant of Self-Critical Sequence Training. We make a simple change to the choice of baseline function in the REINFORCE algorithm. Compared to the greedy decoding baseline, the new baseline brings better performance at no extra cost.
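The abstract does not say what the new baseline is, so the following is only a minimal sketch of how swapping the baseline in REINFORCE might look. It contrasts the greedy decoding baseline named in the abstract with one possible alternative: the mean reward of the other samples drawn for the same input (a leave-one-out mean). That alternative is an assumption made for illustration, as are all function names and the use of plain Python lists in place of tensors.

```python
from typing import List


def greedy_baseline_advantages(sample_rewards: List[float],
                               greedy_reward: float) -> List[float]:
    """Greedy decoding baseline: subtract the reward of the greedy
    decode from the reward of each sampled sequence."""
    return [r - greedy_reward for r in sample_rewards]


def leave_one_out_advantages(sample_rewards: List[float]) -> List[float]:
    """Assumed alternative baseline: for each sample, subtract the mean
    reward of the *other* samples drawn for the same input."""
    k = len(sample_rewards)
    if k < 2:
        raise ValueError("need at least two samples per input")
    total = sum(sample_rewards)
    return [r - (total - r) / (k - 1) for r in sample_rewards]


def reinforce_loss(log_probs: List[float],
                   advantages: List[float]) -> float:
    """REINFORCE surrogate loss: negative mean of advantage times the
    sequence log-probability. Advantages are treated as constants."""
    if len(log_probs) != len(advantages):
        raise ValueError("log_probs and advantages must have equal length")
    weighted = sum(a * lp for a, lp in zip(advantages, log_probs))
    return -weighted / len(log_probs)


# Hypothetical rewards (e.g. a caption metric) for three sampled sequences.
rewards = [1.0, 2.0, 3.0]
log_probs = [-1.0, -2.0, -3.0]

greedy_adv = greedy_baseline_advantages(rewards, greedy_reward=2.0)
loo_adv = leave_one_out_advantages(rewards)
loss = reinforce_loss(log_probs, loo_adv)
```

Under this assumption the baseline is computed from rewards that are already available for the sampled sequences, so no separate greedy decoding pass is needed. This is one way a baseline could come at no extra cost, though the abstract itself does not confirm the mechanism.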