Paper Title

Regularization via Adaptive Pairwise Label Smoothing

Paper Authors

Guo, Hongyu

Paper Abstract

Label Smoothing (LS) is an effective regularizer for improving the generalization of state-of-the-art deep models. For each training sample, the LS strategy smooths the one-hot encoded training signal by distributing its distribution mass over the non-ground-truth classes, aiming to penalize the network for generating overconfident output distributions. This paper introduces a novel label smoothing technique called Pairwise Label Smoothing (PLS). PLS takes a pair of samples as input. Smoothing with a pair of ground-truth labels enables PLS to preserve the relative distance between the two truth labels while further softening the distance between the truth labels and the other targets, resulting in models that produce much less confident predictions than under the LS strategy. Also, unlike current LS methods, which typically require finding a global smoothing distribution mass through cross-validation search, PLS automatically learns the distribution mass for each input pair during training. We empirically show that PLS significantly outperforms LS and the baseline models, achieving up to a 30% relative reduction in classification error. We also show visually that, when achieving such accuracy gains, PLS tends to produce very low winning softmax scores.
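
To make the two smoothing schemes in the abstract concrete, below is a minimal NumPy sketch contrasting a standard LS target with a pairwise-smoothed target built from two ground-truth labels. The fixed mixing weight lam and smoothing mass eps are illustrative assumptions only; as the abstract notes, PLS learns the distribution mass for each input pair during training rather than fixing it globally, so this is a sketch of the idea, not the paper's exact method.

import numpy as np

def ls_target(y, num_classes, eps=0.1):
    # Standard label smoothing: move eps of the one-hot mass
    # uniformly onto the non-ground-truth classes.
    t = np.full(num_classes, eps / (num_classes - 1))
    t[y] = 1.0 - eps
    return t

def pairwise_target(y1, y2, num_classes, lam=0.7, eps=0.1):
    # Illustrative pairwise smoothing (assumes y1 != y2 and
    # num_classes > 2): split the retained mass between the two
    # ground-truth labels, preserving their relative distance,
    # and spread eps over the remaining classes. lam and eps are
    # hypothetical fixed values, not the paper's learned masses.
    t = np.full(num_classes, eps / (num_classes - 2))
    t[y1] = lam * (1.0 - eps)
    t[y2] = (1.0 - lam) * (1.0 - eps)
    return t

print(ls_target(2, num_classes=5))          # [0.025 0.025 0.9 0.025 0.025]
print(pairwise_target(2, 4, num_classes=5)) # the two truth labels keep most of the mass

Compared with the one-hot or LS target, the pairwise target keeps the winning class's mass well below 1, which is consistent with the abstract's observation that PLS tends to produce very low winning softmax scores.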
