Paper Title
DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination
Paper Authors
Paper Abstract
Given data with label noise (i.e., incorrect data), deep neural networks will gradually memorize the label noise and impair model performance. To alleviate this issue, curriculum learning has been proposed to improve model performance and generalization by ordering training samples in a meaningful (e.g., easy-to-hard) sequence. Previous work treats incorrect samples as generic hard ones without discriminating between hard samples (i.e., hard samples in correct data) and incorrect samples. Indeed, a model should learn from hard samples to promote generalization rather than overfit to incorrect ones. In this paper, we address this problem by appending a novel loss function, DiscrimLoss, on top of the existing task loss. Its main effect is to automatically and stably estimate the importance of easy samples and difficult samples (including both hard and incorrect samples) during the early stages of training to improve model performance. Then, in the following stages, DiscrimLoss is dedicated to discriminating between hard and incorrect samples to improve model generalization. Such a training strategy can be formulated dynamically in a self-supervised manner, effectively mimicking the main principle of curriculum learning. Experiments on image classification, image regression, text sequence regression, and event relation reasoning demonstrate the versatility and effectiveness of our method, particularly in the presence of diversified noise levels.
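
The abstract outlines a two-phase, per-sample weighting scheme: a warm-up phase in which all samples contribute equally, followed by a phase that down-weights likely incorrect samples while preserving gradient signal from hard-but-correct ones. Below is a minimal PyTorch-style sketch of that general idea; the function discrim_style_loss, the threshold tau, and the sigmoid gate are illustrative assumptions for exposition, not the paper's actual DiscrimLoss formulation.

    import torch
    import torch.nn.functional as F

    def discrim_style_loss(logits, targets, tau, warmup=False):
        """Illustrative confidence-style weighting (not the paper's exact loss).

        With warmup=True, this is plain mean cross-entropy so the model first
        fits the easy majority. Afterwards, samples whose loss exceeds the
        threshold tau (a proxy for "likely mislabeled") are softly
        down-weighted, while lower-loss hard samples keep near-full weight.
        """
        per_sample = F.cross_entropy(logits, targets, reduction="none")
        if warmup:
            return per_sample.mean()
        # Soft gate: weight -> 1 when the loss is well below tau, -> 0 above it.
        # per_sample is detached inside the gate so gradients cannot inflate
        # a sample's weight instead of reducing its loss.
        weights = torch.sigmoid(tau - per_sample.detach())
        return (weights * per_sample).mean()

One plausible (assumed) choice for tau is an exponential moving average of recent batch losses, so the boundary between "hard" and "likely incorrect" adapts as training progresses, loosely mirroring the abstract's self-supervised, dynamically formulated curriculum.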