Estimating Example Difficulty Using Variance of Gradients

Paper Title

Estimating Example Difficulty Using Variance of Gradients

Authors

Chirag Agarwal, Daniel D'souza, Sara Hooker

Abstract

In machine learning, a question of great interest is understanding what examples are challenging for a model to classify. Identifying atypical examples ensures the safe deployment of models, isolates samples that require further human inspection and provides interpretability into model behavior. In this work, we propose Variance of Gradients (VoG) as a valuable and efficient metric to rank data by difficulty and to surface a tractable subset of the most challenging examples for human-in-the-loop auditing. We show that data points with high VoG scores are far more difficult for the model to learn and over-index on corrupted or memorized examples. Further, restricting the evaluation to the test set instances with the lowest VoG improves the model's generalization performance. Finally, we show that VoG is a valuable and efficient ranking for out-of-distribution detection.
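The abstract names the metric but does not spell out how it is computed. As a rough illustration only, the sketch below assumes that per-example gradients with respect to the input have already been collected at several training checkpoints, and scores each example by the variance of those gradients across checkpoints, averaged over input dimensions. The function names `vog_scores` and `rank_by_difficulty`, the array layout, and the toy data are hypothetical, and any per-class normalization the authors may apply is omitted.

```python
import numpy as np

def vog_scores(grads):
    """Score examples by gradient variance across checkpoints (illustrative).

    grads: array of shape (K, N, D) holding, for K checkpoints and N examples,
    a D-dimensional gradient with respect to the input. This layout is an
    assumption made for the sketch.
    """
    grads = np.asarray(grads, dtype=float)
    per_dim_var = grads.var(axis=0)   # (N, D): variance over the K checkpoints
    return per_dim_var.mean(axis=1)   # (N,): average over input dimensions

def rank_by_difficulty(scores):
    """Return example indices ordered from highest to lowest score."""
    return np.argsort(-np.asarray(scores), kind="stable")

# Toy data: 3 checkpoints, 2 examples, 2 input dimensions.
# Example 0 has identical gradients at every checkpoint; example 1 fluctuates.
demo_grads = np.array([
    [[1.0, 1.0], [0.0, 2.0]],
    [[1.0, 1.0], [2.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
])
demo_scores = vog_scores(demo_grads)
demo_order = rank_by_difficulty(demo_scores)
```

Under these assumptions, the example whose gradients change most between checkpoints ranks first, which matches the abstract's use of high scores to surface candidates for human review.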
