Paper Title


Learning from Failure: Training Debiased Classifier from Biased Classifier

Paper Authors

Nam, Junhyun, Cha, Hyuntak, Ahn, Sungsoo, Lee, Jaeho, Shin, Jinwoo

Paper Abstract


Neural networks often learn to make predictions that overly rely on spurious correlation existing in the dataset, which causes the model to be biased. While previous work tackles this issue by using explicit labeling on the spuriously correlated attributes or presuming a particular bias type, we instead utilize a cheaper, yet generic form of human knowledge, which can be widely applicable to various types of bias. We first observe that neural networks learn to rely on the spurious correlation only when it is "easier" to learn than the desired knowledge, and such reliance is most prominent during the early phase of training. Based on the observations, we propose a failure-based debiasing scheme by training a pair of neural networks simultaneously. Our main idea is twofold; (a) we intentionally train the first network to be biased by repeatedly amplifying its "prejudice", and (b) we debias the training of the second network by focusing on samples that go against the prejudice of the biased network in (a). Extensive experiments demonstrate that our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets. Surprisingly, our framework even occasionally outperforms the debiasing methods requiring explicit supervision of the spuriously correlated attributes.
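The abstract describes a two-network scheme: one network is deliberately pushed toward the spurious "easy" cue, and a second network is trained with extra emphasis on the samples the first one fails on. Below is a minimal PyTorch-style sketch of that idea. The specific choices here (a GCE-style bias-amplifying loss, relative-difficulty sample weights, the `q` and epsilon constants) are illustrative assumptions for the sketch, not details quoted from the abstract above.

```python
# Minimal sketch of the two-network "learning from failure" idea from the abstract.
# Loss choices and hyperparameters are illustrative assumptions, not paper-specified.
import torch
import torch.nn.functional as F


def bias_amplifying_loss(logits, targets, q=0.7):
    """GCE-style loss: up-weights samples the network already gets right,
    so the network leans harder on whatever cue is 'easier' to learn."""
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_y.clamp_min(1e-8) ** q) / q).mean()


def debias_step(biased_net, debiased_net, opt_b, opt_d, x, y):
    # (a) Train the biased network so its "prejudice" is repeatedly amplified.
    opt_b.zero_grad()
    bias_amplifying_loss(biased_net(x), y).backward()
    opt_b.step()

    # (b) Weight each sample by how badly the biased network fails on it,
    # so the debiased network focuses on bias-conflicting samples.
    with torch.no_grad():
        ce_b = F.cross_entropy(biased_net(x), y, reduction="none")
    logits_d = debiased_net(x)
    ce_d = F.cross_entropy(logits_d, y, reduction="none")
    weight = ce_b / (ce_b + ce_d.detach() + 1e-8)  # high when the biased net fails

    opt_d.zero_grad()
    (weight * ce_d).mean().backward()
    opt_d.step()
```

In this sketch, `weight` is close to 1 for samples the biased network gets wrong (likely bias-conflicting) and close to 0 for samples it finds easy, which is one way to realize "focusing on samples that go against the prejudice of the biased network".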
