Paper Title
A Label Proportions Estimation Technique for Adversarial Domain Adaptation in Text Classification
Paper Authors
Abstract
Many text classification tasks are domain-dependent, and various domain adaptation approaches have been proposed to predict unlabeled data in a new domain. Domain-adversarial neural networks (DANN) and their variants have recently been widely used and have achieved promising results on this problem. However, most of these approaches assume that the label proportions of the source and target domains are similar, which rarely holds in real-world scenarios. When the label shift is large, the DANN can fail to learn domain-invariant features. In this study, we focus on unsupervised domain adaptation for text classification under label shift and introduce a domain adversarial network with label proportions estimation (DAN-LPE) framework. The DAN-LPE simultaneously trains a domain adversarial net and estimates the label proportions from the confusion on the source domain and the predictions on the target domain. Experiments show that the DAN-LPE achieves a good estimate of the target label distribution and reduces the label shift, which improves classification performance.
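The abstract does not spell out how label proportions are recovered from source-domain confusion and target-domain predictions. As a rough illustration only, the sketch below assumes the confusion is summarized as a source-domain confusion matrix and uses a generic confusion-matrix-based label shift estimator. It is not the DAN-LPE procedure itself, and the function name `estimate_target_proportions` and all of its parameters are hypothetical.

```python
import numpy as np

def estimate_target_proportions(y_true_src, y_pred_src, y_pred_tgt, n_classes):
    """Illustrative label-shift estimator (an assumption, not the paper's method)."""
    y_true_src = np.asarray(y_true_src)
    y_pred_src = np.asarray(y_pred_src)
    y_pred_tgt = np.asarray(y_pred_tgt)

    # Joint confusion matrix on the source domain:
    # C[i, j] = P(predicted = i, true = j).
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true_src, y_pred_src):
        C[p, t] += 1.0
    C /= len(y_true_src)

    # Distribution of predicted labels on the (unlabeled) target domain.
    mu = np.bincount(y_pred_tgt, minlength=n_classes) / len(y_pred_tgt)

    # Solve C w = mu, where w[j] is the ratio of target to source
    # probability for class j; clip negative values caused by noise.
    w, *_ = np.linalg.lstsq(C, mu, rcond=None)
    w = np.clip(w, 0.0, None)

    # Reweight the source label proportions and renormalize.
    p_src = np.bincount(y_true_src, minlength=n_classes) / len(y_true_src)
    q = w * p_src
    return q / q.sum()
```

Given source labels, source predictions, and target predictions as integer arrays, the function returns an estimated target label distribution. Such an estimate could then be used to reweight source examples so that the label shift seen by the domain discriminator is reduced, which is the general idea the abstract describes.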