Paper Title
Adversarial Training Based Multi-Source Unsupervised Domain Adaptation for Sentiment Analysis
Authors
Abstract
Multi-source unsupervised domain adaptation (MS-UDA) for sentiment analysis (SA) aims to leverage useful information from multiple source domains to perform SA in an unlabeled target domain for which no supervised information is available. Existing MS-UDA algorithms either exploit only the shared features, i.e., the domain-invariant information, or rely on weak assumptions in NLP, e.g., the smoothness assumption. To avoid these problems, we propose two transfer learning frameworks based on the multi-source domain adaptation methodology for SA, which combine the source hypotheses to derive a good target hypothesis. The first is a Weighting Scheme based Unsupervised Domain Adaptation framework (WS-UDA), whose key feature is a novel weighting scheme that combines the source classifiers to acquire pseudo labels for target instances directly. The second is a Two-Stage Training based Unsupervised Domain Adaptation framework (2ST-UDA), which further exploits these pseudo labels to train a target private extractor. Importantly, the weight assigned to each source classifier is based on the relation between target instances and source domains, which is measured by a discriminator through adversarial training. Furthermore, the same discriminator is used to separate shared features from private features. Experimental results on two SA datasets demonstrate the promising performance of our frameworks, which outperform unsupervised state-of-the-art competitors.
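The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the function name `combine_source_predictions`, the use of the discriminator's per-source-domain scores as weights after simple normalization, and the convex combination of the source classifiers' class probabilities. It shows only how a pseudo label for one target instance might be obtained from several source classifiers, not the authors' actual implementation.

```python
from typing import List, Sequence, Tuple


def combine_source_predictions(
    domain_scores: Sequence[float],
    source_class_probs: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Combine per-source classifier outputs into one pseudo label.

    domain_scores: hypothetical discriminator outputs, one non-negative
        score per source domain, indicating how strongly the target
        instance is related to that domain.
    source_class_probs: one class-probability vector per source classifier
        for the same target instance.
    """
    if len(domain_scores) != len(source_class_probs):
        raise ValueError("need one domain score per source classifier")
    total = sum(domain_scores)
    if total <= 0:
        raise ValueError("domain scores must sum to a positive value")

    # Assumption: weights are the discriminator scores normalized to sum to 1.
    weights = [score / total for score in domain_scores]

    num_classes = len(source_class_probs[0])
    combined = [0.0] * num_classes
    for weight, probs in zip(weights, source_class_probs):
        if len(probs) != num_classes:
            raise ValueError("all classifiers must predict the same classes")
        for cls, prob in enumerate(probs):
            combined[cls] += weight * prob

    # The pseudo label is the class with the highest combined probability.
    pseudo_label = max(range(num_classes), key=lambda cls: combined[cls])
    return pseudo_label, combined


# Three source domains, binary sentiment (index 0 = negative, 1 = positive).
label, distribution = combine_source_predictions(
    domain_scores=[0.7, 0.2, 0.1],
    source_class_probs=[[0.2, 0.8], [0.6, 0.4], [0.9, 0.1]],
)
```

In this example the first source domain is judged most related to the target instance, so its classifier dominates the combined prediction. In a two-stage setup such as the one the abstract describes for 2ST-UDA, pseudo labels of this kind would then serve as training targets for a target-specific extractor.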