Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation

Paper Title

Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation

Authors

Bingyu Liu, Yuhong Guo, Jieping Ye, Weihong Deng

Abstract

Recent domain adaptation methods have demonstrated impressive improvements on unsupervised domain adaptation problems. However, in the semi-supervised domain adaptation (SSDA) setting, where the target domain has a few labeled instances available, these methods can fail to improve performance. Inspired by the effectiveness of pseudo-labels in domain adaptation, we propose a reinforcement-learning-based selective pseudo-labeling method for semi-supervised domain adaptation. Conventional pseudo-labeling methods struggle to balance the correctness and representativeness of pseudo-labeled data. To address this limitation, we develop a deep Q-learning model that selects pseudo-labeled instances that are both accurate and representative. Moreover, motivated by the capacity of large margin loss to learn discriminative features from little data, we further propose a novel target margin loss for training our base model to improve its discriminability. Our proposed method is evaluated on several SSDA benchmark datasets and demonstrates superior performance to all the compared methods.
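For context, the conventional pseudo-labeling that the abstract contrasts against is often implemented as a simple confidence threshold on the classifier's softmax output. The sketch below illustrates only that baseline, not the deep Q-learning selector or the target margin loss proposed in the paper. The function names, the threshold value, and the toy logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(logits_per_instance, threshold=0.9):
    """Return (index, predicted_class, confidence) for confident predictions.

    `threshold` is a hypothetical hyperparameter, not a value from the paper.
    """
    selected = []
    for i, logits in enumerate(logits_per_instance):
        probs = softmax(logits)
        conf = max(probs)
        if conf >= threshold:
            selected.append((i, probs.index(conf), conf))
    return selected


# Toy logits for three unlabeled target instances (made-up numbers).
unlabeled_logits = [
    [4.0, 0.0, 0.0],  # confident in class 0
    [0.2, 0.1, 0.0],  # nearly uniform, so it is rejected
    [0.0, 0.0, 5.0],  # confident in class 2
]
pseudo = select_pseudo_labels(unlabeled_logits, threshold=0.9)
for index, label, conf in pseudo:
    print(index, label, round(conf, 3))
```

A fixed threshold of this kind favors instances the model already classifies easily, which is one way to read the tension between correctness and representativeness that the abstract describes.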
