Paper title
Learning from a Complementary-label Source Domain: Theory and Algorithms
Paper authors
Paper abstract
In unsupervised domain adaptation (UDA), a classifier for the target domain is trained with massive true-label data from the source domain and unlabeled data from the target domain. However, collecting fully true-label data in the source domain is costly and sometimes impossible. In contrast to a true label, a complementary label specifies a class that a pattern does not belong to, so collecting complementary labels is less laborious than collecting true labels. In this paper, we therefore propose a novel setting in which the source domain is composed of complementary-label data, and we first prove a theoretical bound for it. We consider two cases of this setting: in the first, the source domain contains only complementary-label data (completely complementary unsupervised domain adaptation, CC-UDA); in the second, the source domain has plenty of complementary-label data and a small amount of true-label data (partly complementary unsupervised domain adaptation, PC-UDA). To this end, a complementary label adversarial network (CLARINET) is proposed to solve CC-UDA and PC-UDA problems. CLARINET maintains two deep networks simultaneously, where one focuses on classifying complementary-label source data and the other takes care of source-to-target distributional adaptation. Experiments show that CLARINET significantly outperforms a series of competent baselines on handwritten-digit recognition and object recognition tasks.
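To make the notion of a complementary label concrete, the following is a minimal Python sketch and not the CLARINET method itself. It assumes that a complementary label is drawn uniformly from the classes a sample does not belong to, and it uses a simple illustrative loss, -log(1 - p), that penalizes probability mass placed on the excluded class. The function names `make_complementary_label` and `complementary_loss` are hypothetical and do not come from the paper.

```python
import math
import random


def make_complementary_label(true_label, num_classes, rng):
    """Pick, uniformly at random, one class the sample does NOT belong to.

    Uniform sampling is an assumption made for this illustration.
    """
    candidates = [c for c in range(num_classes) if c != true_label]
    return rng.choice(candidates)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def complementary_loss(logits, comp_label):
    """Illustrative loss -log(1 - p[comp_label]).

    It is small when the classifier assigns low probability to the class
    that the complementary label rules out. This is a generic stand-in,
    not the loss used by CLARINET.
    """
    p = softmax(logits)
    return -math.log(max(1.0 - p[comp_label], 1e-12))


rng = random.Random(0)
comp = make_complementary_label(true_label=3, num_classes=10, rng=rng)

# A classifier confident in class 0 is penalized lightly when told
# "not class 1" and heavily when told "not class 0".
loss_consistent = complementary_loss([5.0, 0.0, 0.0], comp_label=1)
loss_contradicted = complementary_loss([5.0, 0.0, 0.0], comp_label=0)
```

In this sketch a single complementary label carries far less information than a true label, since it excludes only one of the classes, which is consistent with the abstract's point that such labels are cheaper to collect.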