Paper Title
Risk bounds for PU learning under Selected At Random assumption
Paper Authors
Paper Abstract
Positive-unlabeled learning (PU learning) is known as a special case of semi-supervised binary classification in which only a fraction of the positive examples are labeled. The challenge is then to find the correct classifier despite this lack of information. Recently, new methodologies have been introduced to address the case where the probability of being labeled may depend on the covariates. In this paper, we are interested in establishing risk bounds for PU learning under this general assumption. In addition, we quantify the impact of label noise on PU learning compared to the standard classification setting. Finally, we provide a lower bound on the minimax risk, proving that the upper bound is nearly optimal.
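
For context, a minimal sketch of how the Selected At Random assumption is typically formalized in the PU-learning literature (the random variables $S$, $Y$ and the propensity function $e$ below are introduced here for illustration and do not appear in the abstract): each example consists of covariates $X$, a true binary label $Y$, and a labeling indicator $S$; only positive examples can be selected for labeling, and the selection probability may depend on the covariates,
\[
\mathbb{P}(S = 1 \mid X = x,\, Y = 1) = e(x),
\qquad
\mathbb{P}(S = 1 \mid X = x,\, Y = 0) = 0 .
\]
Under this reading, the special case of a constant propensity $e(x) \equiv c$ recovers the classical Selected Completely At Random setting, whereas the paper's "general assumption" allows $e$ to vary with $x$.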