Incorporating Multiple Cluster Centers for Multi-Label Learning

Paper title

Incorporating Multiple Cluster Centers for Multi-Label Learning

Authors

Shu, Senlin; Lv, Fengmao; Yan, Yan; Li, Li; He, Shuo; He, Jun

Abstract

Multi-label learning deals with the problem that each instance is associated with multiple labels simultaneously. Most of the existing approaches aim to improve the performance of multi-label learning by exploiting label correlations. Although the data augmentation technique is widely used in many machine learning tasks, it is still unclear whether data augmentation is helpful to multi-label learning. In this article, we propose to leverage the data augmentation technique to improve the performance of multi-label learning. Specifically, we first propose a novel data augmentation approach that performs clustering on the real examples and treats the cluster centers as virtual examples, and these virtual examples naturally embody the local label correlations and label importances. Then, motivated by the cluster assumption that examples in the same cluster should have the same label, we propose a novel regularization term to bridge the gap between the real examples and virtual examples, which can promote the local smoothness of the learning function. Extensive experimental results on a number of real-world multi-label datasets clearly demonstrate that our proposed approach outperforms the state-of-the-art counterparts.
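
The two ideas in the abstract, cluster centers used as virtual examples and a regularizer that ties real examples to them, can be illustrated with a small sketch. The abstract does not state which clustering algorithm is used, how labels are assigned to the centers, or the exact form of the regularization term, so everything below is an assumption made for illustration: plain k-means for clustering, the mean label vector of a cluster as the soft label of its center, and a mean squared difference between predictions as the smoothness penalty. The function names (`kmeans`, `virtual_examples`, `smoothness_penalty`) are hypothetical and not taken from the paper.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def kmeans(features, k, iterations=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns (centers, assignments)."""
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(features, k)]
    assignments = []
    for _ in range(iterations):
        # Assign every example to its nearest center.
        assignments = [
            min(range(k), key=lambda c: squared_distance(x, centers[c]))
            for x in features
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        new_centers = []
        for c in range(k):
            members = [x for x, a in zip(features, assignments) if a == c]
            new_centers.append(mean_vector(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assignments


def virtual_examples(features, labels, k, seed=0):
    """Build one virtual example per non-empty cluster.

    Assumption: the soft label of a center is the average label vector of
    its cluster, so entry j is the fraction of members carrying label j.
    """
    centers, assignments = kmeans(features, k, seed=seed)
    virtual = {}
    for c in range(k):
        member_labels = [y for y, a in zip(labels, assignments) if a == c]
        if member_labels:
            virtual[c] = (centers[c], mean_vector(member_labels))
    return virtual, assignments


def smoothness_penalty(predict, features, assignments, virtual):
    """Assumed regularizer: mean squared gap between the prediction for a
    real example and the prediction for the center of its cluster."""
    total = 0.0
    for x, a in zip(features, assignments):
        center, _ = virtual[a]
        total += squared_distance(predict(x), predict(center))
    return total / len(features)


# Toy data: four examples, three labels, two well-separated groups.
features = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 1]]
virtual, assignments = virtual_examples(features, labels, k=2)
penalty = smoothness_penalty(lambda x: x, features, assignments, virtual)
```

In this sketch a soft label entry strictly between 0 and 1 marks a label that only some cluster members carry, which is one simple way a center could reflect local label co-occurrence and label importance. In training, the penalty would be added to the usual multi-label loss with a weighting coefficient; that coefficient and the base learner are likewise not specified in the abstract.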
