Paper Title

Robust Optimal Classification Trees under Noisy Labels

Authors

Víctor Blanco, Alberto Japón, Justo Puerto

Abstract

In this paper, we propose a novel methodology for constructing Optimal Classification Trees that accounts for noisy labels in the training sample. Our approach rests on two main elements: (1) the splitting rules of the classification tree are designed to maximize the separation margin between classes, following the SVM paradigm; and (2) some labels of the training sample are allowed to change during the construction of the tree, in an attempt to detect label noise. Both features are integrated to design the resulting Optimal Classification Tree. We present a Mixed Integer Non Linear Programming formulation for the problem, suitable for solution by any available off-the-shelf solver. The model is analyzed and tested on a battery of standard datasets taken from the UCI Machine Learning Repository, showing the effectiveness of our approach.
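
To make the two elements of the abstract concrete, the following is a minimal toy sketch and not the authors' MINLP formulation. It assumes a single split on one-dimensional data, labels in {-1, +1}, and a brute-force search in place of a solver. The function name `best_split_with_flips` and the parameters `max_flips` and `flip_penalty` are hypothetical, introduced only for illustration. The sketch enumerates which labels to flip, keeps only relabelings that a threshold can separate, and scores each candidate by its margin minus a penalty per flipped label.

```python
from itertools import combinations


def best_split_with_flips(xs, ys, max_flips=1, flip_penalty=0.1):
    """Brute-force toy for one 1-D split with a label-flip budget.

    Hypothetical illustration only: it mimics the idea of trading
    margin against relabeled points, not the paper's formulation.
    Returns (score, threshold, margin, flipped_indices), or None if no
    relabeling within the budget makes the sample separable.
    """
    best = None
    for k in range(max_flips + 1):
        for flipped in combinations(range(len(xs)), k):
            # Flip the chosen labels, keep the rest unchanged.
            labels = [-y if i in flipped else y for i, y in enumerate(ys)]
            neg = [x for x, y in zip(xs, labels) if y == -1]
            pos = [x for x, y in zip(xs, labels) if y == 1]
            if not neg or not pos:
                continue  # a split needs both classes present
            # Try the orientation that can separate the two classes.
            if min(pos) > max(neg):
                lo, hi = max(neg), min(pos)
            else:
                lo, hi = max(pos), min(neg)
            margin = (hi - lo) / 2
            if margin <= 0:
                continue  # classes overlap: not separable by a threshold
            score = margin - flip_penalty * k
            if best is None or score > best[0]:
                best = (score, (lo + hi) / 2, margin, flipped)
    return best


if __name__ == "__main__":
    # The point at x = 2 carries a suspicious +1 label.
    xs = [0.0, 1.0, 2.0, 6.0, 7.0, 8.0]
    ys = [-1, -1, 1, 1, 1, 1]
    print(best_split_with_flips(xs, ys, max_flips=1))
```

In this toy setting, allowing one flip lets the search relabel the point at x = 2 and place the threshold in the wide gap between the clusters, instead of squeezing it between x = 1 and x = 2. The enumeration grows combinatorially with the flip budget, which is one reason a mathematical programming formulation is the more practical route for real data.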