Paper Title
Envelope imbalanced ensemble model with deep sample learning and local-global structure consistency
Paper Authors
Paper Abstract
The class imbalance problem is important and challenging. Ensemble approaches are widely used to tackle it because of their effectiveness. However, existing ensemble methods are always applied to the original samples and do not consider the structure information among them, a limitation that prevents imbalanced learning from performing better. Moreover, research shows that the structure information among samples includes both local and global structure information. Based on the above analysis, an imbalanced ensemble algorithm with a deep sample envelope pre-network (DSEN) and a local-global structure consistency mechanism (LGSCM) is proposed here to solve this problem. The algorithm guarantees high-quality deep envelope samples by taking local manifold and global structure information into account, which benefits imbalanced learning. First, the DSEN is designed to mine structure information among samples. Then, a local manifold structure metric (LMSM) and a global structure distribution metric (GSDM) are designed to construct the LGSCM, which enhances the distribution consistency of interlayer samples. Next, the DSEN and LGSCM are combined to form the final deep sample envelope network (DSEN-LG). After that, base classifiers are applied to the layers of deep samples respectively. Finally, the predictions of the base classifiers are fused through a bagging ensemble learning mechanism. To demonstrate the effectiveness of the proposed method, forty-four public datasets and more than ten representative related algorithms are chosen for verification. The experimental results show that the proposed algorithm significantly outperforms other imbalanced ensemble algorithms.
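The following is a minimal sketch, not the authors' implementation, of the last two steps named in the abstract: fitting one base classifier per layer of deep envelope samples and fusing their predictions with a bagging-style majority vote. The DSEN-LG network that actually generates the layers is not specified in the abstract, so the function envelope_layers below is a hypothetical stand-in that only bootstrap-resamples the data to keep the sketch runnable; the base classifier (a decision tree) is likewise an assumption.

import numpy as np
from sklearn.tree import DecisionTreeClassifier


def envelope_layers(X, y, n_layers, rng):
    """Hypothetical placeholder for DSEN-LG: yields one (X, y) view per layer.

    In the paper, each layer would contain deep envelope samples that preserve
    local-manifold and global-structure information; here we only
    bootstrap-resample so the sketch runs end to end.
    """
    for _ in range(n_layers):
        idx = rng.integers(0, len(X), size=len(X))
        yield X[idx], y[idx]


def fit_layer_ensemble(X, y, n_layers=5, seed=0):
    """Fit one base classifier on each layer of (placeholder) envelope samples."""
    rng = np.random.default_rng(seed)
    classifiers = []
    for X_layer, y_layer in envelope_layers(X, y, n_layers, rng):
        clf = DecisionTreeClassifier(random_state=seed).fit(X_layer, y_layer)
        classifiers.append(clf)
    return classifiers


def predict_bagging(classifiers, X):
    """Fuse per-layer predictions by majority vote (bagging-style fusion)."""
    votes = np.stack([clf.predict(X) for clf in classifiers]).astype(int)
    # Majority vote across layers (axis 0) for each test sample (columns).
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

Under these assumptions, the sketch only illustrates the fusion stage; the quality of the ensemble in the paper comes from how DSEN-LG constructs the layers, which this placeholder does not reproduce.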