Paper Title

A Novel Community Detection Based Genetic Algorithm for Feature Selection

Authors

Mehrdad Rostami, Kamal Berahmand, Saman Forouzandeh

Abstract

Feature selection is an essential data preprocessing stage in data mining. Its core principle is to pick a subset of the available features by excluding features that carry almost no predictive information, as well as redundant features that are highly correlated with one another. In the past several years, a variety of meta-heuristic methods have been introduced to eliminate as many redundant and irrelevant features as possible from high-dimensional datasets. One of the main disadvantages of current meta-heuristic approaches is that they often neglect the correlation between the selected features. In this article, the authors propose a genetic algorithm for feature selection based on community detection, which works in three steps. In the first step, the feature similarities are calculated. In the second step, the features are grouped into clusters by community detection algorithms. In the third step, features are picked by a genetic algorithm with a new community-based repair operation. The performance of the presented approach was analyzed on nine benchmark classification problems. The authors also compared the efficiency of the proposed approach with that of four available feature selection algorithms. The findings indicate that the new approach consistently yields improved classification accuracy.
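
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the similarity measure, the community detection algorithm, or the fitness function, so the code substitutes absolute Pearson correlation for the similarity, connected components of a thresholded similarity graph for community detection, and a sum of feature-label correlations for the fitness. All function names, the `threshold` and `per_group` parameters, and the rule "keep a fixed number of features per community" in the repair step are assumptions made for illustration only.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def similarity_matrix(columns):
    """Step 1 (assumed measure): pairwise |Pearson| similarity between feature columns."""
    m = len(columns)
    return [[abs(pearson(columns[i], columns[j])) for j in range(m)] for i in range(m)]


def communities(sim, threshold=0.8):
    """Step 2 (stand-in for community detection): connected components of the
    graph that links features whose similarity is at least `threshold`."""
    m = len(sim)
    seen, groups = set(), []
    for start in range(m):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(m):
                if j not in seen and sim[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups


def repair(mask, groups, per_group, rng):
    """Step 3 helper (assumed rule): adjust a 0/1 mask so that each community
    contributes exactly `per_group` features, or all of them if it is smaller."""
    fixed = list(mask)
    for g in groups:
        want = min(per_group, len(g))
        chosen = [i for i in g if fixed[i]]
        unchosen = [i for i in g if not fixed[i]]
        rng.shuffle(chosen)
        rng.shuffle(unchosen)
        while len(chosen) > want:      # too many picked: drop random ones
            fixed[chosen.pop()] = 0
        while len(chosen) < want:      # too few picked: add random ones
            i = unchosen.pop()
            fixed[i] = 1
            chosen.append(i)
    return fixed


def select_features(columns, labels, per_group=1, pop_size=20, generations=30, seed=0):
    """Tiny genetic algorithm over feature masks; every child is repaired."""
    rng = random.Random(seed)
    m = len(columns)
    groups = communities(similarity_matrix(columns))
    # Hypothetical fitness: total |correlation| between selected features and labels.
    relevance = [abs(pearson(c, labels)) for c in columns]

    def fitness(mask):
        return sum(r for r, bit in zip(relevance, mask) if bit)

    pop = [repair([rng.randint(0, 1) for _ in range(m)], groups, per_group, rng)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, m)           # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(m)] ^= 1        # single-bit mutation
            children.append(repair(child, groups, per_group, rng))
        pop = parents + children
    best = max(pop, key=fitness)
    return [i for i, bit in enumerate(best) if bit], groups
```

In this sketch the repair step is what prevents the search from keeping several near-duplicate features: two strongly correlated columns fall into the same community, so at most `per_group` of them survive in any chromosome. A practical version would likely replace the toy fitness with the accuracy of a classifier trained on the selected features.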
