Fibonacci and k-Subsecting Recursive Feature Elimination

Title

Fibonacci and k-Subsecting Recursive Feature Elimination

Author

Brzezinski, Dariusz

Abstract

Feature selection is a data mining task with the potential of speeding up classification algorithms, enhancing model comprehensibility, and improving learning accuracy. However, finding a subset of features that is optimal in terms of predictive accuracy is usually computationally intractable. Out of several heuristic approaches to dealing with this problem, the Recursive Feature Elimination (RFE) algorithm has received considerable interest from data mining practitioners. In this paper, we propose two novel algorithms inspired by RFE, called Fibonacci- and k-Subsecting Recursive Feature Elimination, which remove features in logarithmic steps, probing the wrapped classifier more densely for the more promising feature subsets. The proposed algorithms are experimentally compared against RFE on 28 highly multidimensional datasets and evaluated in a practical case study involving 3D electron density maps from the Protein Data Bank. The results show that Fibonacci and k-Subsecting Recursive Feature Elimination are capable of selecting a smaller subset of features much faster than standard RFE, while achieving comparable predictive performance.
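To give a feel for why removing features in logarithmic steps can be much cheaper than one-at-a-time elimination, the sketch below counts the subset sizes visited under two schedules. It assumes roughly one fit of the wrapped classifier per subset size visited. The function names are hypothetical, and the halving schedule is only a simple stand-in for a logarithmic schedule; it is not the Fibonacci or k-Subsecting procedure proposed in the paper.

```python
def one_at_a_time_sizes(n_features, min_features=1):
    """Subset sizes visited when one feature is dropped per iteration."""
    return list(range(n_features, min_features - 1, -1))


def halving_sizes(n_features, min_features=1):
    """Subset sizes visited when half of the remaining features are dropped per iteration.

    Illustrative assumption only: this is NOT the paper's algorithm,
    just one example of a schedule with a logarithmic number of steps.
    """
    sizes = [n_features]
    while sizes[-1] > min_features:
        # Never go below the requested minimum number of features.
        sizes.append(max(min_features, sizes[-1] // 2))
    return sizes


linear = one_at_a_time_sizes(1024)
logarithmic = halving_sizes(1024)
print(len(linear), len(logarithmic))  # prints "1024 11"
```

With 1024 starting features, the one-at-a-time schedule visits 1024 subset sizes while the halving schedule visits 11, which is the kind of gap that motivates logarithmic elimination on highly multidimensional data.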
