Paper Title

Infinite Feature Selection: A Graph-based Feature Filtering Approach

Authors

Giorgio Roffo, Simone Melzi, Umberto Castellani, Alessandro Vinciarelli, Marco Cristani

Abstract

We propose a filter-based feature selection framework that treats subsets of features as paths in a graph, where each node is a feature and each edge encodes a pairwise (customizable) relation between features that accounts for both relevance and redundancy. Through two different interpretations (one exploiting properties of power series of matrices, the other relying on Markov chain fundamentals), we can evaluate the values of paths (i.e., feature subsets) of arbitrary length, eventually letting the length go to infinity, from which we dub our framework Infinite Feature Selection (Inf-FS). Going to infinity allows us to constrain the computational complexity of the selection process and to rank the features in an elegant way, that is, by considering the value of any path (subset) containing a particular feature. We also propose a simple unsupervised strategy to cut the ranking, thereby providing the subset of features to keep. In the experiments, we analyze diverse settings with heterogeneous features, for a total of 11 benchmarks, comparing against 18 widely known approaches. The results show that Inf-FS performs better in almost every situation, that is, both when the number of features to keep is fixed a priori and when deciding the subset cardinality is part of the process.
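
The power-series interpretation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the edge weights, so the graph built here (a mix of feature dispersion and one minus absolute Pearson correlation, weighted by a hypothetical `alpha`) is only an assumed example of a "customizable" pairwise relation, and the function names `build_graph` and `path_scores` are illustrative rather than taken from the paper. The one standard fact the sketch relies on is that, once the adjacency matrix A is scaled by r so that the spectral radius of rA is below 1, the sum of (rA)^l over all path lengths l >= 1 equals (I - rA)^-1 - I, so paths of every length are accounted for with a single matrix inversion.

```python
import numpy as np


def build_graph(X, alpha=0.5):
    """Build a symmetric feature graph from data X (samples x features).

    ASSUMPTION: the edge weights below are one possible choice of pairwise
    relation; the abstract only says the relation is customizable.
    Expects at least two feature columns.
    """
    X = np.asarray(X, dtype=float)
    sigma = X.std(axis=0)
    if sigma.max() > 0:
        sigma = sigma / sigma.max()           # normalize dispersion to [0, 1]
    dispersion = np.maximum.outer(sigma, sigma)
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(X, rowvar=False)   # Pearson correlation of columns
    corr = np.nan_to_num(corr)                # constant features give NaN
    non_redundancy = 1.0 - np.abs(corr)
    return alpha * dispersion + (1.0 - alpha) * non_redundancy


def path_scores(A, scale=0.9):
    """Score each feature by the summed value of all paths starting at it.

    Uses the closed form of the geometric matrix series:
    sum_{l>=1} (r*A)^l = (I - r*A)^-1 - I, valid when rho(r*A) < 1.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    rho = np.max(np.abs(np.linalg.eigvals(A)))  # spectral radius of A
    if rho == 0:
        return np.zeros(n)                      # no edges, no path value
    r = scale / rho                             # hypothetical scaling, scale < 1
    S = np.linalg.inv(np.eye(n) - r * A) - np.eye(n)
    return S.sum(axis=1)                        # marginalize over path endpoints


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    scores = path_scores(build_graph(X))
    ranking = np.argsort(-scores)               # best-scored feature first
    print(ranking)
```

Separating graph construction from scoring keeps the customizable part (the edge weights) independent of the path-summation step, which depends only on the matrix A. The unsupervised strategy for cutting the ranking is not covered by this sketch.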
