Paper Title
Leveraging Model Inherent Variable Importance for Stable Online Feature Selection
Paper Authors
Paper Abstract
Feature selection can be a crucial factor in obtaining robust and accurate predictions. Online feature selection models, however, operate under considerable restrictions; they need to efficiently extract salient input features based on a bounded set of observations, while enabling robust and accurate predictions. In this work, we introduce FIRES, a novel framework for online feature selection. The proposed feature weighting mechanism leverages the importance information inherent in the parameters of a predictive model. By treating model parameters as random variables, we can penalize features with high uncertainty and thus generate more stable feature sets. Our framework is generic in that it leaves the choice of the underlying model to the user. Strikingly, experiments suggest that the model complexity has only a minor effect on the discriminative power and stability of the selected feature sets. In fact, using a simple linear model, FIRES obtains feature sets that compete with state-of-the-art methods, while dramatically reducing computation time. In addition, experiments show that the proposed framework is clearly superior in terms of feature selection stability.
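The core idea described above — treating each model weight as a random variable and penalizing features whose weights are uncertain — can be illustrated with a minimal sketch. This is not the actual FIRES algorithm; the scoring function, penalty parameter, and all names below are illustrative assumptions, using a simple linear model where each feature weight is summarized by a mean and a standard deviation:

```python
def fires_like_importance(mu, sigma, penalty=1.0):
    # Hypothetical score: reward a large (absolute) mean weight,
    # penalize weights whose estimates are uncertain (large sigma).
    return [abs(m) - penalty * s for m, s in zip(mu, sigma)]

def select_top_k(mu, sigma, k, penalty=1.0):
    # Rank features by the uncertainty-penalized score and keep the top k.
    scores = fires_like_importance(mu, sigma, penalty)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

# Feature 2 has a large mean weight but is very uncertain,
# so the penalty pushes it close to the weakly informative features.
mu = [2.0, 0.1, -1.5, 0.05]
sigma = [0.2, 0.05, 1.4, 0.01]
print(select_top_k(mu, sigma, k=2))
```

Penalizing the uncertainty term is what yields more stable feature sets over a stream: a feature whose weight fluctuates between batches gets a high sigma and is down-ranked even if its current mean weight is large.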