论文标题

快速特征选择与公平限制

Fast Feature Selection with Fairness Constraints

论文作者

Quinzan, Francesco, Khanna, Rajiv, Hershcovitch, Moshik, Cohen, Sarel, Waddington, Daniel G., Friedrich, Tobias, Mahoney, Michael W.

论文摘要

我们研究了为模型构建选择最佳特征的基本问题。即使使用贪婪算法变体,这个问题在大型数据集上在计算上具有挑战性。为了应对这一挑战,我们将最近提出的针对贪婪的向前选择的自适应查询模型扩展到了非平衡函数的正交匹配追求的更快范围。所提出的算法在自适应查询模型中实现了指数快速的并行运行时间,比以前的工作要缩放得多。此外,我们的扩展程序允许使用向下关闭的约束,可用于将某些公平标准编码到特征选择过程中。我们证明了基于标准假设的算法可确保算法的近似值。这些保证适用于许多参数模型,包括通用线性模型。最后,我们从经验上证明,所提出的算法在现实世界和合成数据集上与最先进的功能选择技术竞争。

We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源