Paper Title
Augmented Fairness: An Interpretable Model Augmenting Decision-Makers' Fairness
Paper Authors
Paper Abstract
We propose a model-agnostic approach for mitigating the prediction bias of a black-box decision-maker, in particular a human decision-maker. Our method detects the regions of the feature space where the black-box decision-maker is biased and replaces it there with a few short decision rules that act as a "fair surrogate". The rule-based surrogate model is trained under two objectives: predictive performance and fairness. Our model focuses on a setting that is common in practice but distinct from most of the fairness literature: we only have black-box access to the decision-maker, and only a limited set of true labels can be queried under a budget constraint. We formulate building the surrogate model as a multi-objective optimization problem that simultaneously optimizes predictive performance and bias mitigation. To train the model, we propose a novel training algorithm that combines a nondominated sorting genetic algorithm with active learning. We test our model on public datasets, where we simulate various biased "black-box" classifiers (decision-makers) and apply our approach for interpretable augmented fairness.
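To make the training idea concrete, below is a minimal, self-contained Python sketch of the loop the abstract describes: evolving rule-based surrogates under two objectives (error rate and a fairness gap) with nondominated-sorting-style selection, while querying true labels under a budget via active learning. It is a toy under stated assumptions, not the authors' implementation: the single-threshold rule family, the demographic-parity gap as the fairness objective, the label budget of 60, and the uncertainty band used for label queries are all illustrative choices.

```python
# Minimal sketch: multi-objective evolution of rule surrogates + budgeted
# active learning. All names and constants here are illustrative assumptions.
import random

random.seed(0)

# Toy data: each point is (features, sensitive attribute in {0, 1}).
DATA = [({"x": random.random()}, random.randint(0, 1)) for _ in range(200)]
TRUE_LABEL = lambda f, s: int(f["x"] > 0.5)   # hidden ground truth (assumed)
BUDGET = 60
labeled = {}                                   # index -> queried true label

def query_label(i):
    """Active-learning oracle: query a true label while budget remains."""
    if i not in labeled and len(labeled) < BUDGET:
        f, s = DATA[i]
        labeled[i] = TRUE_LABEL(f, s)
    return labeled.get(i)

# A candidate surrogate is one short rule: predict 1 iff x > threshold.
def predict(rule, f):
    return int(f["x"] > rule)

def objectives(rule):
    """Return (error rate, demographic-parity gap) on the labeled pool."""
    errs, pos = 0, {0: [0, 0], 1: [0, 0]}      # group -> [positives, total]
    for i, y in labeled.items():
        f, s = DATA[i]
        p = predict(rule, f)
        errs += int(p != y)
        pos[s][0] += p
        pos[s][1] += 1
    n = max(len(labeled), 1)
    rate = lambda g: pos[g][0] / max(pos[g][1], 1)
    return errs / n, abs(rate(0) - rate(1))

def dominates(a, b):
    """Pareto dominance: a is no worse in both objectives and better in one."""
    return all(x <= y for x, y in zip(a, b)) and a != b

# Seed the labeled pool, then alternate evolution and label queries.
for i in random.sample(range(len(DATA)), 30):
    query_label(i)

population = [random.random() for _ in range(20)]
for gen in range(10):
    # Mutate every rule slightly and pool parents with offspring.
    population += [min(1.0, max(0.0, r + random.gauss(0, 0.1)))
                   for r in population]
    scored = [(r, objectives(r)) for r in population]
    # Keep the nondominated front (simplified NSGA-style selection,
    # without the crowding-distance tie-breaking of full NSGA-II).
    front = [r for r, o in scored
             if not any(dominates(o2, o) for _, o2 in scored)]
    population = (front * 20)[:20]
    # Active learning: query labels near the current decision thresholds,
    # where the surrogates are most uncertain.
    for i, (f, s) in enumerate(DATA):
        if any(abs(f["x"] - r) < 0.05 for r in population[:3]):
            query_label(i)

best = min(population, key=lambda r: sum(objectives(r)))
print("best rule: predict 1 iff x >", round(best, 3),
      "objectives:", objectives(best))
```

The two objectives are kept separate rather than collapsed into one weighted score during selection, which is the point of the nondominated sort: the final front exposes the trade-off between accuracy and fairness, and a single surrogate is picked from it only at the end.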