Paper Title
Making ML models fairer through explanations: the case of LimeOut
Paper Authors
Paper Abstract
Algorithmic decisions are now made on a daily basis and are based on Machine Learning (ML) processes that may be complex and biased. This raises several concerns, given the critical impact that biased decisions may have on individuals and on society as a whole. Not only do unfair outcomes affect human rights, they also undermine public trust in ML and AI. In this paper we address fairness issues of ML models based on decision outcomes, and we show how the simple idea of "feature dropout" followed by an "ensemble approach" can improve model fairness. To illustrate, we revisit the case of "LimeOut", which was proposed to tackle "process fairness", i.e., the extent to which a model relies on sensitive or discriminatory features. Given a classifier, a dataset, and a set of sensitive features, LimeOut first assesses whether the classifier is fair by checking its reliance on sensitive features using "LIME explanations". If the classifier is deemed unfair, LimeOut applies feature dropout to obtain a pool of classifiers, which are then combined into an ensemble classifier that was empirically shown to be less dependent on sensitive features without compromising accuracy. We present experiments on multiple datasets and several state-of-the-art classifiers, which show that LimeOut's ensemble classifiers improve (or at least maintain) not only process fairness but also other fairness metrics such as individual and group fairness, equal opportunity, and demographic parity.
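The two-step workflow described in the abstract (check reliance on sensitive features via LIME, then build a feature-dropout ensemble) can be sketched in a few lines of Python. The sketch below is illustrative only: it assumes scikit-learn-style classifiers and the `lime` package, and the names `relies_on_sensitive` and `FeatureDropoutEnsemble`, the reliance threshold, and the simple probability averaging are our assumptions, not the authors' reference implementation.

```python
# Minimal sketch of the LimeOut idea, assuming scikit-learn-style
# classifiers and the `lime` package. Names, thresholds, and the
# aggregation rule are illustrative assumptions.
import numpy as np
from sklearn.base import clone
from lime.lime_tabular import LimeTabularExplainer


def relies_on_sensitive(model, X, feature_names, sensitive_idx,
                        n_samples=30, top_k=10, threshold=0.5):
    """Proxy for process fairness: does a sensitive feature appear
    among the top-k features of LIME explanations too often?"""
    explainer = LimeTabularExplainer(
        X, feature_names=feature_names, discretize_continuous=False)
    rng = np.random.default_rng(0)
    hits = 0
    for i in rng.choice(len(X), size=n_samples, replace=False):
        exp = explainer.explain_instance(
            X[i], model.predict_proba, num_features=top_k)
        # as_map() gives (feature_index, weight) pairs for label 1
        top_features = {idx for idx, _ in exp.as_map()[1]}
        if top_features & set(sensitive_idx):
            hits += 1
    return hits / n_samples > threshold


class FeatureDropoutEnsemble:
    """Pool of classifiers, each trained with one sensitive feature
    dropped (plus one with all of them dropped); their predicted
    probabilities are averaged, mirroring the ensemble step above."""

    def __init__(self, base_model, sensitive_idx):
        self.base_model = base_model
        self.sensitive_idx = list(sensitive_idx)
        # one dropout per sensitive feature, plus all of them at once
        self.dropouts = [[i] for i in self.sensitive_idx] + [self.sensitive_idx]

    def fit(self, X, y):
        n = X.shape[1]
        self.models_ = []
        for drop in self.dropouts:
            keep = [j for j in range(n) if j not in drop]
            model = clone(self.base_model).fit(X[:, keep], y)
            self.models_.append((keep, model))
        return self

    def predict_proba(self, X):
        probas = [m.predict_proba(X[:, keep]) for keep, m in self.models_]
        return np.mean(probas, axis=0)

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)
```

Averaging predicted probabilities is the simplest possible aggregation for the pool; the paper's actual combination scheme and its criterion for flagging a classifier as unfair may differ in detail.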