Paper Title

D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias

Paper Authors

Bhavya Ghai, Klaus Mueller

Paper Abstract

With the rise of AI, algorithms have become better at learning underlying patterns from the training data, including ingrained social biases based on gender, race, etc. The deployment of such algorithms to domains such as hiring, healthcare, and law enforcement has raised serious concerns about fairness, accountability, trust, and interpretability in machine learning algorithms. To alleviate this problem, we propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases in tabular datasets. It uses a graphical causal model to represent causal relationships among different features in the dataset and as a medium to inject domain knowledge. A user can detect the presence of bias against a group, say females, or a subgroup, say black females, by identifying unfair causal relationships in the causal network and using an array of fairness metrics. Thereafter, the user can mitigate bias by acting on the unfair causal edges. For each interaction, say weakening or deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset based on the current causal model. Users can visually assess the impact of their interactions on different fairness metrics, utility metrics, data distortion, and the underlying data distribution. Once satisfied, they can download the debiased dataset and use it for any downstream application to obtain fairer predictions. We evaluate D-BIAS through experiments on three datasets and a formal user study. We found that D-BIAS helps reduce bias significantly compared to the baseline debiasing approach across different fairness metrics, while incurring little data distortion and only a small loss in utility. Moreover, our human-in-the-loop approach significantly outperforms an automated approach on trust, interpretability, and accountability.
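The abstract describes a loop: identify an unfair causal edge from a protected attribute, weaken or delete it, resimulate the data from the edited causal model, and compare fairness metrics before and after. The snippet below is a minimal Python sketch of that loop on a toy linear structural causal model; the graph, variable names (gender, skill, hired), edge weights, and resimulation scheme are illustrative assumptions rather than the paper's actual simulation method, and statistical parity difference stands in for the tool's full array of fairness metrics.

```python
# Minimal sketch, assuming a toy linear structural causal model (SCM).
# The graph, weights, and resimulation scheme are illustrative assumptions;
# the abstract does not specify D-BIAS's actual simulation method.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000

# Toy causal graph: gender -> score (unfair edge), skill -> score (fair edge).
gender = rng.integers(0, 2, n)        # protected attribute: 0 = male, 1 = female
skill = rng.normal(0.0, 1.0, n)       # legitimate predictor
noise = rng.normal(0.0, 0.5, n)       # exogenous noise, held fixed across edits

UNFAIR_WEIGHT = -0.8                  # assumed strength of the biased edge
score = 1.0 * skill + UNFAIR_WEIGHT * gender + noise
hired = (score > 0).astype(int)

def statistical_parity_difference(y, group):
    """P(y=1 | group=1) - P(y=1 | group=0); 0 means parity between groups."""
    return y[group == 1].mean() - y[group == 0].mean()

print("SPD before edit:", statistical_parity_difference(hired, gender))

# "Acting on the unfair causal edge": set its weight to 0 (deletion) or shrink
# it (weakening), then resimulate the affected variable from the edited SCM.
EDITED_WEIGHT = 0.0
debiased_score = 1.0 * skill + EDITED_WEIGHT * gender + noise
debiased_hired = (debiased_score > 0).astype(int)

print("SPD after edit:", statistical_parity_difference(debiased_hired, gender))

# The debiased table can then be exported for downstream training.
debiased = pd.DataFrame({"gender": gender, "skill": skill, "hired": debiased_hired})
```

In the actual tool, per the abstract, the causal graph is learned from the dataset and refined with user domain knowledge, and each edit's effect is assessed not only on fairness metrics but also on utility metrics, data distortion, and the underlying data distribution.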
