Paper Title

Prediction Sensitivity: Continual Audit of Counterfactual Fairness in Deployed Classifiers

Paper Authors

Krystal Maughan, Ivoline C. Ngong, Joseph P. Near

Paper Abstract

As AI-based systems increasingly impact many areas of our lives, auditing these systems for fairness is an increasingly high-stakes problem. Traditional group fairness metrics can miss discrimination against individuals and are difficult to apply after deployment. Counterfactual fairness describes an individualized notion of fairness but is even more challenging to evaluate after deployment. We present prediction sensitivity, an approach for continual audit of counterfactual fairness in deployed classifiers. For every prediction made by the deployed model, prediction sensitivity helps answer the question: would this prediction have been different if this individual had belonged to a different demographic group? Prediction sensitivity can leverage correlations between protected status and other features and does not require protected status information at prediction time. Our empirical results demonstrate that prediction sensitivity is effective for detecting violations of counterfactual fairness.
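The abstract only sketches the mechanism. One plausible reading, assumed here rather than taken from the paper, is that the audit propagates the model's input gradient through per-feature correlations with protected status that were estimated offline, which is how protected status can be unnecessary at prediction time. The minimal PyTorch sketch below illustrates that reading; `audit_sensitivity` and `corr_with_protected` are hypothetical names invented for this example, not the paper's API, and the exact formulation is defined in the paper itself.

```python
# Hypothetical sketch of a gradient-based prediction-sensitivity audit.
# The formulation (gradient projected onto protected-status correlations)
# is an assumption for illustration, not the paper's exact definition.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a deployed binary classifier over 5 features.
model = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

def audit_sensitivity(model: nn.Module, x: torch.Tensor,
                      corr_with_protected: torch.Tensor) -> float:
    """Score one prediction's sensitivity to protected status.

    x: features for a single individual, shape (n_features,).
    corr_with_protected: per-feature association with protected status,
        estimated offline from training data, so the protected attribute
        itself is never needed at prediction time.
    """
    x = x.clone().requires_grad_(True)
    y = model(x)          # the deployed model's prediction (single element)
    y.backward()          # gradient of the prediction w.r.t. each feature
    # Project the feature gradient onto the protected-status correlations:
    # a large magnitude flags a prediction that may change counterfactually.
    return torch.abs(x.grad @ corr_with_protected).item()

# Audit a single incoming prediction request.
x = torch.randn(5)
corr = torch.tensor([0.9, 0.1, 0.0, 0.4, 0.0])  # assumed offline estimate
print(f"prediction sensitivity: {audit_sensitivity(model, x, corr):.4f}")
```

In a continual-audit setting, a score like this could be computed for each incoming prediction and compared against a threshold, flagging high-sensitivity predictions as potential counterfactual-fairness violations for human review.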
