Paper Title

Normalise for Fairness: A Simple Normalisation Technique for Fairness in Regression Machine Learning Problems

Paper Authors

Amin, Mostafa M., Schuller, Björn W.

Paper Abstract

Algorithms and Machine Learning (ML) increasingly affect everyday life and several decision-making processes, where ML has an advantage due to scalability or superior performance. Fairness in such applications is crucial: models should not discriminate in their results based on race, gender, or other protected groups. This is especially important for models affecting very sensitive topics, like interview invitation or recidivism prediction. Compared to binary classification problems, fairness is not commonly studied for regression problems; hence, we present a simple yet effective method based on normalisation (FaiReg), which minimises the impact of unfairness in regression problems, especially unfairness due to labelling bias. We present a theoretical analysis of the method, in addition to an empirical comparison against two standard methods for fairness, namely data balancing and adversarial training. We also include a hybrid formulation (FaiRegH), merging the presented method with data balancing, in an attempt to counter labelling and sampling biases simultaneously. The experiments are conducted on the multimodal First Impressions (FI) dataset with various labels, namely Big-Five personality prediction and interview screening score. The results show that the method diminishes the effects of unfairness better than data balancing, without deteriorating the performance on the original problem as much as adversarial training does. Fairness is evaluated based on the Equal Accuracy (EA) and Statistical Parity (SP) constraints. The experiments present a setup that enhances fairness for several protected variables simultaneously.
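
The abstract only states that FaiReg is "based on normalisation" to counter labelling bias, without giving details. Below is a minimal, hypothetical sketch of one way such per-protected-group label normalisation could be implemented for a regression target; the column names ("gender", "interview_score"), the choice of z-score standardisation, and the overall procedure are assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch: standardise a regression label within each protected
# group so that all groups share the same label mean and variance. This is an
# illustrative guess at normalisation-based label debiasing, not the exact
# FaiReg algorithm from the paper.
import pandas as pd


def normalise_labels_per_group(df: pd.DataFrame,
                               label_col: str,
                               group_col: str) -> pd.Series:
    """Return the label column z-scored separately within each group."""
    def zscore(x: pd.Series) -> pd.Series:
        std = x.std(ddof=0)
        # Guard against a constant label within a group (zero variance).
        return (x - x.mean()) / std if std > 0 else x - x.mean()

    return df.groupby(group_col)[label_col].transform(zscore)


# Toy usage with hypothetical column names (illustrative only).
df = pd.DataFrame({
    "gender": ["f", "f", "m", "m", "m"],
    "interview_score": [0.60, 0.70, 0.40, 0.55, 0.50],
})
df["interview_score_normalised"] = normalise_labels_per_group(
    df, label_col="interview_score", group_col="gender")
print(df)
```

A regressor trained on the group-normalised labels no longer has an incentive to reproduce systematic offsets between protected groups that stem from biased labelling, which is the kind of labelling bias the abstract says the method targets.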
