Paper Title
Fairness in Machine Learning
Paper Authors
Paper Abstract
Machine learning based systems are reaching society at large and affecting many aspects of everyday life. This phenomenon has been accompanied by concerns about the ethical issues that may arise from the adoption of these technologies. ML fairness is a recently established area of machine learning that studies how to ensure that biases in the data and model inaccuracies do not lead to models that treat individuals unfavorably on the basis of characteristics such as race, gender, disabilities, and sexual or political orientation. In this manuscript, we discuss some of the limitations present in current reasoning about fairness and in methods that deal with it, and describe some work done by the authors to address them. More specifically, we show how causal Bayesian networks can play an important role in reasoning about and dealing with fairness, especially in complex unfairness scenarios. We describe how optimal transport theory can be used to develop methods that impose constraints on the full shapes of the distributions corresponding to different sensitive attributes, overcoming the limitation of most approaches, which approximate fairness desiderata by imposing constraints on the lower-order moments or other functions of those distributions. We present a unified framework that encompasses methods that can deal with different settings and fairness criteria, and that enjoys strong theoretical guarantees. We introduce an approach to learn fair representations that can generalize to unseen tasks. Finally, we describe a technique that accounts for legal restrictions on the use of sensitive attributes.
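The two sketches below are illustrative only and are not taken from the manuscript; all variable names, coefficients, and graph structures are hypothetical. The first shows, in a toy structural model, the kind of reasoning a causal Bayesian network enables: the sensitive attribute A influences the outcome Y both directly (treated here as unfair) and through a mediator D (treated here as fair), and comparing interventional distributions isolates the direct, unfair contribution.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def simulate(a, d=None):
    """Sample Y under do(A=a); optionally also fix the mediator via do(D=d)."""
    if d is None:
        d = (rng.random(n) < 0.3 + 0.4 * a).astype(float)  # path A -> D
    y = 0.2 * a + 0.5 * d + 0.1 * rng.standard_normal(n)   # paths A -> Y, D -> Y
    return y, d

# Total effect of A on Y: both the direct and the mediated path differ.
y1, _ = simulate(a=1)
y0, _ = simulate(a=0)
print("total effect:         ", y1.mean() - y0.mean())

# Path-specific (direct) effect: vary A while the mediator keeps the
# distribution it would have under A = 0, so only the direct path changes.
_, d0 = simulate(a=0)
y1_d0, _ = simulate(a=1, d=d0)
print("direct (unfair) path: ", y1_d0.mean() - y0.mean())
```

The second sketch illustrates the optimal transport point: a constraint on a low-order moment (here, the group means) can be satisfied while the full score distributions still differ, a discrepancy that a Wasserstein distance between the distributions detects and can penalize during training. The data and the penalty form `loss = task_loss + lam * W1` are assumptions for illustration, not the paper's formulation.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Hypothetical model scores for two groups defined by a binary sensitive
# attribute: equal means, different spreads.
scores_a = rng.normal(loc=0.6, scale=0.10, size=1000)
scores_b = rng.normal(loc=0.6, scale=0.25, size=1000)

# A mean-matching constraint sees (almost) no gap ...
print("mean gap:     ", abs(scores_a.mean() - scores_b.mean()))

# ... while the 1-Wasserstein distance compares the full distribution shapes.
print("1-Wasserstein:", wasserstein_distance(scores_a, scores_b))

# During training, this distance can act as a fairness penalty:
#   loss = task_loss + lam * wasserstein_distance(scores_a, scores_b)
```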