Paper Title

Neither Private Nor Fair: Impact of Data Imbalance on Utility and Fairness in Differential Privacy

Paper Authors

Tom Farrand, Fatemehsadat Mireshghallah, Sahib Singh, Andrew Trask

Paper Abstract

Deployment of deep learning in different fields and industries is growing day by day due to its performance, which relies on the availability of data and compute. Data is often crowd-sourced and contains sensitive information about its contributors, which leaks into models that are trained on it. To achieve rigorous privacy guarantees, differentially private training mechanisms are used. However, it has recently been shown that differential privacy can exacerbate existing biases in the data and have disparate impacts on the accuracy of different subgroups of data. In this paper, we aim to study these effects within differentially private deep learning. Specifically, we aim to study how different levels of imbalance in the data affect the accuracy and the fairness of the decisions made by the model, given different levels of privacy. We demonstrate that even small imbalances and loose privacy guarantees can cause disparate impacts.
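
To make the experimental setup concrete, below is a minimal sketch of the kind of study the abstract describes: training with DP-SGD (per-example gradient clipping plus Gaussian noise) and measuring the accuracy gap between a majority and a minority subgroup. Everything here (SimpleNet, dp_sgd_step, accuracy_gap, the hyperparameters, and the group encoding) is an illustrative assumption, not the paper's actual code; real experiments would typically use a DP library such as Opacus and a privacy accountant to track the epsilon budget.

```python
# Minimal sketch, NOT the paper's code: hand-rolled DP-SGD
# (per-example gradient clipping + Gaussian noise) plus a
# per-subgroup accuracy gap. All names and constants are
# illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

CLIP_NORM = 1.0    # per-example gradient norm bound C
NOISE_MULT = 1.1   # noise multiplier sigma; smaller sigma = looser privacy

class SimpleNet(nn.Module):
    """Tiny linear classifier standing in for the paper's models."""
    def __init__(self, d_in: int, n_classes: int = 2):
        super().__init__()
        self.fc = nn.Linear(d_in, n_classes)

    def forward(self, x):
        return self.fc(x)

def dp_sgd_step(model, optimizer, xb, yb):
    """One DP-SGD step: clip each example's gradient to CLIP_NORM,
    add N(0, (NOISE_MULT * CLIP_NORM)^2) noise to the sum, average."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xb, yb):  # microbatches of size 1 for per-example grads
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(CLIP_NORM / (norm + 1e-12), max=1.0)  # clip to C
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    optimizer.zero_grad()
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * NOISE_MULT * CLIP_NORM
        p.grad = (s + noise) / len(xb)  # noisy average gradient
    optimizer.step()

@torch.no_grad()
def accuracy_gap(model, x, y, group):
    """Majority (group == 0) accuracy minus minority (group == 1)
    accuracy: the disparate impact studied under varying imbalance."""
    pred = model(x).argmax(dim=1)
    acc = lambda mask: (pred[mask] == y[mask]).float().mean().item()
    return acc(group == 0) - acc(group == 1)
```

With a model and optimizer such as `optimizer = torch.optim.SGD(model.parameters(), lr=0.1)`, one would sweep NOISE_MULT (the privacy level) and the majority/minority ratio of the training data, then track accuracy_gap, which is the kind of imbalance-versus-privacy measurement the paper reports.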
