Paper Title
Fair Interpretable Representation Learning with Correction Vectors
Paper Authors
Paper Abstract
Neural network architectures have been extensively employed in the fair representation learning setting, where the objective is to learn a new representation for a given vector that is independent of sensitive information. Various representation debiasing techniques have been proposed in the literature. However, as neural networks are inherently opaque, these methods are hard to comprehend, which limits their usefulness. We propose a new framework for fair representation learning centered around the learning of "correction vectors", which have the same dimensionality as the given data vectors. Correction vectors may be computed either explicitly via architectural constraints or implicitly by training an invertible model based on Normalizing Flows. We show experimentally that several fair representation learning models constrained in such a way do not exhibit losses in ranking or classification performance. Furthermore, we demonstrate that state-of-the-art results can be achieved by the invertible model. Finally, we discuss the legal standing of our methodology in light of recent legislation in the European Union.
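To make the explicit variant concrete, below is a minimal PyTorch sketch of the correction-vector idea as described in the abstract. It assumes (this is our reading, not the paper's exact architecture) that a small network predicts a correction w(x) with the same dimensionality as the input x, and that the fair representation is formed as z = x + w(x), so the per-feature "edit" applied to each data vector remains directly inspectable. The class name `CorrectionVectorEncoder` and the hidden size are hypothetical.

```python
import torch
import torch.nn as nn

class CorrectionVectorEncoder(nn.Module):
    """Sketch of explicit correction-vector learning (assumed z = x + w(x))."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        # The correction network maps x to a vector of the same dimensionality,
        # matching the abstract's architectural constraint.
        self.correction = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.correction(x)  # correction vector, same shape as x
        return x + w            # debiased representation z = x + w(x)

# Usage: inspect the correction applied to a batch of data vectors.
x = torch.randn(8, 10)
model = CorrectionVectorEncoder(dim=10)
z = model(x)
print((z - x)[0])  # the learned per-feature correction for the first example
```

Because the output differs from the input only by the additive term w(x), the correction itself can be read off directly, which is the interpretability benefit the abstract claims. A fairness objective (e.g., an adversarial debiasing loss, not shown here) would still be needed during training to make z independent of the sensitive attribute.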