Paper Title
Random Isn't Always Fair: Candidate Set Imbalance and Exposure Inequality in Recommender Systems
Paper Authors
Paper Abstract
Traditionally, recommender systems operate by returning a user a set of items, ranked in order of estimated relevance to that user. In recent years, methods relying on stochastic ordering have been developed to create "fairer" rankings that reduce inequality in who or what is shown to users. Complete randomization -- ordering candidate items randomly, independent of estimated relevance -- is largely considered a baseline procedure that results in the most equal distribution of exposure. In industry settings, recommender systems often operate via a two-step process in which candidate items are first produced using computationally inexpensive methods and then a full ranking model is applied only to those candidates. In this paper, we consider the effects of inequality at the first step and show that, paradoxically, complete randomization at the second step can result in a higher degree of inequality relative to deterministic ordering of items by estimated relevance scores. In light of this observation, we then propose a simple post-processing algorithm in pursuit of reducing exposure inequality that works both when candidate sets have a high level of imbalance and when they do not. The efficacy of our method is illustrated on both simulated data and a common benchmark data set used in studying fairness in recommender systems.
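The central claim, that complete randomization can increase exposure inequality when candidate sets are imbalanced, can be illustrated with a toy calculation. The sketch below is not the paper's experiment or its post-processing algorithm. It assumes a hypothetical setup in which one "popular" item appears in every user's candidate set while each user also has one unique "niche" item that the ranker scores higher. The `1 / rank` position weights and the Gini coefficient as the inequality measure are likewise assumptions chosen for illustration.

```python
from collections import defaultdict


def position_weight(rank):
    """Assumed exposure model: position 1 gets weight 1, position 2 gets 1/2, ..."""
    return 1.0 / rank


def gini(values):
    """Gini coefficient via mean absolute difference (0 means perfectly equal)."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


def deterministic_exposure(candidate_sets):
    """Exposure per item when each candidate set is sorted by estimated relevance."""
    exposure = defaultdict(float)
    for scores in candidate_sets:
        ranked = sorted(scores, key=scores.get, reverse=True)
        for rank, item in enumerate(ranked, start=1):
            exposure[item] += position_weight(rank)
    return dict(exposure)


def expected_random_exposure(candidate_sets):
    """Expected exposure per item under a uniformly random ordering.

    Each item is equally likely to land in any position, so its expected
    exposure within a set is the mean of that set's position weights.
    """
    exposure = defaultdict(float)
    for scores in candidate_sets:
        k = len(scores)
        mean_weight = sum(position_weight(r) for r in range(1, k + 1)) / k
        for item in scores:
            exposure[item] += mean_weight
    return dict(exposure)


# Hypothetical imbalanced first stage: "popular" is a candidate for every user,
# while each niche item is a candidate for exactly one user.
candidate_sets = [{"popular": 0.4, f"niche_{u}": 0.9} for u in range(4)]

gini_det = gini(list(deterministic_exposure(candidate_sets).values()))
gini_random = gini(list(expected_random_exposure(candidate_sets).values()))

print(f"Gini, deterministic ranking: {gini_det:.3f}")
print(f"Gini, complete randomization: {gini_random:.3f}")
```

In this toy setup the deterministic ranking yields a Gini coefficient of about 0.133, while complete randomization yields 0.300. Randomization lifts the popular item out of the second position for every user at once, so its total exposure grows while each niche item, which only ever appears in one candidate set, loses exposure.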