Paper title
Taking the Counterfactual Online: Efficient and Unbiased Online Evaluation for Ranking
Paper authors
Paper abstract
Counterfactual evaluation can estimate Click-Through-Rate (CTR) differences between ranking systems based on historical interaction data, while mitigating the effect of position bias and item-selection bias. We introduce the novel Logging-Policy Optimization Algorithm (LogOpt), which optimizes the policy for logging data so that the counterfactual estimate has minimal variance. As minimizing variance leads to faster convergence, LogOpt increases the data-efficiency of counterfactual estimation. LogOpt turns the counterfactual approach - which is indifferent to the logging policy - into an online approach, where the algorithm decides what rankings to display. We prove that, as an online evaluation method, LogOpt is unbiased w.r.t. position and item-selection bias, unlike existing interleaving methods. Furthermore, we perform large-scale experiments by simulating comparisons between thousands of rankers. Our results show that while interleaving methods make systematic errors, LogOpt is as efficient as interleaving without being biased.
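To make the role of the logging policy concrete, the following is a minimal sketch of an inverse-propensity-scored (IPS) estimate of the CTR difference between two rankers. It is not LogOpt and not the estimator from the paper: it assumes a toy setting with three candidate rankings, a known click probability per ranking, and ranking-level propensities, and it ignores position and item-selection bias entirely. All names (`ips_ctr_difference`, `exact_mean_and_variance`, `simulate_logs`) and all numbers are hypothetical, chosen only to show that the estimate stays unbiased under different logging policies while its variance changes.

```python
import random


def ips_ctr_difference(logs, pi_a, pi_b, pi_log):
    """IPS estimate of CTR(A) - CTR(B) from (ranking_id, clicked) logs.

    pi_a, pi_b, pi_log map a ranking id to its display probability under
    ranker A, ranker B, and the logging policy. pi_log must be positive
    wherever pi_a or pi_b is positive (toy assumption).
    """
    total = 0.0
    for ranking_id, clicked in logs:
        weight = (pi_a[ranking_id] - pi_b[ranking_id]) / pi_log[ranking_id]
        total += clicked * weight
    return total / len(logs)


def exact_mean_and_variance(click_prob, pi_a, pi_b, pi_log):
    """Exact mean and variance of a single-interaction IPS term."""
    mean = 0.0
    second_moment = 0.0
    for r, p in enumerate(click_prob):
        weight = (pi_a[r] - pi_b[r]) / pi_log[r]
        mean += pi_log[r] * p * weight
        second_moment += pi_log[r] * p * weight * weight
    return mean, second_moment - mean ** 2


def simulate_logs(click_prob, pi_log, n, seed=0):
    """Sample n displayed rankings from pi_log with Bernoulli clicks."""
    rng = random.Random(seed)
    ids = list(range(len(click_prob)))
    logs = []
    for _ in range(n):
        r = rng.choices(ids, weights=pi_log)[0]
        logs.append((r, 1 if rng.random() < click_prob[r] else 0))
    return logs


# Hypothetical toy values: three rankings with known click probabilities.
click_prob = [0.6, 0.3, 0.1]
pi_a = [1.0, 0.0, 0.0]               # ranker A always shows ranking 0
pi_b = [0.0, 1.0, 0.0]               # ranker B always shows ranking 1
uniform_log = [1 / 3, 1 / 3, 1 / 3]
tuned_log = [0.58, 0.41, 0.01]       # hand-picked, not produced by LogOpt

mean_u, var_u = exact_mean_and_variance(click_prob, pi_a, pi_b, uniform_log)
mean_t, var_t = exact_mean_and_variance(click_prob, pi_a, pi_b, tuned_log)
estimate = ips_ctr_difference(
    simulate_logs(click_prob, tuned_log, 20000), pi_a, pi_b, tuned_log
)
```

In this toy setup both logging policies give the same expected estimate (the true difference of 0.3), but the hand-picked policy has a smaller per-interaction variance (about 1.68 versus 2.61) because it rarely displays the ranking that neither ranker would show. This is the effect the abstract attributes to choosing the logging policy: lower variance means fewer logged interactions are needed for the estimate to converge.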