Paper Title

Query-level Early Exit for Additive Learning-to-Rank Ensembles

Authors

Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Salvatore Trani

Abstract

Search engine ranking pipelines are commonly based on large ensembles of machine-learned decision trees. Tight constraints on query response time have recently motivated researchers to investigate algorithms that speed up the traversal of the additive ensemble, or that terminate early the evaluation of documents unlikely to be ranked among the top-k. In this paper, we investigate the novel problem of query-level early exiting: deciding whether it is profitable to stop the traversal of the ranking ensemble early for all the candidate documents to be scored for a query, and to return a ranking based on the additive scores computed by a limited portion of the ensemble. Besides the obvious advantage in query latency and throughput, we address the possible positive impact of query-level early exiting on ranking effectiveness. To this end, we study the actual contribution of incremental portions of the tree ensemble to the ranking of the top-k documents scored for a given query. Our main finding is that queries exhibit different behaviors as scores are accumulated during the traversal of the ensemble, and that query-level early stopping can remarkably improve ranking quality. We present a reproducible and comprehensive experimental evaluation, conducted on two public datasets, showing that query-level early exiting achieves an overall gain of up to 7.5% in terms of NDCG@10 with a speedup of the scoring process of up to 2.2x.
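
The kind of analysis described in the abstract, measuring how ranking quality changes as scores accumulate over incremental portions of the ensemble, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the `tree_scores` layout (one row of per-document contributions per tree), the `exit_points` parameter, and the exponential-gain form of NDCG are all assumptions. Because it relies on relevance labels, it is an offline study of where exiting would have helped, not a method for deciding the exit point at query time.

```python
import numpy as np


def dcg_at_k(labels_in_rank_order, k=10):
    """DCG@k with exponential gain (assumed form: (2^label - 1) / log2(rank + 1))."""
    labels = np.asarray(labels_in_rank_order, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, labels.size + 2))
    return float(np.sum((2.0 ** labels - 1.0) * discounts))


def ndcg_at_k(scores, labels, k=10):
    """NDCG@k of the ranking induced by sorting documents by descending score."""
    labels = np.asarray(labels)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ideal = dcg_at_k(np.sort(labels)[::-1], k)
    if ideal == 0.0:
        return 0.0  # no relevant document for this query
    return dcg_at_k(labels[order], k) / ideal


def ndcg_per_exit_point(tree_scores, labels, exit_points, k=10):
    """NDCG@k for one query when scoring stops after the first t trees.

    tree_scores: array of shape (n_trees, n_docs); row i holds the additive
                 contribution of tree i to each candidate document (hypothetical layout).
    exit_points: iterable of tree counts t (1-based) at which to evaluate.
    """
    cumulative = np.cumsum(np.asarray(tree_scores, dtype=float), axis=0)
    return {t: ndcg_at_k(cumulative[t - 1], labels, k) for t in exit_points}


if __name__ == "__main__":
    # Toy query: the second tree degrades the ordering found by the first.
    tree_scores = [[3.0, 2.0, 1.0], [-5.0, 0.0, 5.0]]
    labels = [2, 1, 0]
    quality = ndcg_per_exit_point(tree_scores, labels, exit_points=[1, 2], k=10)
    best_exit = max(quality, key=quality.get)
    print(quality, "best exit after", best_exit, "tree(s)")
```

In the toy query, stopping after the first tree yields a better top-k ordering than traversing the full ensemble, which is the per-query behavior the abstract refers to.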
