Paper Title

Efficient Quantile Tracking Using an Oracle

Authors

Hammer, Hugo L., Yazidi, Anis, Riegler, Michael A., Rue, Håvard

Abstract

For incremental quantile estimators, the step size and possibly other tuning parameters must be set carefully. However, little attention has been given to how to set these values in an online manner. In this article we suggest two novel procedures that address this issue. The core part of the procedures is to estimate the current tracking mean squared error (MSE). The MSE is decomposed into tracking variance and bias, and novel and efficient procedures to estimate these quantities are presented. It is shown that the estimation bias can be tracked by associating it with the portion of observations below the quantile estimates. The first procedure runs an ensemble of $L$ quantile estimators for a wide range of values of the tuning parameters, typically with around $L = 100$. In each iteration an oracle selects the best estimate, guided by the estimated MSEs. The second method runs an ensemble of only $L = 3$ estimators, so the tuning parameter values of the running estimators need to be adjusted from time to time. The procedures have a low memory footprint of $8L$ and a computational complexity of $8L$ per iteration. The experiments show that the procedures are highly efficient and track quantiles with an error close to the theoretical optimum. The Oracle approach performs best, but comes with a higher computational cost. The procedures were further applied to a massive real-life data stream of tweets, which demonstrated their real-world applicability.
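The ensemble-plus-oracle idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify their MSE estimator, so a running pinball (quantile) loss is swapped in as the selection score. The names `Tracker`, `oracle_track`, `drifting_stream`, the `smooth` weight, and the step-size grid are all hypothetical. Each tracker also keeps a running share of observations below its estimate, the quantity the abstract links to estimation bias: a share that drifts away from the target probability `q` suggests the estimate is lagging.

```python
import random


class Tracker:
    """One incremental quantile estimator with running diagnostics (illustrative)."""

    def __init__(self, q, step, smooth=0.02):
        self.q = q                # target probability, e.g. 0.5 for the median
        self.step = step          # tuning parameter: step size of the update
        self.smooth = smooth      # assumed weight for the running averages
        self.estimate = 0.0
        self.share_below = q      # running share of observations below the estimate
        self.loss = 0.0           # running pinball loss (stand-in for an MSE estimate)

    def update(self, x):
        below = 1.0 if x < self.estimate else 0.0
        # Score the current estimate on x before it is updated with x.
        pinball = (x - self.estimate) * (self.q - below)
        self.loss += self.smooth * (pinball - self.loss)
        self.share_below += self.smooth * (below - self.share_below)
        # Stochastic-approximation step: up by step*q, or down by step*(1-q).
        self.estimate += self.step * (self.q - below)


def oracle_track(stream, q, steps):
    """Run one tracker per step size; after each observation report the
    estimate of the tracker with the lowest running score."""
    ensemble = [Tracker(q, s) for s in steps]
    selected = []
    for x in stream:
        for tracker in ensemble:
            tracker.update(x)
        best = min(ensemble, key=lambda t: t.loss)
        selected.append(best.estimate)
    return selected, ensemble


def drifting_stream(n, drift=0.01, seed=0):
    """Synthetic data: unit-variance Gaussian noise around a linearly drifting mean."""
    rng = random.Random(seed)
    return [rng.gauss(drift * t, 1.0) for t in range(n)]
```

Calling `oracle_track(drifting_stream(5000), 0.5, [0.001, 0.01, 0.1, 1.0])` returns the selected estimates and the ensemble. In this setup the trackers with the smallest step sizes cannot keep up with the drift, so their `share_below` falls far below `q`, while the selection favors a faster tracker.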
