Paper Title
Adaptive Double-Exploration Tradeoff for Outlier Detection
Authors
Abstract
We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, where the objective is to identify the outliers, that is, the arms whose rewards are above a threshold. Unlike in the traditional TBP, the threshold is not given in advance: it is defined as a function of the rewards of all the arms, a choice motivated by the usual criterion for identifying outliers. The learner therefore needs to explore the rewards of the arms as well as the threshold. We refer to this problem as "double exploration for outlier detection". We construct an adaptively updated confidence interval for the threshold, based on the estimates of the threshold from previous rounds. Furthermore, by automatically trading off exploration of the individual arms against exploration of the outlier threshold, we provide an algorithm that is efficient in terms of sample complexity. Experimental results on both synthetic and real-world datasets demonstrate the efficiency of our algorithm.
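To make the "double exploration" setting concrete, the following is a minimal sketch of the problem, not the paper's algorithm. The abstract does not state which function of the rewards defines the threshold, so the sketch assumes a common outlier rule: the mean of the arm means plus k standard deviations. The names `outlier_threshold` and `naive_double_exploration`, the Gaussian reward noise, and all parameter values are hypothetical. The sampler is a uniform round-robin baseline that estimates the arms and the threshold together; the adaptive confidence interval and the automatic tradeoff described above are not implemented here.

```python
import math
import random


def outlier_threshold(means, k=2.0):
    """Assumed rule: mean of the arm means plus k population standard deviations."""
    n = len(means)
    mu = sum(means) / n
    sigma = math.sqrt(sum((m - mu) ** 2 for m in means) / n)
    return mu + k * sigma


def naive_double_exploration(true_means, k=2.0, rounds=500, noise=0.1, seed=0):
    """Uniform-sampling baseline (not the adaptive method from the abstract).

    Every arm is pulled `rounds` times with Gaussian noise. The threshold is
    then estimated from the same empirical means used to judge each arm, so
    the uncertainty in the arms and in the threshold is explored jointly.
    """
    rng = random.Random(seed)
    sums = [0.0] * len(true_means)
    for _ in range(rounds):
        for i, m in enumerate(true_means):
            sums[i] += rng.gauss(m, noise)
    estimates = [s / rounds for s in sums]
    theta = outlier_threshold(estimates, k)
    outliers = [i for i, e in enumerate(estimates) if e > theta]
    return outliers, theta


if __name__ == "__main__":
    # Five ordinary arms and one arm with a much larger expected reward.
    arms = [0.10, 0.12, 0.09, 0.11, 0.10, 0.90]
    found, theta_hat = naive_double_exploration(arms)
    print("estimated threshold:", round(theta_hat, 3))
    print("arms flagged as outliers:", found)
```

A uniform baseline of this kind spends the same number of pulls on every arm, including arms that are far from the threshold. That waste is what an adaptive allocation between arm exploration and threshold exploration aims to reduce.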