Paper Title


Meta-AAD: Active Anomaly Detection with Deep Reinforcement Learning

Authors

Daochen Zha, Kwei-Herng Lai, Mingyang Wan, Xia Hu

Abstract


A high false-positive rate is a long-standing challenge for anomaly detection algorithms, especially in high-stakes applications. To identify the true anomalies, in practice, analysts or domain experts are employed to investigate, one by one, the top instances in a ranked list of anomalies produced by an anomaly detection system. This verification procedure generates informative labels that can be leveraged to re-rank the anomalies, helping the analyst discover more true anomalies within a given time budget. Several re-ranking strategies have been proposed to approximate this sequential decision process. Specifically, existing strategies focus on making the top instances more likely to be anomalous based on the feedback, and then greedily select the top-1 instance for query. However, these greedy strategies can be sub-optimal, since some low-ranked instances may be more helpful in the long term. In this work, we propose Active Anomaly Detection with Meta-Policy (Meta-AAD), a novel framework that learns a meta-policy for query selection. Specifically, Meta-AAD leverages deep reinforcement learning to train the meta-policy to select the most suitable instance to query, explicitly optimizing the number of discovered anomalies throughout the querying process. Meta-AAD is easy to deploy, since a trained meta-policy can be directly applied to any new dataset without further tuning. Extensive experiments on 24 benchmark datasets demonstrate that Meta-AAD significantly outperforms state-of-the-art re-ranking strategies and the unsupervised baseline. The empirical analysis shows that the trained meta-policy is transferable and inherently strikes a balance between long-term and short-term rewards.
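The greedy baseline described in the abstract can be sketched as a simple query loop: repeatedly hand the current top-ranked unlabeled instance to the analyst and count how many verified anomalies are found within the budget. This is a minimal illustration of the baseline strategy only, not the paper's implementation; Meta-AAD instead trains a reinforcement-learning meta-policy that may pick a lower-ranked instance when doing so yields more discovered anomalies over the whole querying process. The function name and inputs are hypothetical.

```python
def greedy_query_loop(scores, labels, budget):
    """Greedy top-1 querying baseline (illustrative sketch).

    scores: anomaly scores from an unsupervised detector, one per instance.
    labels: ground-truth analyst verdicts (1 = true anomaly, 0 = false positive),
            revealed only when an instance is queried.
    budget: number of instances the analyst can verify.
    Returns the number of true anomalies discovered.
    """
    unqueried = set(range(len(scores)))
    discovered = 0
    for _ in range(min(budget, len(scores))):
        # Greedy choice: always query the highest-scoring unlabeled instance.
        idx = max(unqueried, key=lambda i: scores[i])
        unqueried.remove(idx)
        discovered += labels[idx]  # analyst verifies this instance
    return discovered
```

With scores `[0.9, 0.8, 0.1, 0.95]` and labels `[0, 1, 0, 1]`, a budget of 2 queries instances 3 and 0 and discovers one anomaly, even though spending the second query on instance 1 would have found two; this is the sub-optimality of greedy selection that the abstract points out.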
