Temporal Detection of Anomalies via Actor-Critic Based Controlled Sensing

Paper Title

Temporal Detection of Anomalies via Actor-Critic Based Controlled Sensing

Authors

Geethu Joseph, M. Cenk Gursoy, Pramod K. Varshney

Abstract

We address the problem of monitoring a set of binary stochastic processes and generating an alert when the number of anomalies among them exceeds a threshold. For this, the decision-maker selects and probes a subset of the processes to obtain noisy estimates of their states (normal or anomalous). Based on the received observations, the decision-maker first determines whether to declare that the number of anomalies has exceeded the threshold or to continue taking observations. When the decision is to continue, it then decides whether to collect observations at the next time instant or defer them to a later time. If it chooses to collect observations, it further determines the subset of processes to be probed. To devise this three-step sequential decision-making process, we use a Bayesian formulation wherein we learn the posterior probability on the states of the processes. Using the posterior probability, we construct a Markov decision process and solve it using deep actor-critic reinforcement learning. Via numerical experiments, we demonstrate the superior performance of our algorithm compared to the traditional model-based algorithms.
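The Bayesian step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the processes are independent and static, that each probe returns the true state flipped with a known probability `flip_prob` (strictly between 0 and 1), and that the alert statistic is the posterior probability that the anomaly count exceeds the threshold. The function names `update_posterior` and `prob_anomalies_exceed` are hypothetical, and the actor-critic policy that chooses when and what to probe is not shown.

```python
from typing import Dict, List


def update_posterior(belief: List[float],
                     observations: Dict[int, int],
                     flip_prob: float) -> List[float]:
    """Bayes update of P(process i is anomalous) for the probed processes only.

    Assumption: observation y equals the true state with probability
    1 - flip_prob, independently for each probed process.
    """
    updated = list(belief)
    for i, y in observations.items():
        # Likelihood of y given state 1 (anomalous) and state 0 (normal).
        like_anomalous = 1.0 - flip_prob if y == 1 else flip_prob
        like_normal = flip_prob if y == 1 else 1.0 - flip_prob
        numerator = like_anomalous * belief[i]
        updated[i] = numerator / (numerator + like_normal * (1.0 - belief[i]))
    return updated


def prob_anomalies_exceed(belief: List[float], threshold: int) -> float:
    """P(number of anomalous processes > threshold), assuming independence."""
    # dist[k] = probability that exactly k of the processes handled so far
    # are anomalous (a Poisson-binomial distribution built one process at a time).
    dist = [1.0]
    for p in belief:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        dist = nxt
    return sum(dist[threshold + 1:])
```

In a loop of the kind the abstract describes, a decision-maker could call `update_posterior` after each probe and compare `prob_anomalies_exceed` against a confidence level to decide whether to raise the alert. The comparison rule is likewise an assumption of this sketch.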
