论文标题
Hashnwalk:Hash和随机步行基于Hyperdegy流中的异常检测
HashNWalk: Hash and Random Walk Based Anomaly Detection in Hyperedge Streams
论文作者
论文摘要
群体互动的序列,例如电子邮件,在线讨论和共同撰文,无处不在;而且它们自然地表示为一系列Hyperedges。尽管具有广泛的潜在应用,但与图中相比,在超图(即,超果中的集合)中的异常检测几乎没有引起关注。根据我们的实验,虽然诱使减少图形的超图并应用现有的基于图的方法,但要考虑到高刻度的高阶结构值得考虑。我们提出了Hashnwalk,这是一种增量算法,可检测Hyperedges流中的异常。它维护和更新有关流的结构和时间信息的恒定尺寸摘要。使用该摘要(即接近矩阵的形式)Hashnwalk测量每个新的Hypereedge的异常性。 hashnwalk是(a)快速:它在几个小时内接近实时地处理每个高架,数十亿个超息,(b)空间有效:维护的摘要的大小是预定义的常数,(c)有效:它成功地检测到了现实世界中的超graphs中的异常超匹配。
Sequences of group interactions, such as emails, online discussions, and co-authorships, are ubiquitous; and they are naturally represented as a stream of hyperedges. Despite their broad potential applications, anomaly detection in hypergraphs (i.e., sets of hyperedges) has received surprisingly little attention, compared to that in graphs. While it is tempting to reduce hypergraphs to graphs and apply existing graph-based methods, according to our experiments, taking higher-order structures of hypergraphs into consideration is worthwhile. We propose HashNWalk, an incremental algorithm that detects anomalies in a stream of hyperedges. It maintains and updates a constant-size summary of the structural and temporal information about the stream. Using the summary, which is the form of a proximity matrix, HashNWalk measures the anomalousness of each new hyperedge as it appears. HashNWalk is (a) Fast: it processes each hyperedge in near real-time and billions of hyperedges within a few hours, (b) Space Efficient: the size of the maintained summary is a predefined constant, (c) Effective: it successfully detects anomalous hyperedges in real-world hypergraphs.