论文标题
通过有效抽样有效地找到最大集团摘要
Efficiently Finding a Maximal Clique Summary via Effective Sampling
论文作者
论文摘要
最大集团枚举(MCE)是图理论中的一个基本问题,用于许多应用程序,例如社交网络分析,生物信息学,智能代理系统,网络安全等。大多数现有的MCE算法都专注于提高效率而不是降低输出尺寸。不幸的是,输出可能包括大量最大集团。在本文中,我们研究了如何报告较少重叠的最大集团的摘要。但是,在研究了开拓者的方法之后,我们认为这个问题是不令人满意的。为了沿着这条线进行研究,我们的论文试图做出四个贡献:(a)我们提出了一种更有效的抽样策略,该策略产生了较小的摘要,但仍然确保摘要可以某种程度上可以见证所有最大集团的所有最大集团以及摘要所见证的每个最大概括的期望,都在预定的theheshold上方; (b)我们证明在某些最佳条件下采样策略是最佳的; (c)我们应用集团大小的边界并设计新的枚举订单来应对最佳条件; (d)为了通过实验验证,我们测试了具有多种图形特征的八个真实基准数据集。结果表明,我们的新采样策略通过产生较小的摘要并在所有数据集上运行速度更快地超过最先进的方法。
Maximal clique enumeration (MCE) is a fundamental problem in graph theory and is used in many applications, such as social network analysis, bioinformatics, intelligent agent systems, cyber security, etc. Most existing MCE algorithms focus on improving the efficiency rather than reducing the output size. The output unfortunately could consist of a large number of maximal cliques. In this paper, we study how to report a summary of less overlapping maximal cliques. The problem was studied before, however, after examining the pioneer approach, we consider it still not satisfactory. To advance the research along this line, our paper attempts to make four contributions: (a) we propose a more effective sampling strategy, which produces a much smaller summary but still ensures that the summary can somehow witness all the maximal cliques and the expectation of each maximal clique witnessed by the summary is above a predefined threshold; (b) we prove that the sampling strategy is optimal under certain optimality conditions; (c) we apply clique-size bounding and design new enumeration order to approach the optimality conditions; and (d) to verify experimentally, we test eight real benchmark datasets that have a variety of graph characteristics. The results show that our new sampling strategy consistently outperforms the state-of-the-art approach by producing smaller summaries and running faster on all the datasets.