论文标题
在大型现实世界中列出最大k-plexes
Listing Maximal k-Plexes in Large Real-World Graphs
论文作者
论文摘要
在大图中列出密集的子图在网络分析应用程序等各种社区检测等品种中起关键任务。作为最密集的模型,集团已广泛研究。但是,实际上,由于各种原因,例如数据噪声,社区很少形成为集团。因此,$ k $ - plex, - 除了最多$ k $顶点(最多$ k $ dertices)附近的每个顶点的图形都被引入了一个轻松的clique版本。通常,为了更好地模拟凝聚力的社区,重点是与小$ k $的连接$ k $ - 杂物。在本文中,我们将继续列出所有最大$ k $ plexes和Maximal $ k $ - 开处方尺寸的研究行。我们的第一个贡献是算法ListPlex,它列出了每个常数$ k $的所有最大$ k $ - plexes in $ o^*(γ^d)$时间,其中$γ$的值与$ k $相关,但严格小于2 $,而$ d $是该图的脱发,远小于vertex number $ n $ n $ n $ n $ n $ n $ n $ n $ n $ n $ n $。与$ 2^n $的微不足道相比,改进是显着的,我们的界限比所有以前已知的结果都要好。实际上,我们进一步使用多种技术来加速给定尺寸的$ k $ plexes,例如基于结构性的修剪规则,缓存有效的数据结构和并行技术。所有这些共同产生了非常实用的算法。经验结果表明,我们的方法的表现优于最先进的解决方案。
Listing dense subgraphs in large graphs plays a key task in varieties of network analysis applications like community detection. Clique, as the densest model, has been widely investigated. However, in practice, communities rarely form as cliques for various reasons, e.g., data noise. Therefore, $k$-plex, -- graph with each vertex adjacent to all but at most $k$ vertices, is introduced as a relaxed version of clique. Often, to better simulate cohesive communities, an emphasis is placed on connected $k$-plexes with small $k$. In this paper, we continue the research line of listing all maximal $k$-plexes and maximal $k$-plexes of prescribed size. Our first contribution is algorithm ListPlex that lists all maximal $k$-plexes in $O^*(γ^D)$ time for each constant $k$, where $γ$ is a value related to $k$ but strictly smaller than 2, and $D$ is the degeneracy of the graph that is far less than the vertex number $n$ in real-word graphs. Compared to the trivial bound of $2^n$, the improvement is significant, and our bound is better than all previously known results. In practice, we further use several techniques to accelerate listing $k$-plexes of a given size, such as structural-based prune rules, cache-efficient data structures, and parallel techniques. All these together result in a very practical algorithm. Empirical results show that our approach outperforms the state-of-the-art solutions by up to orders of magnitude.