论文标题
Bayan算法:通过模块化的精确和近似优化检测网络中的社区
Bayan Algorithm: Detecting Communities in Networks Through Exact and Approximate Optimization of Modularity
论文作者
论文摘要
社区检测是一个经典的网络问题,在各个领域都有广泛的应用。它最常见的方法是使用模块化最大化的启发式方法,该启发式方法很少返回最佳分区或类似的任何东西。具有全球最佳模块化的分区很难计算,因此没有被逐渐解散。使用结构上多样化的网络,我们比较了30种社区检测方法,包括我们提出的算法,该算法提供了最佳和近似保证:Bayan算法。与现有方法不同,Bayan在全球范围内最大化模块化或将其近似于一个因素。我们的结果表明,在两个标准基准中,以高于大多数替代方案检索种植的速率的最大模块化分区的最大准确性和稳定性。与其他29种算法的分区相比,最大模块化分区的中位数最佳,用于描述长度,覆盖范围,性能,平均电导率和聚集良好。这些优点是以Bayan为小型网络(在其最大的连接组件中具有3000个边缘的网络)所能实现的其他计算为代价的。 Bayan比使用开源和商业求解器来模块化最大化的速度快几倍,这使其能够为无法通过任何其他现有方法优化的实例找到最佳分区。我们的结果指出了一些表现出色的算法,其中Bayan是小型网络最可靠的方法。 Bayan算法(Bayanpy)的Python实施可通过Python的包装安装程序公开获得。
Community detection is a classic network problem with extensive applications in various fields. Its most common method is using modularity maximization heuristics which rarely return an optimal partition or anything similar. Partitions with globally optimal modularity are difficult to compute, and therefore have been underexplored. Using structurally diverse networks, we compare 30 community detection methods including our proposed algorithm that offers optimality and approximation guarantees: the Bayan algorithm. Unlike existing methods, Bayan globally maximizes modularity or approximates it within a factor. Our results show the distinctive accuracy and stability of maximum-modularity partitions in retrieving planted partitions at rates higher than most alternatives for a wide range of parameter settings in two standard benchmarks. Compared to the partitions from 29 other algorithms, maximum-modularity partitions have the best medians for description length, coverage, performance, average conductance, and well clusteredness. These advantages come at the cost of additional computations which Bayan makes possible for small networks (networks that have up to 3000 edges in their largest connected component). Bayan is several times faster than using open-source and commercial solvers for modularity maximization, making it capable of finding optimal partitions for instances that cannot be optimized by any other existing method. Our results point to a few well performing algorithms, among which Bayan stands out as the most reliable method for small networks. A Python implementation of the Bayan algorithm (bayanpy) is publicly available through the package installer for Python.