论文标题
$ p $ - 纳米流量扩散用于本地图集群
$p$-Norm Flow Diffusion for Local Graph Clustering
论文作者
论文摘要
局部图聚类和紧密相关的种子集扩展问题是图表上的原始图,这些图是广泛的分析和学习任务的核心,例如本地聚类,社区检测,节点排名和特征推断。对本地图聚类的先前工作主要属于两类,分别具有数值和组合根。在这项工作中,我们从这两个领域中汲取灵感,并根据$ p \ in(1,\ infty)$的$ p \ in-p-norm网络流的扩散概念提出了一个凸优化公式的家族。在局部聚类的背景下,我们表征了这些优化问题的最佳解决方案,并显示了它们在围绕输入种子集的低电导降低时的有用性。特别是,在$ p = 2 $的情况下,我们实现了二次电导近似,类似于光谱方法的cheeger型界限,当$ p \ rightarrow \ rightarrow \ infty $类似于基于max-flow的方法的情况下,恒定因子近似值,以及之间的一般$ p $ p $ p $。因此,我们的优化公式可以看作是桥接数值和组合方法,并且我们可以在速度和噪声稳健性方面实现两全其美。我们表明,提出的问题可以在$ p \ ge 2 $的强烈本地运行时间中解决,并在合成和现实世界图上进行经验评估,以说明我们的方法与现有方法相比有利。
Local graph clustering and the closely related seed set expansion problem are primitives on graphs that are central to a wide range of analytic and learning tasks such as local clustering, community detection, nodes ranking and feature inference. Prior work on local graph clustering mostly falls into two categories with numerical and combinatorial roots respectively. In this work, we draw inspiration from both fields and propose a family of convex optimization formulations based on the idea of diffusion with p-norm network flow for $p\in (1,\infty)$. In the context of local clustering, we characterize the optimal solutions for these optimization problems and show their usefulness in finding low conductance cuts around input seed set. In particular, we achieve quadratic approximation of conductance in the case of $p=2$ similar to the Cheeger-type bounds of spectral methods, constant factor approximation when $p\rightarrow\infty$ similar to max-flow based methods, and a smooth transition for general $p$ values in between. Thus, our optimization formulation can be viewed as bridging the numerical and combinatorial approaches, and we can achieve the best of both worlds in terms of speed and noise robustness. We show that the proposed problem can be solved in strongly local running time for $p\ge 2$ and conduct empirical evaluations on both synthetic and real-world graphs to illustrate our approach compares favorably with existing methods.