论文标题

一个简单的双方图投影模型,用于聚类在网络中

A simple bipartite graph projection model for clustering in networks

论文作者

Benson, Austin R., Liu, Paul, Yin, Hao

论文摘要

图数据集通常是通过双分图图的投影构造的,如果两部分图中共享一个共同的邻居,则在投影中连接两个节点;例如,共授权图是作者出版物两部分图的投影。分析投影图的结构很常见,但是我们对投影对此类分析的后果没有很好的了解。在这里,我们提出和分析一个随机图模型,以研究从投影步骤中可以期望的属性。我们的模型基于一个用于构建双方表示的chung-lu随机图,这使我们能够严格分析投影图。我们表明,可以通过这个简单的模型来解释和分析常见的网络属性,例如稀疏性,重尾度分布,节点上的本地聚类,节点程度之间的反相关关系和全局传递性。我们还为我们的模型开发了一种快速采样算法,我们证明这对于某些输入分布非常最佳。模型参数来自现实世界数据集的数值模拟显示,某些数据集中的许多聚类行为都可以通过投影步骤来解释。

Graph datasets are frequently constructed by a projection of a bipartite graph, where two nodes are connected in the projection if they share a common neighbor in the bipartite graph; for example, a coauthorship graph is a projection of an author-publication bipartite graph. Analyzing the structure of the projected graph is common, but we do not have a good understanding of the consequences of the projection on such analyses. Here, we propose and analyze a random graph model to study what properties we can expect from the projection step. Our model is based on a Chung-Lu random graph for constructing the bipartite representation, which enables us to rigorously analyze the projected graph. We show that common network properties such as sparsity, heavy-tailed degree distributions, local clustering at nodes, the inverse relationship between node degree, and global transitivity can be explained and analyzed through this simple model. We also develop a fast sampling algorithm for our model, which we show is provably optimal for certain input distributions. Numerical simulations where model parameters come from real-world datasets show that much of the clustering behavior in some datasets can just be explained by the projection step.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源