论文标题
有效的随机步行基于逆度
Efficient Random Walk based Sampling with Inverse Degree
论文作者
论文摘要
近年来,随机步行采样方法已被广泛用于图形采样,而该方法对样本中的高度节点具有偏差。为了克服这种缺陷,经典方法,例如通过重复低度节点的同时拒绝高度节点来加权行走,以便马尔可夫链的长期行为可以实现均匀的分布。但是,这种修改可能会使采样器在同一节点中保持几次,从而导致采样。为了解决此问题,我们提出了一个仅需要当前和候选节点学位的采样框架来提高图形采样方法的性能。我们还将我们的原始想法扩展到了一个更一般的框架。我们的扩展IDRW方法在MHRW中发现了SRW的大偏差问题和样本排斥问题之间的平衡。我们通过在各种现实世界数据集上进行大量实验来评估模拟技术的技术,结果表明,与最先进的技术相比,我们的方法提高了准确性。我们还研究了参数的效果,并给出了建议的范围,以更好地使用应用。
Random walk sampling methods have been widely used in graph sampling in recent years, while it has bias towards higher degree nodes in the sample. To overcome this deficiency, classical methods such as MHRW design weighted walking by repeating low-degree nodes while rejecting high-degree nodes, so that the long-term behavior of Markov chain can achieve uniform distribution. This modification, however, may make the sampler stay in the same node for several times, leading to undersampling. To address this issue, we propose a sampling framework that only need current and candidate node degree to improve the performance of graph sampling methods. We also extend our original idea to a more general framework. Our extended IDRW method finds a balance between the large deviation problem of SRW and sample rejection problem in MHRW. We evaluate our technique in simulation by running extensive experiments on various real-world datasets, and the result show that our method improves the accuracy compared with the state of art techniques. We also investigate the effect of the parameter and give the suggested range for a better usage in application.