Paper Title
InfiniteWalk: Deep Network Embeddings as Laplacian Embeddings with a Nonlinearity
Paper Authors
Paper Abstract
The skip-gram model for learning word embeddings (Mikolov et al. 2013) has been widely popular, and DeepWalk (Perozzi et al. 2014), among other methods, has extended the model to learning node representations from networks. Recent work of Qiu et al. (2018) provides a closed-form expression for the DeepWalk objective, obviating the need for sampling for small datasets and improving accuracy. In these methods, the "window size" T within which words or nodes are considered to co-occur is a key hyperparameter. We study the objective in the limit as T goes to infinity, which allows us to simplify the expression of Qiu et al. We prove that this limiting objective corresponds to factoring a simple transformation of the pseudoinverse of the graph Laplacian, linking DeepWalk to extensive prior work in spectral graph embeddings. Further, we show that by applying a simple nonlinear entrywise transformation to this pseudoinverse, we recover a good approximation of the finite-T objective and embeddings that are competitive with those from DeepWalk and other skip-gram methods in multi-label classification. Surprisingly, we find that even simple binary thresholding of the Laplacian pseudoinverse is often competitive, suggesting that the core advancement of recent methods is a nonlinearity on top of the classical spectral embedding approach.
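To make the thresholding claim at the end of the abstract concrete, the following is a minimal NumPy sketch of that variant: form the combinatorial graph Laplacian, take its pseudoinverse, binarize it entrywise, and factorize the result to obtain node embeddings. The function name `embed_thresholded_pinv`, the threshold of zero, the ±1 encoding, and the eigendecomposition-based factorization are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def embed_thresholded_pinv(A, dim=2, threshold=0.0):
    """Sketch: binary-threshold the Laplacian pseudoinverse entrywise,
    then factorize it to get `dim`-dimensional node embeddings.
    `A` is a dense symmetric adjacency matrix (NumPy array).
    Threshold value and +/-1 encoding are assumptions for illustration."""
    deg = A.sum(axis=1)
    L = np.diag(deg) - A                          # combinatorial Laplacian L = D - A
    L_pinv = np.linalg.pinv(L)                    # pseudoinverse L^+
    M = np.where(L_pinv > threshold, 1.0, -1.0)   # entrywise binary threshold
    M = (M + M.T) / 2                             # enforce symmetry for eigh
    w, V = np.linalg.eigh(M)
    top = np.argsort(-np.abs(w))[:dim]            # keep top-|eigenvalue| components
    return V[:, top] * np.sqrt(np.abs(w[top]))    # rows are node embeddings

# Tiny usage example on a 4-node path graph:
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = embed_thresholded_pinv(A, dim=2)
print(X.shape)  # (4, 2)
```

Per the abstract, the closer approximation to the finite-T DeepWalk objective uses a smooth nonlinear entrywise transformation of the pseudoinverse rather than a hard threshold; the sketch above only illustrates the simpler binarized variant.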