Paper Title

Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning

Paper Authors

Muhan Zhang, Pan Li, Yinglong Xia, Kai Wang, Long Jin

Paper Abstract

In this paper, we provide a theory of using graph neural networks (GNNs) for multi-node representation learning, where we are interested in learning a representation for a set of more than one node, such as a link. GNNs are designed to learn single-node representations. When we want to learn a node set representation involving multiple nodes, a common practice in previous works is to directly aggregate the single-node representations obtained by a GNN into a joint node set representation. In this paper, we show a fundamental constraint of such an approach, namely the inability to capture the dependence between nodes in the node set, and argue that directly aggregating individual node representations does not lead to an effective joint representation for multiple nodes. Then, we notice that a few previous successful works on multi-node representation learning, including SEAL, Distance Encoding, and ID-GNN, all used node labeling. These methods first label nodes in the graph according to their relationships with the target node set before applying a GNN. Then, the node representations obtained in the labeled graph are aggregated into a node set representation. By investigating their inner mechanisms, we unify these node labeling techniques into a single, most general form -- the labeling trick. We prove that with the labeling trick, a sufficiently expressive GNN learns the most expressive node set representations, and can thus in principle solve any joint learning task over node sets. Experiments on one important two-node representation learning task, link prediction, verify our theory. Our work explains the superior performance of previous node-labeling-based methods and establishes a theoretical foundation for using GNNs for multi-node representation learning.
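To make the pipeline described in the abstract concrete, below is a minimal sketch (not the authors' code) of the simplest instance of the labeling trick, the zero-one labeling used for link prediction: nodes in the target set get label 1 and all others get 0, the label is appended to the node features, a GNN is run on the labeled graph, and only the target nodes' representations are aggregated. The NumPy mean-aggregation propagation and the names `zero_one_label`, `gnn_forward`, and `node_set_representation` are illustrative assumptions standing in for "a sufficiently expressive GNN".

```python
import numpy as np

def zero_one_label(num_nodes, target_set):
    """Label each node 1 if it belongs to the target node set, else 0."""
    lab = np.zeros((num_nodes, 1))
    lab[list(target_set)] = 1.0
    return lab

def gnn_forward(adj, x, num_layers=2):
    """Toy mean-aggregation message passing; a real GNN would use learned weights."""
    a_hat = adj + np.eye(adj.shape[0])                 # add self-loops
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)   # row-normalize
    h = x
    for _ in range(num_layers):
        h = np.tanh(a_hat @ h)
    return h

def node_set_representation(adj, x, target_set):
    """Label the graph w.r.t. the target set, run the GNN, then pool the target nodes."""
    labels = zero_one_label(adj.shape[0], target_set)
    h = gnn_forward(adj, np.concatenate([x, labels], axis=1))
    return h[list(target_set)].sum(axis=0)             # aggregate target nodes only

# Example: compute a joint representation for the candidate link (0, 3).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.eye(4)                                          # one-hot node features
print(node_set_representation(adj, x, {0, 3}))
```

Note how this differs from the direct-aggregation baseline criticized in the abstract: without the label channel, the GNN's output for nodes 0 and 3 would be computed independently of which pair is being queried, so the pooled vector could not reflect the dependence between the two target nodes.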
