Paper Title

Embedding Compression with Hashing for Efficient Representation Learning in Large-Scale Graph

Paper Authors

Chin-Chia Michael Yeh, Mengting Gu, Yan Zheng, Huiyuan Chen, Javid Ebrahimi, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei Zhang

Paper Abstract

Graph neural networks (GNNs) are deep learning models designed specifically for graph data, and they typically rely on node features as the input to the first layer. When applying such a network to a graph without node features, one can either extract simple graph-based node features (e.g., node degree) or learn the input node representations (i.e., embeddings) while training the network. Although the latter approach, which trains node embeddings, is more likely to yield better performance, the number of parameters associated with the embeddings grows linearly with the number of nodes. It is therefore impractical to train the input node embeddings together with GNNs in an end-to-end fashion within graphics processing unit (GPU) memory when dealing with industrial-scale graph data. Inspired by embedding compression methods developed for natural language processing (NLP) tasks, we develop a node embedding compression method in which each node is compactly represented with a bit vector instead of a floating-point vector. The parameters used in the compression method can be trained together with the GNN. We show that the proposed node embedding compression method achieves superior performance compared to the alternatives.
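
The abstract only describes the idea at a high level. As a rough illustration of how a hashing-based compression layer could sit in front of a GNN, below is a minimal PyTorch sketch in which each node ID is hashed to a fixed bit vector and only a small projection layer is trained end-to-end. The hash scheme, bit width, and projection layer here are assumptions for illustration, not the paper's exact construction.

```python
import torch
import torch.nn as nn


class HashedNodeEmbedding(nn.Module):
    """Minimal sketch of hashing-based node embedding compression.

    Each node ID is mapped to a fixed bit vector by cheap hash functions,
    and a small trainable layer turns the bits into a dense embedding.
    NOTE: the multiplicative hash and the bit width below are illustrative
    assumptions, not the construction from the paper.
    """

    def __init__(self, num_bits: int = 64, embed_dim: int = 128, seed: int = 0):
        super().__init__()
        gen = torch.Generator().manual_seed(seed)
        # One random odd multiplier per bit position (multiplicative hashing).
        multipliers = torch.randint(1, 2**31 - 1, (num_bits,), generator=gen) | 1
        self.register_buffer("multipliers", multipliers)
        # The only trainable parameters: O(num_bits * embed_dim) values,
        # independent of the number of nodes in the graph.
        self.projection = nn.Linear(num_bits, embed_dim)

    def bit_vector(self, node_ids: torch.Tensor) -> torch.Tensor:
        # Hash each node ID once per bit position and keep a single bit.
        hashed = node_ids.to(torch.int64).unsqueeze(-1) * self.multipliers
        return ((hashed >> 15) & 1).float()  # shape: (num_nodes, num_bits)

    def forward(self, node_ids: torch.Tensor) -> torch.Tensor:
        # Bit vectors are fixed by the hash; gradients flow only into the
        # projection, so it can be trained together with a downstream GNN.
        return self.projection(self.bit_vector(node_ids))


# Usage: produce first-layer input features for a batch of node IDs.
embedder = HashedNodeEmbedding(num_bits=64, embed_dim=128)
features = embedder(torch.arange(5))  # (5, 128), fed to the GNN's first layer
```

The design point the sketch captures is the one the abstract emphasizes: trainable memory scales with the length of the bit code rather than with the number of nodes, which is what makes end-to-end training feasible on industrial-scale graphs.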
