Paper Title

Learning by Sampling and Compressing: Efficient Graph Representation Learning with Extremely Limited Annotations

Authors

Xiaoming Liu, Qirui Li, Chao Shen, Xi Peng, Yadong Zhou, Xiaohong Guan

Abstract

Graph convolutional networks (GCNs) have attracted intensive research interest and enjoy broad applications. While existing work has mainly focused on designing novel GCN architectures for better performance, few studies address a practical yet challenging problem: how can GCNs be learned from data with extremely limited annotation? In this paper, we propose a new learning method that combines an adaptive sampling strategy with model compression to overcome this challenge. Our approach has several advantages: 1) the adaptive sampling strategy largely suppresses GCN training deviation compared with uniform sampling; 2) compressed GCN-based methods with fewer parameters need less labeled data to train; 3) the smaller training set reduces the human cost of labeling it. We choose six popular GCN baselines and conduct extensive experiments on three real-world datasets. The results show that, by applying our method, all GCN baselines cut their annotation requirement by as much as 90% and compress their parameter scale by more than 6× without sacrificing their strong performance. This verifies that the training method can extend existing semi-supervised GCN-based methods to scenarios with an extremely small amount of labeled data.
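
The abstract only names the two ingredients at a high level. The sketch below is a minimal, hypothetical illustration (not the authors' released code) of what they could look like in PyTorch: a GCN with a deliberately small hidden width standing in for the compressed model, and a degree-based scorer standing in for a non-uniform choice of which nodes to annotate. The names TinyGCN and pick_nodes_to_label, the hidden size of 16, and the degree criterion are all illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only; the paper's actual sampling and compression schemes
# are not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGCN(nn.Module):
    """Two-layer GCN with a deliberately small hidden width (hypothetical: 16)."""
    def __init__(self, in_dim, num_classes, hidden=16):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden, bias=False)
        self.w2 = nn.Linear(hidden, num_classes, bias=False)

    def forward(self, x, adj_norm):
        # adj_norm is the symmetrically normalized adjacency D^{-1/2}(A+I)D^{-1/2}
        h = F.relu(adj_norm @ self.w1(x))
        return adj_norm @ self.w2(h)

def normalize_adj(adj):
    """Return D^{-1/2}(A+I)D^{-1/2} as a dense tensor (fine for small graphs)."""
    a = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)

def pick_nodes_to_label(adj, budget):
    """Toy non-uniform sampler: prefer high-degree nodes over a uniform pick.
    The paper's adaptive criterion is not specified here; degree is a stand-in."""
    degree = adj.sum(dim=1)
    return torch.topk(degree, k=budget).indices

if __name__ == "__main__":
    # Tiny synthetic example: 10 nodes, 5 features, 3 classes, random symmetric graph.
    n, f, c = 10, 5, 3
    x = torch.randn(n, f)
    adj = (torch.rand(n, n) > 0.7).float()
    adj = ((adj + adj.t()) > 0).float()
    adj.fill_diagonal_(0)
    labeled = pick_nodes_to_label(adj, budget=3)   # 3 nodes selected for annotation
    logits = TinyGCN(f, c)(x, normalize_adj(adj))  # (10, 3) class scores
    print(labeled.tolist(), logits.shape)
```

In this reading, only the nodes returned by pick_nodes_to_label would be annotated and used as the supervised set for training the compact model with the usual cross-entropy loss; how the adaptive strategy actually scores nodes is specified in the paper itself.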
