Paper Title
Community Search: A Meta-Learning Approach
Authors
Abstract
Community Search (CS) is one of the fundamental graph analysis tasks and a building block of various real-world applications. Given a set of query nodes, CS aims to find cohesive subgraphs that the query nodes belong to. Recently, a large number of CS algorithms have been designed. These algorithms adopt predefined subgraph patterns to model communities, and therefore cannot find ground-truth communities in real-world graphs that do not follow such predefined patterns. Consequently, machine learning (ML) and deep learning (DL) based approaches have been proposed to capture flexible community structures by learning from ground-truth communities in a data-driven fashion. These approaches rely on sufficient training data for the ML models to generalize well; however, the ground truth cannot be comprehensively collected beforehand. In this paper, we study ML/DL-based approaches for CS under the circumstance of small training data. Instead of directly fitting the small data, we extract prior knowledge that is shared across multiple CS tasks by learning a meta model. Each CS task is a graph with several queries that possess corresponding partial ground truth. The meta model can be swiftly adapted to a target task by feeding it a small amount of task-specific training data. We find that trivially applying multiple classical meta-learning algorithms to CS suffers from problems regarding prediction effectiveness, generalization capability, and efficiency. To address such problems, we propose a novel meta-learning based framework, Conditional Graph Neural Process (CGNP), to fulfill the prior extraction and adaptation procedure. A meta CGNP model is a task-common node embedding function for clustering, learned by metric-based graph learning, which fully exploits the characteristics of CS. We compare CGNP with CS algorithms and ML baselines on real graphs with ground-truth communities.
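As a rough illustration of the task setup described in the abstract, the sketch below models a CS task as a graph plus a few queries with partial ground truth, and predicts a community from node embeddings with a simple similarity rule. It is not the CGNP model. The names `Query`, `CSTask`, and `predict_community`, the sigmoid-of-inner-product score, the 0.5 threshold, and the hand-picked embeddings are all assumptions made for illustration; in a metric-based approach the embeddings would come from a learned embedding function.

```python
import math
from dataclasses import dataclass


@dataclass
class Query:
    node: int        # query node id
    positives: set   # nodes known to be in the query's community (partial)
    negatives: set   # nodes known to be outside it (partial)


@dataclass
class CSTask:
    num_nodes: int
    edges: list      # undirected edges as (u, v) pairs
    queries: list    # a few labeled queries for this graph


def predict_community(embeddings, query_node, threshold=0.5):
    """Return the nodes whose similarity to the query node exceeds threshold.

    Similarity is the sigmoid of the inner product of two embeddings
    (an assumed, simple metric chosen for this sketch).
    """
    q = embeddings[query_node]
    members = set()
    for node, z in enumerate(embeddings):
        score = sum(a * b for a, b in zip(q, z))
        prob = 1.0 / (1.0 + math.exp(-score))
        if prob > threshold:
            members.add(node)
    return members


# Toy task: two triangles joined by a single edge.
task = CSTask(
    num_nodes=6,
    edges=[(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)],
    queries=[Query(node=0, positives={1}, negatives={4})],
)

# Hand-picked embeddings standing in for the output of a learned model.
toy_embeddings = [
    [2.0, -2.0], [2.0, -2.0], [2.0, -2.0],
    [-2.0, 2.0], [-2.0, 2.0], [-2.0, 2.0],
]

community = predict_community(toy_embeddings, query_node=0)
```

With these toy embeddings, the nodes of each triangle score high against one another and low against the other triangle, so a query on node 0 returns its own triangle. The partial labels in `Query` are not used by the prediction here; in a learning setting they would supervise the embedding function.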