Beyond Real-world Benchmark Datasets: An Empirical Study of Node Classification with GNNs

Paper Title

Beyond Real-world Benchmark Datasets: An Empirical Study of Node Classification with GNNs

Authors

Seiji Maekawa, Koki Noda, Yuya Sasaki, Makoto Onizuka

Abstract

Graph Neural Networks (GNNs) have achieved great success on a node classification task. Despite the broad interest in developing and evaluating GNNs, they have been assessed with limited benchmark datasets. As a result, the existing evaluation of GNNs lacks fine-grained analysis from various characteristics of graphs. Motivated by this, we conduct extensive experiments with a synthetic graph generator that can generate graphs having controlled characteristics for fine-grained analysis. Our empirical studies clarify the strengths and weaknesses of GNNs from four major characteristics of real-world graphs with class labels of nodes, i.e., 1) class size distributions (balanced vs. imbalanced), 2) edge connection proportions between classes (homophilic vs. heterophilic), 3) attribute values (biased vs. random), and 4) graph sizes (small vs. large). In addition, to foster future research on GNNs, we publicly release our codebase that allows users to evaluate various GNNs with various graphs. We hope this work offers interesting insights for future research.
