Paper title
On the Bottleneck of Graph Neural Networks and its Practical Implications
Paper authors
Paper abstract
Since the proposal of the graph neural network (GNN) by Gori et al. (2005) and Scarselli et al. (2008), one of the major problems in training GNNs was their struggle to propagate information between distant nodes in the graph. We propose a new explanation for this problem: GNNs are susceptible to a bottleneck when aggregating messages across a long path. This bottleneck causes the over-squashing of exponentially growing information into fixed-size vectors. As a result, GNNs fail to propagate messages originating from distant nodes and perform poorly when the prediction task depends on long-range interaction. In this paper, we highlight the inherent problem of over-squashing in GNNs: we demonstrate that the bottleneck hinders popular GNNs from fitting long-range signals in the training data; we further show that GNNs that absorb incoming edges equally, such as GCN and GIN, are more susceptible to over-squashing than GAT and GGNN; finally, we show that prior work, which extensively tuned GNN models of long-range problems, suffers from over-squashing, and that breaking the bottleneck improves their state-of-the-art results without any tuning or additional weights. Our code is available at https://github.com/tech-srl/bottleneck/ .
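To make the bottleneck concrete, below is a minimal, self-contained Python sketch (not taken from the paper or its repository) that counts how many nodes fall inside a GNN's receptive field after k message-passing layers on a balanced binary tree. The node count grows exponentially with k, while the node state that must summarize all of it stays a fixed-size vector; the branching factor, depth, and hidden size of 64 are arbitrary values assumed purely for illustration.

```python
# Illustrative sketch of the over-squashing bottleneck (assumed setup, not the
# paper's code): the receptive field after k message-passing layers grows
# exponentially on a tree, yet it must be compressed into one fixed-size vector.
from collections import deque


def balanced_tree(branching: int, depth: int) -> dict[int, list[int]]:
    """Adjacency list of a rooted balanced tree (node 0 is the root)."""
    adj: dict[int, list[int]] = {0: []}
    frontier, next_id = [0], 1
    for _ in range(depth):
        new_frontier = []
        for parent in frontier:
            for _ in range(branching):
                adj[parent].append(next_id)
                adj[next_id] = [parent]
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return adj


def receptive_field_sizes(adj: dict[int, list[int]], source: int, max_hops: int) -> list[int]:
    """Cumulative number of nodes reachable from `source` within k hops, for k = 0..max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return [sum(1 for d in dist.values() if d <= k) for k in range(max_hops + 1)]


if __name__ == "__main__":
    HIDDEN_SIZE = 64  # fixed-size node vector (hypothetical value for illustration)
    adj = balanced_tree(branching=2, depth=8)
    for k, size in enumerate(receptive_field_sizes(adj, source=0, max_hops=8)):
        print(f"{k} layers: {size:4d} nodes squashed into a {HIDDEN_SIZE}-dim vector")
```

Running the sketch shows the count roughly doubling with each added layer (3, 7, 15, ... nodes) while the vector width stays constant, which is the exponential-growth-into-fixed-size-vectors effect the abstract calls over-squashing.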