Paper Title
Debiasing Graph Neural Networks via Learning Disentangled Causal Substructure
Paper Authors
Paper Abstract
Most Graph Neural Networks (GNNs) predict the labels of unseen graphs by learning the correlation between the input graphs and labels. However, through an investigation of graph classification on training graphs with severe bias, we surprisingly discover that GNNs always tend to explore spurious correlations to make decisions, even when the causal correlation always exists. This implies that existing GNNs trained on such biased datasets will suffer from poor generalization capability. Analyzing this problem from a causal view, we find that disentangling and decorrelating the causal and bias latent variables from the biased graphs are both crucial for debiasing. Inspired by this, we propose a general disentangled GNN framework to learn the causal substructure and bias substructure, respectively. In particular, we design a parameterized edge-mask generator to explicitly split the input graph into causal and bias subgraphs. Two GNN modules, supervised by causal- and bias-aware loss functions respectively, are then trained to encode the causal and bias subgraphs into their corresponding representations. With the disentangled representations, we synthesize counterfactual unbiased training samples to further decorrelate the causal and bias variables. Moreover, to better benchmark the severe bias problem, we construct three new graph datasets with controllable bias degrees that are easier to visualize and explain. Experimental results demonstrate that our approach achieves superior generalization performance over existing baselines. Furthermore, owing to the learned edge mask, the proposed model offers appealing interpretability and transferability. Code and data are available at: https://github.com/googlebaba/DisC.
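The split-and-encode pipeline described in the abstract can be sketched in a few lines of PyTorch. The following is a minimal illustration under our own assumptions, not the authors' DisC implementation: the names `EdgeMaskGenerator` and `SimpleGNN`, the feature dimensions, and the toy graph are all hypothetical choices made here for clarity.

```python
# Minimal sketch (not the authors' DisC code) of the idea in the abstract:
# an edge-mask generator softly splits a graph into causal and bias subgraphs,
# and each subgraph is encoded by its own GNN module.
import torch
import torch.nn as nn

class EdgeMaskGenerator(nn.Module):
    """Scores each edge from its endpoint features; a sigmoid yields a soft mask."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, x, edge_index):
        src, dst = edge_index                       # edge_index: (2, E) endpoint indices
        edge_feat = torch.cat([x[src], x[dst]], dim=-1)
        return torch.sigmoid(self.mlp(edge_feat)).squeeze(-1)  # (E,) mask in [0, 1]

class SimpleGNN(nn.Module):
    """One round of edge-weighted neighbor aggregation, then mean pooling."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, edge_index, edge_weight):
        src, dst = edge_index
        agg = torch.zeros_like(x)
        agg.index_add_(0, dst, x[src] * edge_weight.unsqueeze(-1))  # weighted messages
        h = torch.relu(self.lin(x + agg))
        return h.mean(dim=0)                        # graph-level representation

# Toy graph: 5 nodes with 16-dim features, 4 directed edges.
x = torch.randn(5, 16)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])

masker = EdgeMaskGenerator(16)
m = masker(x, edge_index)                           # mask m -> causal subgraph
causal_gnn, bias_gnn = SimpleGNN(16), SimpleGNN(16)
z_causal = causal_gnn(x, edge_index, m)             # to be supervised by a causal-aware loss
z_bias = bias_gnn(x, edge_index, 1.0 - m)           # complement (1 - m) -> bias subgraph
```

From the two disentangled representations, counterfactual unbiased training samples could then be synthesized, e.g., by pairing each graph's causal representation with bias representations drawn from other graphs in the batch, which is one plausible reading of the decorrelation step the abstract describes.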