Paper Title
Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks
Paper Authors
Paper Abstract
Text classification is fundamental in natural language processing (NLP), and Graph Neural Networks (GNNs) have recently been applied to this task. However, existing graph-based works can neither capture the contextual word relationships within each document nor support inductive learning of new words. In this work, to overcome such problems, we propose TextING for inductive text classification via GNN. We first build an individual graph for each document and then use a GNN to learn fine-grained word representations based on their local structures, which can also effectively produce embeddings for unseen words in new documents. Finally, the word nodes are aggregated into the document embedding. Extensive experiments on four benchmark datasets show that our method outperforms state-of-the-art text classification methods.
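The abstract describes a three-step pipeline: build one graph per document, run a GNN over that graph to update word-node representations from their local neighbourhoods, then aggregate the word nodes into a document embedding. The sketch below is a minimal illustration of that pipeline, not the authors' implementation; the sliding-window graph construction, the GRU-style gated update, the attention-plus-max readout, and all hyperparameters (window size, hidden size, number of propagation steps) are assumptions chosen for brevity.

```python
# Minimal sketch of a per-document graph classifier in the spirit of the abstract.
# Assumptions (not taken from the paper text): sliding-window co-occurrence edges,
# a GRUCell as the gated node update, and an attention + max-pooling readout.
import torch
import torch.nn as nn


def build_doc_graph(tokens, window=3):
    """Map one document's token list to (node vocabulary, normalised adjacency)."""
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    adj = torch.eye(len(vocab))  # self-loops keep each node's own state in the mix
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            a, b = idx[w], idx[tokens[j]]
            adj[a, b] = adj[b, a] = 1.0
    # Row-normalise so message passing averages over neighbours.
    return vocab, adj / adj.sum(dim=1, keepdim=True)


class DocGraphClassifier(nn.Module):
    def __init__(self, emb_dim, hidden, num_classes, steps=2):
        super().__init__()
        self.proj = nn.Linear(emb_dim, hidden)
        self.gru = nn.GRUCell(hidden, hidden)  # gated update of each word node
        self.att = nn.Linear(hidden, 1)        # soft per-node importance for readout
        self.out = nn.Linear(hidden, num_classes)
        self.steps = steps

    def forward(self, node_feats, adj):
        h = torch.relu(self.proj(node_feats))
        for _ in range(self.steps):
            msg = adj @ h          # aggregate neighbour states over the local structure
            h = self.gru(msg, h)   # update word representations
        weights = torch.sigmoid(self.att(h))
        doc = (weights * h).mean(0) + h.max(0).values  # aggregate nodes -> document embedding
        return self.out(doc)


# Toy usage: an unseen test word simply becomes a new node with its own input
# vector (e.g. a pretrained embedding), so inference is inductive and does not
# rely on a fixed corpus-level graph built at training time.
tokens = "graph neural networks classify each document by its own graph".split()
vocab, adj = build_doc_graph(tokens)
feats = torch.randn(len(vocab), 300)  # stand-in for pretrained word embeddings
model = DocGraphClassifier(emb_dim=300, hidden=64, num_classes=4)
print(model(feats, adj).shape)  # torch.Size([4])
```

A design note on why this matches the abstract's inductive claim: because the graph is rebuilt per document and node features come from word-level input embeddings rather than learned node-ID embeddings, a new document with unseen words still yields a valid graph and a valid document embedding.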