Paper Title

Text Classification based on Multi-granularity Attention Hybrid Neural Network

Paper Authors

Zhenyu Liu, Chaohong Lu, Haiwei Huang, Shengfei Lyu, Zhenchao Tao

Paper Abstract

Neural network-based approaches have become the driving force for Natural Language Processing (NLP) tasks. Conventionally, there are two mainstream neural architectures for NLP tasks: the recurrent neural network (RNN) and the convolutional neural network (ConvNet). RNNs are good at modeling long-term dependencies over input texts, but they preclude parallel computation. ConvNets do not have memory capability and have to model sequential data as unordered features. Therefore, ConvNets fail to learn sequential dependencies over the input texts, but they are able to carry out highly efficient parallel computation. As each neural architecture, such as the RNN and the ConvNet, has its own pros and cons, integrating different architectures is assumed to enrich the semantic representation of texts and thus enhance the performance of NLP tasks. However, few investigations explore the reconciliation of these seemingly incompatible architectures. To address this issue, we propose a hybrid architecture based on a novel hierarchical multi-granularity attention mechanism, named the Multi-granularity Attention-based Hybrid Neural Network (MahNN). The attention mechanism assigns different weights to different parts of the input sequence to improve the computational efficiency and performance of neural models. In MahNN, two types of attention are introduced: syntactical attention and semantical attention. The syntactical attention computes the importance of syntactic elements (such as words or sentences) at the lower symbolic level, while the semantical attention computes the importance of the embedding-space dimensions corresponding to the upper-level latent semantics. We adopt text classification as an exemplar task to illustrate the ability of MahNN to understand texts.
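The abstract describes a hybrid of an RNN branch (sequential dependencies) and a ConvNet branch (parallel local features), combined with token-level (syntactical) and feature-dimension-level (semantical) attention. The sketch below is a minimal, hedged illustration of that idea in PyTorch, not the authors' implementation: the layer sizes, the GRU/Conv1d branch choices, and the additive fusion of the two branches are all assumptions made here for clarity.

```python
# Minimal sketch (not the authors' code) of a two-granularity attention hybrid:
# an RNN branch plus a ConvNet branch, token-level (syntactical) attention, and
# feature-dimension-level (semantical) attention. Sizes and fusion are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiGranularityAttention(nn.Module):
    def __init__(self, embed_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        # RNN branch: models sequential dependencies over the input tokens.
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # ConvNet branch: extracts local n-gram features in parallel.
        self.conv = nn.Conv1d(embed_dim, 2 * hidden_dim, kernel_size=3, padding=1)
        # Syntactical attention: scores each token at the symbolic level.
        self.syntactic_score = nn.Linear(2 * hidden_dim, 1)
        # Semantical attention: scores each embedded-space dimension.
        self.semantic_score = nn.Linear(2 * hidden_dim, 2 * hidden_dim)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):                      # x: (batch, seq_len, embed_dim)
        rnn_out, _ = self.rnn(x)               # (batch, seq_len, 2*hidden)
        conv_out = self.conv(x.transpose(1, 2)).transpose(1, 2)  # same shape
        h = rnn_out + torch.tanh(conv_out)     # naive fusion of the two branches

        # Syntactical attention: weight tokens, then pool over the sequence.
        alpha = F.softmax(self.syntactic_score(h), dim=1)   # (batch, seq_len, 1)
        pooled = (alpha * h).sum(dim=1)                     # (batch, 2*hidden)

        # Semantical attention: weight individual feature dimensions.
        beta = torch.sigmoid(self.semantic_score(pooled))   # (batch, 2*hidden)
        return self.classifier(beta * pooled)

# Usage: a batch of 4 already-embedded sentences of length 20.
model = MultiGranularityAttention()
logits = model(torch.randn(4, 20, 128))
print(logits.shape)  # torch.Size([4, 2])
```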
