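From text saliency to linguistic objects: learning linguistic interpretable markers with a multi-channels convolutional architecture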

Paper title

From text saliency to linguistic objects: learning linguistic interpretable markers with a multi-channels convolutional architecture

Authors

Laurent Vanni, Marco Corneli, Damon Mayaffre, Frédéric Precioso

Abstract

Considerable effort is currently devoted to methods for analyzing and understanding the impressive performance of deep neural networks on tasks such as image or text classification. These methods are mainly based on visualizing the important input features that the network takes into account to reach a decision. However, these techniques, such as LIME, SHAP, Grad-CAM, or TDS, require extra effort to interpret the visualization with respect to expert knowledge. In this paper, we propose a novel approach that inspects the hidden layers of a fitted CNN in order to extract interpretable linguistic objects from texts by exploiting the classification process. In particular, we detail a weighted extension of the Text Deconvolution Saliency measure (wTDS), which can be used to highlight the relevant features used by the CNN to perform the classification task. We empirically demonstrate the efficiency of our approach on corpora in two different languages: English and French. On all datasets, wTDS automatically encodes complex linguistic objects based on co-occurrences and possibly on grammatical and syntactic analysis.
