Paper Title

Staying True to Your Word: (How) Can Attention Become Explanation?

Paper Authors

Martin Tutek, Jan Šnajder

Paper Abstract

The attention mechanism has quickly become ubiquitous in NLP. In addition to improving model performance, attention is widely used as a glimpse into the inner workings of NLP models. The latter aspect has in recent years become a common topic of discussion, most notably in the work of Jain and Wallace (2019) and Wiegreffe and Pinter (2019). With the shortcomings of using attention weights as a tool of transparency revealed, the attention mechanism has been stuck in limbo, without concrete proof of when and whether it can be used as an explanation. In this paper, we provide an explanation as to why attention has seen rightful critique when used with recurrent networks in sequence classification tasks. We propose a remedy to these issues in the form of a word-level objective, and our findings lend credibility to attention as a means of providing faithful interpretations of recurrent models.
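For readers unfamiliar with the setup the abstract critiques, the sketch below shows the generic attention-augmented recurrent classifier whose weights are commonly inspected as "explanation": a BiLSTM encodes the tokens and additive attention pools the hidden states. This is a minimal illustration of the standard architecture only, not the paper's code and not its proposed word-level objective; the class name, dimensions, and parameter choices are illustrative assumptions.

```python
# Minimal sketch (assumed names/dimensions, not the paper's implementation):
# additive attention pooling over recurrent hidden states for sequence classification.
import torch
import torch.nn as nn


class AttentionClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=150, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Additive (Bahdanau-style) scoring of each hidden state.
        self.attn_score = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        states, _ = self.rnn(self.embed(token_ids))           # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn_score(states), 1)   # (batch, seq_len, 1)
        context = (weights * states).sum(dim=1)               # (batch, 2*hidden)
        # Returns class logits plus the attention weights that are
        # often read off as a (possibly unfaithful) explanation.
        return self.classifier(context), weights.squeeze(-1)
```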
