Paper Title

Differentiable Window for Dynamic Local Attention

Authors

Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li

Abstract

We propose Differentiable Window, a new neural module and general purpose component for dynamic window selection. While universally applicable, we demonstrate a compelling use case of utilizing Differentiable Window to improve standard attention modules by enabling more focused attentions over the input regions. We propose two variants of Differentiable Window, and integrate them within the Transformer architecture in two novel ways. We evaluate our proposed approach on a myriad of NLP tasks, including machine translation, sentiment analysis, subject-verb agreement and language modeling. Our experimental results demonstrate consistent and sizable improvements across all tasks.
