Paper Title

Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis

Paper Authors

Odysseas S. Chlapanis, Georgios Paraskevopoulos, Alexandros Potamianos

Paper Abstract

Multimodal learning pipelines have benefited from the success of pretrained language models. However, this comes at the cost of increased model parameters. In this work, we propose Adapted Multimodal BERT (AMB), a BERT-based architecture for multimodal tasks that uses a combination of adapter modules and intermediate fusion layers. The adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations. During the adaptation process the pretrained language model parameters remain frozen, allowing for fast, parameter-efficient training. In our ablations we see that this approach leads to efficient models that outperform their fine-tuned counterparts and are robust to input noise. Our experiments on sentiment analysis with CMU-MOSEI show that AMB outperforms the current state-of-the-art across metrics, with a 3.4% relative reduction in the resulting error and a 2.1% relative improvement in 7-class classification accuracy.
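The abstract does not spell out the internals of the adapter or the fusion layer, but a minimal PyTorch sketch of the general idea might look like the following. The bottleneck width (`bottleneck_dim`), the gating-based fusion, and the names `Adapter` and `FusionLayer` are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    plus a residual connection. Only these small matrices are trained
    while the BERT backbone stays frozen. The bottleneck width is an
    assumed hyperparameter, not taken from the paper."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

class FusionLayer(nn.Module):
    """Illustrative layer-wise fusion block: projects an audio-visual
    feature into BERT's hidden space and gates it into each token's
    textual representation. The paper's exact fusion operation may
    differ; gating is one plausible choice."""
    def __init__(self, hidden_dim: int, av_dim: int):
        super().__init__()
        self.av_proj = nn.Linear(av_dim, hidden_dim)
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, text_hidden: torch.Tensor, av_feat: torch.Tensor) -> torch.Tensor:
        # text_hidden: (batch, seq_len, hidden_dim); av_feat: (batch, av_dim)
        av = self.av_proj(av_feat).unsqueeze(1).expand_as(text_hidden)
        g = torch.sigmoid(self.gate(torch.cat([text_hidden, av], dim=-1)))
        return text_hidden + g * av

# Parameter-efficient training: freeze the pretrained backbone so that
# only the adapters and fusion layers receive gradients.
bert = BertModel.from_pretrained("bert-base-uncased")
for p in bert.parameters():
    p.requires_grad = False
```

Because gradients flow only through the adapters and fusion layers, the trainable parameter count is a small fraction of the full model, which is what enables the fast, parameter-efficient training the abstract describes.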
