Paper Title

Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition

Paper Authors

Wenliang Dai, Zihan Liu, Tiezheng Yu, Pascale Fung

Paper Abstract

Despite the recent achievements made in the multi-modal emotion recognition task, two problems still exist and have not been well investigated: 1) the relationships between different emotion categories are not utilized, which leads to sub-optimal performance; and 2) current models fail to cope well with low-resource emotions, especially unseen emotions. In this paper, we propose a modality-transferable model with emotion embeddings to tackle the aforementioned issues. We use pre-trained word embeddings to represent emotion categories for textual data. Then, two mapping functions are learned to transfer these embeddings into visual and acoustic spaces. For each modality, the model calculates the representation distance between the input sequence and the target emotions and makes predictions based on the distances. By doing so, our model can directly adapt to unseen emotions in any modality, since we have their pre-trained embeddings and modality mapping functions. Experiments show that our model achieves state-of-the-art performance on most of the emotion categories. In addition, our model also outperforms existing baselines in the zero-shot and few-shot scenarios for unseen emotions.
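To make the distance-based prediction idea in the abstract concrete, below is a minimal PyTorch sketch of one modality head: emotion-label word embeddings are mapped into the modality space by a learned linear function, and the pooled input representation is scored against each mapped emotion by similarity. All names, dimensions, and the choice of cosine similarity are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn


class DistanceBasedEmotionHead(nn.Module):
    """Scores a pooled sequence representation against mapped emotion embeddings.

    Hypothetical sketch of the abstract's idea: pre-trained word embeddings of the
    emotion category names are transferred into a modality space by a learned
    mapping, and predictions are made from representation distances/similarities.
    """

    def __init__(self, emotion_word_emb: torch.Tensor, modality_dim: int):
        super().__init__()
        # Pre-trained word embeddings of the emotion names (e.g. GloVe vectors
        # for "happy", "sad", ...), kept frozen so unseen emotions can be added
        # later simply by appending their word embeddings.
        self.emotion_emb = nn.Parameter(emotion_word_emb, requires_grad=False)
        # Learned mapping from the word-embedding space into this modality's
        # space (the abstract learns one such mapping each for visual and acoustic).
        self.map_to_modality = nn.Linear(emotion_word_emb.size(1), modality_dim)

    def forward(self, seq_repr: torch.Tensor) -> torch.Tensor:
        # seq_repr: (batch, modality_dim), a pooled representation of the input sequence.
        mapped = self.map_to_modality(self.emotion_emb)  # (num_emotions, modality_dim)
        # Cosine similarity stands in for the paper's representation distance.
        scores = nn.functional.cosine_similarity(
            seq_repr.unsqueeze(1), mapped.unsqueeze(0), dim=-1
        )  # (batch, num_emotions)
        return scores


if __name__ == "__main__":
    # Toy usage: 6 emotion categories with 300-d word embeddings,
    # acoustic sequence representations of size 128.
    emotion_emb = torch.randn(6, 300)
    head = DistanceBasedEmotionHead(emotion_emb, modality_dim=128)
    acoustic_repr = torch.randn(4, 128)
    print(head(acoustic_repr).shape)  # torch.Size([4, 6])

Under this reading, zero-shot adaptation only requires the word embedding of a new emotion name: the frozen embedding table is extended and the already-learned mapping projects it into the modality space without retraining the scorer.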
