Paper Title
An Efficient Multimodal Framework for Large Scale Emotion Recognition by Fusing Music and Electrodermal Activity Signals
Paper Authors
Paper Abstract
Considerable attention has been paid to physiological-signal-based emotion recognition in the field of affective computing. Owing to its reliability and user-friendly acquisition, Electrodermal Activity (EDA) offers great advantages in practical applications. However, EDA-based emotion recognition with hundreds of subjects still lacks an effective solution. In this paper, we attempt to fuse subject-individual EDA features with features of the externally evoking music, and we propose an end-to-end multimodal framework, the 1-dimensional residual temporal and channel attention network (RTCAN-1D). For EDA features, the novel convex-optimization-based EDA (CvxEDA) method is applied to decompose EDA signals into phasic and tonic components, mining both dynamic and steady features. A channel-temporal attention mechanism is introduced into EDA-based emotion recognition for the first time to improve the temporal- and channel-wise representations. For music features, we process the music signals with the open-source toolkit openSMILE to obtain external feature vectors. The individual emotion features from EDA signals and the external emotion benchmarks from music are fused in the classifying layers. We have conducted systematic comparisons on three multimodal datasets (PMEmo, DEAP, AMIGOS) for two-class valence/arousal emotion recognition. Our proposed RTCAN-1D outperforms existing state-of-the-art models, which validates that our work provides a reliable and efficient solution for large-scale emotion recognition. Our code has been released at https://github.com/guanghaoyin/RTCAN-1D.
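A minimal sketch of the CvxEDA decomposition step described in the abstract, assuming the reference Python implementation by the CvxEDA authors (https://github.com/lciti/cvxEDA) is on the path; the function name, argument order, and return order follow that implementation, and the sampling rate and input file are hypothetical:

```python
import numpy as np
from cvxEDA import cvxEDA  # reference implementation by Greco et al.

fs = 4.0                               # assumed EDA sampling rate in Hz
eda = np.loadtxt("eda.txt")            # hypothetical raw EDA recording
eda = (eda - eda.mean()) / eda.std()   # z-score before decomposition

# Decompose into phasic (r) and tonic (t) components; the remaining
# outputs (sparse driver, spline coefficients, offset, residual,
# objective value) are unused in this sketch.
r, p, t, l, d, e, obj = cvxEDA(eda, 1.0 / fs)
```

The phasic component carries the fast, stimulus-driven dynamics and the tonic component the slow baseline, which is what the abstract refers to as the dynamic and steady features.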
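The paper's exact channel-temporal attention module is defined in the released code; the block below is only an illustrative sketch of the general idea for 1-D features, combining SE-style channel attention with a simple per-time-step attention map (all layer sizes are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class ChannelTemporalAttention1D(nn.Module):
    """Illustrative channel + temporal attention for (batch, channels, time)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze over time, excite per channel.
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        # Temporal attention: 1x1 convolution yields one weight per time step.
        self.temporal_conv = nn.Sequential(
            nn.Conv1d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        c = self.channel_fc(x.mean(dim=2))   # (batch, channels)
        x = x * c.unsqueeze(-1)              # reweight channels
        t = self.temporal_conv(x)            # (batch, 1, time)
        return x * t                         # reweight time steps

# Usage:
# attn = ChannelTemporalAttention1D(64)
# y = attn(torch.randn(8, 64, 128))
```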
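The abstract states that music features are extracted with openSMILE. A hedged sketch using the audEERING `opensmile` Python package follows; the paper does not specify which feature set it uses, so the eGeMAPS functionals below and the file path are assumptions:

```python
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,       # assumed feature set
    feature_level=opensmile.FeatureLevel.Functionals,  # one vector per clip
)

# Returns a pandas DataFrame with one functional feature vector per file.
features = smile.process_file("music_clip.wav")  # hypothetical path
print(features.shape)
```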
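Finally, the abstract says the individual EDA features and the external music features are fused in the classifying layers. A hedged sketch of one common realization, simple concatenation followed by fully connected layers (the dimensions and layer count are assumptions, not the paper's design):

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Concatenate EDA and music feature vectors, then classify."""

    def __init__(self, eda_dim: int, music_dim: int, n_classes: int = 2):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(eda_dim + music_dim, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, n_classes),  # two-class valence or arousal
        )

    def forward(self, eda_feat: torch.Tensor, music_feat: torch.Tensor) -> torch.Tensor:
        return self.fc(torch.cat([eda_feat, music_feat], dim=1))
```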