Paper Title
Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
Paper Authors
Paper Abstract
Attention has become one of the most commonly used mechanisms in deep learning. An attention mechanism helps a system focus on the critical regions of the feature space. For example, high-amplitude regions can play an important role in Speech Emotion Recognition (SER). In this paper, we identify misalignments between the attention and the signal amplitude in existing multi-head self-attention. To improve the attention area, we propose using a Focus-Attention (FA) mechanism and a novel Calibration-Attention (CA) mechanism in combination with multi-head self-attention. Through the FA mechanism, the network can detect the largest-amplitude part of a segment. Through the CA mechanism, the network can modulate the information flow by assigning a different weight to each attention head, which improves its use of the surrounding context. To evaluate the proposed method, we perform experiments on the IEMOCAP and RAVDESS datasets. Experimental results show that the proposed framework significantly outperforms state-of-the-art approaches on both datasets.
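The abstract gives no formulas for FA or CA, so the following is only a minimal sketch of the two ideas it names, not the authors' implementation. It assumes NumPy, omits the learned query/key/value projections for brevity, and uses hypothetical names (`peak_frame`, `head_weighted_self_attention`, `head_logits`). It shows (1) locating the largest-amplitude frame in a segment and (2) scaling each attention head's output by its own weight.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def peak_frame(amplitude):
    """Return the index of the frame with the largest absolute amplitude.

    Illustrative stand-in for "detect the largest amplitude part in the
    segment"; the paper's FA mechanism may work differently.
    """
    return int(np.argmax(np.abs(amplitude)))


def head_weighted_self_attention(x, num_heads, head_logits):
    """Multi-head self-attention with one scalar weight per head.

    x:           (T, D) array of T frames with D features each.
    head_logits: (num_heads,) array; softmax turns it into head weights.
    Assumption: queries, keys, and values are the raw per-head slices of x
    (no learned projections), which keeps the sketch short.
    """
    num_frames, dim = x.shape
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads

    # Split features into heads: (H, T, head_dim).
    heads = x.reshape(num_frames, num_heads, head_dim).transpose(1, 0, 2)

    # Scaled dot-product attention inside each head: (H, T, T).
    scores = heads @ heads.transpose(0, 2, 1) / np.sqrt(head_dim)
    attn = softmax(scores, axis=-1)
    per_head_out = attn @ heads  # (H, T, head_dim)

    # Per-head weighting: each head's output is scaled by its own weight.
    weights = softmax(head_logits)  # (H,), sums to 1
    per_head_out = per_head_out * weights[:, None, None]

    # Merge heads back into a (T, D) array.
    merged = per_head_out.transpose(1, 0, 2).reshape(num_frames, dim)
    return merged, attn, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((6, 8))  # 6 frames, 8 features
    out, attn, weights = head_weighted_self_attention(
        frames, num_heads=4, head_logits=np.zeros(4)
    )
    print(out.shape, attn.shape, weights)
    print(peak_frame(np.array([0.1, -0.9, 0.3])))
```

In a trained model the per-head weights would be learned rather than fixed; here they are passed in directly so that the effect of reweighting heads is easy to inspect.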