Efficient Arabic emotion recognition using deep neural networks

Paper title

Efficient Arabic emotion recognition using deep neural networks

Authors

Ali, Ahmed; Hifny, Yasser

Abstract

Emotion recognition from speech signals based on deep learning is an active research area. Convolutional neural networks (CNNs) may be the dominant method in this area. In this paper, we implement two neural architectures to address this problem. The first architecture is an attention-based CNN-LSTM-DNN model. In this novel architecture, the convolutional layers extract salient features and the bi-directional long short-term memory (BLSTM) layers handle the sequential phenomena of the speech signal. This is followed by an attention layer, which extracts a summary vector that is fed to a fully connected dense layer (DNN), which finally connects to a softmax output layer. The second architecture is based on a deep CNN model. The results on an Arabic speech emotion recognition task show that our innovative approach can lead to significant improvements (2.2% absolute improvement) over a strong deep CNN baseline system. On the other hand, the deep CNN models are significantly faster than the attention-based CNN-LSTM-DNN models in training and classification.
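
The attention step described in the abstract, which collapses a sequence of BLSTM outputs into a single summary vector, can be illustrated with a minimal sketch. The abstract does not say how the attention scores are computed, so the dot-product scoring against a learned `query` vector used here is an assumption, and the names `attention_pool`, `query`, and `blstm_outputs` are hypothetical. The toy values stand in for real BLSTM outputs.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    frames: Sequence[Sequence[float]], query: Sequence[float]
) -> List[float]:
    """Collapse a sequence of frame vectors into one summary vector.

    Assumption: each frame is scored by its dot product with `query`,
    a hypothetical learned parameter. The scores are normalized with
    softmax and the summary is the weighted sum of the frames.
    """
    scores = [sum(h * q for h, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]


# Toy stand-in for BLSTM outputs: 3 time steps, 2 features per step.
blstm_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, -0.5]  # hypothetical learned attention parameter

# One fixed-size vector regardless of how many time steps came in.
summary = attention_pool(blstm_outputs, query)
```

Because the summary has a fixed size regardless of utterance length, it can be fed directly to a dense layer and a softmax output layer, as in the first architecture. With an all-zero `query`, the weights become uniform and the pooling reduces to a plain average over time.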
