Is Everything Fine, Grandma? Acoustic and Linguistic Modeling for Robust Elderly Speech Emotion Recognition

Paper title

Is Everything Fine, Grandma? Acoustic and Linguistic Modeling for Robust Elderly Speech Emotion Recognition

Authors

Gizem Soğancıoğlu, Oxana Verkholyak, Heysem Kaya, Dmitrii Fedotov, Tobias Cadèe, Albert Ali Salah, Alexey Karpov

Abstract

Acoustic and linguistic analysis for elderly emotion recognition is an under-studied and challenging research direction, but it is essential for creating digital assistants for the elderly, as well as for unobtrusive telemonitoring of elderly people in their residences for mental healthcare purposes. This paper presents our contribution to the INTERSPEECH 2020 Computational Paralinguistics Challenge (ComParE) - Elderly Emotion Sub-Challenge, which comprises two ternary classification tasks for arousal and valence recognition. We propose a bi-modal framework, where these tasks are modeled using state-of-the-art acoustic and linguistic features, respectively. In this study, we demonstrate that exploiting task-specific dictionaries and resources can boost the performance of linguistic models when the amount of labeled data is small. Observing a high mismatch between development and test set performances of various models, we also propose alternative training and decision fusion strategies to better estimate and improve the generalization performance.
