The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes

Paper Title

The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes

Authors

Schuller, Björn W.; Batliner, Anton; Amiriparian, Shahin; Bergler, Christian; Gerczuk, Maurice; Holz, Natalie; Larrouy-Maestri, Pauline; Bayerl, Sebastian P.; Riedhammer, Korbinian; Mallol-Ragolta, Adria; Pateraki, Maria; Coppock, Harry; Kiskin, Ivan; Sinka, Marianne; Roberts, Stephen

Abstract

The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions. In the Vocalisations and Stuttering Sub-Challenges, human non-verbal vocalisations and speech have to be classified; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch sensor data; and in the Mosquitoes Sub-Challenge, mosquitoes need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComParE and BoAW features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectrum toolkit; in addition, we add end-to-end sequential modelling and a log-mel-128-BNN.
