Paper Title

Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning

Authors

Haohe Liu, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley

Abstract

Few-shot audio event detection is the task of detecting the occurrence times of a novel sound class given only a few examples. In this work, we propose a system based on segment-level metric learning for the DCASE 2022 challenge on few-shot bioacoustic event detection (Task 5). We make better use of the negative data within each sound class to build the loss function, and use transductive inference to adapt better to the evaluation set. For the input feature, we find per-channel energy normalization (PCEN) concatenated with delta mel-frequency cepstral coefficients (MFCCs) to be the most effective combination. We also introduce new data augmentation and post-processing procedures for this task. Our final system achieves an F-measure of 68.74 on the DCASE Task 5 validation set, outperforming the baseline performance of 29.5 by a large margin. Our system is fully open-sourced at https://github.com/haoheliu/DCASE_2022_Task_5.
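
To make the segment-level metric learning idea more concrete, the following is a minimal sketch of prototype-based scoring, in which a query segment is compared against one prototype built from positive support segments and one built from negative segments of the same recording. It is an illustration under stated assumptions, not the authors' implementation: the function names (`mean_vector`, `sq_dist`, `positive_probability`) are hypothetical, the embeddings are toy 2-D vectors rather than outputs of a trained network, and squared Euclidean distance is assumed as the metric.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (used as a class prototype)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def positive_probability(query, pos_support, neg_support):
    """Probability that `query` belongs to the positive class.

    Computes a two-way softmax over negative squared distances to the
    positive and negative prototypes (assumed metric, for illustration).
    """
    pos_proto = mean_vector(pos_support)
    neg_proto = mean_vector(neg_support)
    # z > 0 means the query is closer to the positive prototype.
    z = sq_dist(query, neg_proto) - sq_dist(query, pos_proto)
    # Sigmoid of z, written in a form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Toy embeddings: hypothetical values chosen only for this example.
pos_support = [[1.0, 1.0], [1.2, 0.8]]
neg_support = [[-1.0, -1.0], [-0.8, -1.2]]

p_near_pos = positive_probability([1.0, 1.0], pos_support, neg_support)
p_near_neg = positive_probability([-1.0, -1.0], pos_support, neg_support)
```

In this sketch, a segment near the positive support examples receives a probability close to 1 and a segment near the negative examples a probability close to 0. Thresholding such per-segment probabilities over time would yield candidate event onsets and offsets.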
