Title
Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation
Authors
Abstract
In this technical report, we present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1, Acoustic Scene Classification (ASC), in the DCASE 2020 Challenge. Task 1 comprises two sub-tasks: (i) Task 1a focuses on ASC of audio signals recorded with multiple (real and simulated) devices into ten fine-grained classes, and (ii) Task 1b concerns classification of data into three higher-level classes using low-complexity solutions. For Task 1a, we propose a novel two-stage ASC system that leverages an ad-hoc score combination of two convolutional neural networks (CNNs), which classify the acoustic input into three classes and into ten classes, respectively. Four different CNN-based architectures are explored to implement the two-stage classifiers, and several data augmentation techniques are also investigated. For Task 1b, we apply a quantization method to reduce the complexity of two of our most accurate three-class CNN-based architectures. On the Task 1a development data set, an ASC accuracy of 76.9% is attained using our best single classifier and data augmentation. An accuracy of 81.9% is then attained by a final model fusion of our two-stage ASC classifiers. On the Task 1b development data set, we achieve an accuracy of 96.7% with a model size smaller than 500KB. Code is available: https://github.com/MihawkHu/DCASE2020_task1.
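The abstract does not spell out the score combination rule, so the following is only a minimal sketch of one plausible reading of the two-stage idea: each ten-class score is weighted by the score of its parent three-class category and the result is renormalized. The product rule, the function names `combine_two_stage` and `predict`, and the label spellings are assumptions made for illustration; the grouping of scenes into indoor, outdoor, and transportation follows the DCASE 2020 Task 1 class definitions.

```python
# Assumed mapping from each fine-grained scene to its coarse category
# (label spellings are illustrative).
FINE_TO_COARSE = {
    "airport": "indoor",
    "shopping_mall": "indoor",
    "metro_station": "indoor",
    "street_pedestrian": "outdoor",
    "public_square": "outdoor",
    "street_traffic": "outdoor",
    "park": "outdoor",
    "bus": "transportation",
    "tram": "transportation",
    "metro": "transportation",
}


def combine_two_stage(coarse_probs, fine_probs):
    """Weight each fine-class score by its parent coarse-class score.

    coarse_probs: dict of three coarse labels -> probability
    fine_probs:   dict of ten fine labels -> probability
    Returns a dict of fine labels -> renormalized combined score.
    The product rule used here is an assumption, not the authors' method.
    """
    combined = {
        fine: p * coarse_probs[FINE_TO_COARSE[fine]]
        for fine, p in fine_probs.items()
    }
    total = sum(combined.values())
    if total <= 0:
        raise ValueError("combined scores sum to zero; check the inputs")
    return {fine: score / total for fine, score in combined.items()}


def predict(coarse_probs, fine_probs):
    """Return the fine-grained label with the highest combined score."""
    combined = combine_two_stage(coarse_probs, fine_probs)
    return max(combined, key=combined.get)


# Example: the ten-class model slightly prefers "metro_station", while the
# three-class model is confident that the clip is a transportation scene.
fine = {label: 0.25 / 8 for label in FINE_TO_COARSE}
fine["metro_station"] = 0.40
fine["metro"] = 0.35
coarse = {"indoor": 0.15, "outdoor": 0.05, "transportation": 0.80}
print(predict(coarse, fine))
```

In this example the coarse classifier's confidence overrides the fine classifier's near-tie between two acoustically similar scenes, which is the kind of confusion a two-stage arrangement is meant to reduce.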