Paper Title
Foreground-Background Ambient Sound Scene Separation
Paper Authors
Paper Abstract
Ambient sound scenes typically comprise multiple short events occurring on top of a somewhat stationary background. We consider the task of separating these events from the background, which we call foreground-background ambient sound scene separation. We propose a deep learning-based separation framework with a suitable feature normalization scheme and an optional auxiliary network capturing the background statistics, and we investigate its ability to handle the great variety of sound classes encountered in ambient sound scenes, which have often not been seen in training. To do so, we create single-channel foreground-background mixtures using isolated sounds from the DESED and Audioset datasets, and we conduct extensive experiments with mixtures of seen or unseen sound classes at various signal-to-noise ratios. Our experimental findings demonstrate the generalization ability of the proposed approach.
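The abstract mentions creating foreground-background mixtures at various signal-to-noise ratios. A minimal sketch of one common way to do this is shown below, assuming the SNR is defined as the foreground-to-background power ratio in decibels; the function name `mix_at_snr` and the toy white-noise signals are illustrative, not taken from the paper.

```python
import numpy as np

def mix_at_snr(foreground, background, snr_db):
    """Scale the foreground so that the foreground-to-background
    power ratio equals snr_db, then sum the two signals.
    Returns the mixture and the scaled foreground reference."""
    p_fg = np.mean(foreground ** 2)
    p_bg = np.mean(background ** 2)
    # Gain that brings the foreground to the requested SNR
    gain = np.sqrt(p_bg * 10.0 ** (snr_db / 10.0) / p_fg)
    fg_scaled = gain * foreground
    return fg_scaled + background, fg_scaled

# Toy signals standing in for an isolated event and a stationary background
rng = np.random.default_rng(0)
fg = rng.standard_normal(16000)
bg = rng.standard_normal(16000)
mix, fg_scaled = mix_at_snr(fg, bg, snr_db=0.0)
```

Keeping the scaled foreground as a reference signal is what allows separation quality to later be measured against the ground truth at each mixing SNR.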