Paper Title
Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks
Paper Authors
Paper Abstract
Interest in deep learning methods for solving traditional signal processing tasks has grown steadily in recent years. Time delay estimation (TDE) in adverse scenarios is a challenging problem for which classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutional neural networks (CNNs) to learn the time-delay patterns contained in FS-GCCs extracted in adverse acoustic conditions. Our experiments confirm that the proposed approach provides excellent TDE performance while being able to generalize to different room and sensor setups.
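For readers unfamiliar with the classical baseline mentioned in the abstract, the following is a minimal sketch of full-band GCC with phase transform (PHAT) weighting, written with NumPy. It is not the FS-GCC or the CNN proposed in the paper. The function name `gcc_phat`, the `max_lag` parameter, and the synthetic white-noise signals are assumptions made for illustration only.

```python
import numpy as np


def gcc_phat(x1, x2, max_lag=None):
    """Return the delay of x2 relative to x1, in samples (illustrative helper).

    A positive result means x2 lags behind x1.
    """
    n = len(x1) + len(x2)  # zero-pad so the circular correlation does not wrap
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum; PHAT keeps only the phase by normalizing the magnitude.
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    if max_lag is None:
        max_lag = n // 2
    # Reorder so that index 0 corresponds to lag -max_lag and the last to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


# Synthetic example: x2 is x1 delayed by 5 samples (assumed test signal).
rng = np.random.default_rng(0)
x1 = rng.standard_normal(1024)
true_delay = 5
x2 = np.concatenate((np.zeros(true_delay), x1[:-true_delay]))
estimated_delay = gcc_phat(x1, x2, max_lag=20)
```

This sketch computes a single correlation over the whole band. FS-GCC, as described in the abstract, instead analyzes the cross-power spectrum phase in sub-bands, which yields a two-dimensional representation over frequency band and lag.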