Paper Title

Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks

Authors

Luca Comanducci, Maximo Cobos, Fabio Antonacci, Augusto Sarti

Abstract

Interest in deep learning methods for solving traditional signal processing tasks has grown steadily in recent years. Time delay estimation (TDE) in adverse scenarios is a challenging problem for which classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutional neural networks (CNNs) to learn the time-delay patterns contained in FS-GCCs extracted in adverse acoustic conditions. Our experiments confirm that the proposed approach provides excellent TDE performance while being able to generalize to different room and sensor setups.
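
For readers unfamiliar with the classical baseline the abstract refers to, the following is a minimal sketch of GCC with phase transform (PHAT) weighting, which estimates a delay from the phase of the cross-power spectrum. It is not the FS-GCC or the CNN proposed in the paper; the function name `gcc_phat`, its parameters, and the small regularization constant are assumptions made for illustration.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay (in seconds) of x1 relative to x2 with GCC-PHAT.

    Hypothetical helper for illustration: a positive result means x1 lags x2.
    """
    # Zero-pad so that circular correlation behaves like linear correlation.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum; PHAT weighting keeps only its phase.
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs

# Usage: white noise delayed by 5 samples at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(5), s[:-5]))
tau = gcc_phat(delayed, s, fs=16000)
```

This full-band estimate collapses all frequencies into a single correlation function, whereas the FS-GCC described in the abstract keeps a separate correlation per sub-band and so yields a two-dimensional representation.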
