A deep complex multi-frame filtering network for stereophonic acoustic echo cancellation

Paper title

A deep complex multi-frame filtering network for stereophonic acoustic echo cancellation

Authors

Cheng, Linjuan; Zheng, Chengshi; Li, Andong; Wu, Yuquan; Peng, Renhua; Li, Xiaodong

Abstract

In hands-free communication systems, the coupling between the loudspeaker and the microphone generates an echo signal, which can severely degrade communication quality. Meanwhile, various types of noise in the communication environment further reduce speech quality and intelligibility. It is difficult to extract the near-end signal from the microphone signal in a single step, especially in low signal-to-noise ratio scenarios. In this paper, we propose a deep complex network approach to address this issue. Specifically, we decompose stereophonic acoustic echo cancellation into two stages, a linear stereophonic acoustic echo cancellation module and a residual echo suppression module, both of which are based on deep learning architectures. A multi-frame filtering strategy is introduced to improve the estimation of the linear echo by capturing more inter-frame information. Moreover, we decouple the complex spectral mapping into magnitude estimation and complex spectrum refinement. Experimental results demonstrate that the proposed approach achieves state-of-the-art performance, outperforming previous advanced algorithms under various conditions.
