Informed Source Extraction With Application to Acoustic Echo Reduction

Paper Title

Informed Source Extraction With Application to Acoustic Echo Reduction

Authors

Mohamed Elminshawi, Wolfgang Mack, Emanuël A. P. Habets

Abstract

Informed speaker extraction aims to extract a target speech signal from a mixture of sources given prior knowledge about the desired speaker. Recent deep learning-based methods leverage a speaker discriminative model that maps a reference snippet uttered by the target speaker into a single embedding vector that encapsulates the characteristics of the target speaker. However, such modeling deliberately neglects the time-varying properties of the reference signal. In this work, we assume that a reference signal is available that is temporally correlated with the target signal. To take this correlation into account, we propose a time-varying source discriminative model that captures the temporal dynamics of the reference signal. We also show that existing methods and the proposed method can be generalized to non-speech sources as well. Experimental results demonstrate that the proposed method significantly improves the extraction performance when applied in an acoustic echo reduction scenario.
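
The contrast the abstract draws between a single embedding vector and a time-varying source discriminative model can be illustrated with a toy NumPy sketch. Everything here is an assumption made for illustration and is not the authors' architecture: the names `frame_features`, `time_invariant_embedding`, `time_varying_embedding`, and `condition` are hypothetical, a fixed random projection stands in for a trained encoder, and multiplicative fusion is just one simple way to condition mixture features on the reference.

```python
import numpy as np

def frame_features(signal, frame_len=256, hop=128):
    """Return log-magnitude spectra of overlapping frames, shape (T, F)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    spectra = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    return np.log1p(spectra)

def time_invariant_embedding(ref_feats, proj):
    """Pool over time: one vector of shape (D,) summarizes the whole reference."""
    return np.tanh(ref_feats @ proj).mean(axis=0)

def time_varying_embedding(ref_feats, proj):
    """Keep the time axis: one vector per reference frame, shape (T, D)."""
    return np.tanh(ref_feats @ proj)

def condition(mix_feats, embedding, proj):
    """Multiplicative fusion (assumed); a (D,) embedding broadcasts over all frames."""
    return np.tanh(mix_feats @ proj) * embedding

# Toy signals: the mixture contains a scaled copy of the reference plus noise,
# loosely mimicking a microphone signal that contains far-end echo.
rng = np.random.default_rng(0)
reference = rng.standard_normal(16000)
mixture = 0.5 * reference + rng.standard_normal(16000)

ref_feats = frame_features(reference)
mix_feats = frame_features(mixture)
proj = rng.standard_normal((ref_feats.shape[1], 32)) * 0.1  # untrained stand-in encoder

e_fixed = time_invariant_embedding(ref_feats, proj)
e_varying = time_varying_embedding(ref_feats, proj)

cond_fixed = condition(mix_feats, e_fixed, proj)
cond_varying = condition(mix_feats, e_varying, proj)

print(e_fixed.shape, e_varying.shape, cond_fixed.shape, cond_varying.shape)
```

In the pooled variant every mixture frame is conditioned on the same vector, so any frame-level alignment between reference and mixture is discarded. The per-frame variant assumes the reference and the mixture share a frame grid, which is plausible when the reference is temporally correlated with the target, as the abstract assumes for the echo reduction scenario.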
