SNR-Based Teachers-Student Technique for Speech Enhancement

Paper title

SNR-Based Teachers-Student Technique for Speech Enhancement

Authors

Hao, Xiang; Su, Xiangdong; Wang, Zhiyu; Zhang, Qiang; Xu, Huali; Gao, Guanglai

Abstract

It is very challenging for speech enhancement methods to achieve robust performance under both high and low signal-to-noise ratio (SNR) conditions simultaneously. In this paper, we propose a method that integrates an SNR-based teachers-student technique with a time-domain U-Net to address this problem. Specifically, the method consists of multiple teacher models and one student model. We first train the teacher models on multiple narrow SNR ranges that do not overlap, so that each teacher performs speech enhancement well within its own SNR range. We then select the teacher model that supervises the training of the student model according to the SNR of each training sample. As a result, the student model can perform speech enhancement under both high and low SNR. To evaluate the proposed method, we constructed a dataset with SNRs ranging from -20 dB to 20 dB based on a public dataset. We experimentally analyzed the effectiveness of the SNR-based teachers-student technique and compared the proposed method with several state-of-the-art methods.
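The two mechanical steps in the abstract, building noisy samples at a chosen SNR and picking a teacher by the SNR of each sample, can be illustrated with a minimal Python sketch. The range boundaries in `EDGES` and the function names are assumptions made for illustration; the abstract gives neither the actual teacher ranges nor the training code, and the time-domain U-Net itself is not shown.

```python
import bisect
import math

# Hypothetical boundaries (dB) of four non-overlapping teacher ranges:
# [-20, -10), [-10, 0), [0, 10), [10, 20]. Not taken from the paper.
EDGES = [-20, -10, 0, 10, 20]


def select_teacher(snr_db):
    """Return the index of the teacher whose SNR range contains snr_db."""
    if not EDGES[0] <= snr_db <= EDGES[-1]:
        raise ValueError(f"SNR {snr_db} dB is outside the covered range")
    # bisect_right counts the edges <= snr_db; the clamp keeps the
    # top edge (20 dB) inside the last range.
    return min(bisect.bisect_right(EDGES, snr_db) - 1, len(EDGES) - 2)


def mix_at_snr(clean, noise, snr_db):
    """Scale noise so that the clean-to-noise power ratio is snr_db, then add.

    Both signals are equal-length sequences of float samples.
    """
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(x * x for x in noise) / len(noise)
    scale = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [c + scale * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR of a mixture given its clean component."""
    residual_power = sum((y - c) ** 2 for c, y in zip(clean, noisy))
    clean_power = sum(c * c for c in clean)
    return 10 * math.log10(clean_power / residual_power)
```

In a training loop following this scheme, each noisy sample would be passed to `select_teacher` with its known mixing SNR, and the output of the selected teacher would serve as the supervision signal for the student on that sample.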
