Paper title

On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks

Authors

Serkan Sulun, Matthew E. P. Davies

Abstract

In this paper, we address a sub-topic of the broad domain of audio enhancement, namely musical audio bandwidth extension. We formulate the bandwidth extension problem using deep neural networks, where a band-limited signal is provided as input to the network, with the goal of reconstructing a full-bandwidth output. Our main contribution centers on the impact of the choice of low-pass filter when training and subsequently testing the network. For two different state-of-the-art deep architectures, ResNet and U-Net, we demonstrate that when the training and testing filters are matched, improvements in signal-to-noise ratio (SNR) of up to 7 dB can be obtained. However, when these filters differ, the improvement falls considerably and, under some training conditions, results in a lower SNR than the band-limited input. To circumvent this apparent overfitting to filter shape, we propose a data augmentation strategy which utilizes multiple low-pass filters during training and leads to improved generalization to unseen filtering conditions at test time.
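
The augmentation idea in the abstract can be sketched roughly as follows: each training example is made by low-pass filtering a full-bandwidth signal with a filter chosen at random from a small bank, so the network does not always see the same filter shape. This is only an illustrative sketch, not the authors' implementation. The windowed-sinc FIR design, the `FILTER_BANK` entries, the default cutoff of one quarter of the sample rate, and all function names are assumptions made for this example; the abstract does not specify which filters or cutoff were used.

```python
import math
import random

# Hypothetical filter bank: (window type, number of taps). Illustrative only.
FILTER_BANK = [("rectangular", 31), ("hann", 31), ("hamming", 31), ("hamming", 63)]


def windowed_sinc_lowpass(cutoff_hz, sample_rate, num_taps, window="hamming"):
    """Design FIR low-pass taps (windowed sinc), normalized to unit DC gain."""
    fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        if window == "hamming":
            w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        elif window == "hann":
            w = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        else:                             # rectangular window
            w = 1.0
        taps.append(ideal * w)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal, taps):
    """Apply a symmetric, odd-length FIR filter; output is aligned with the input."""
    half = (len(taps) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k              # centre the kernel on sample i
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out


def snr_db(reference, estimate):
    """Signal-to-noise ratio of an estimate against a reference, in dB."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return float("inf")
    return 10 * math.log10(signal_power / noise_power)


def make_training_pair(full_band, sample_rate, rng, cutoff_hz=None):
    """Return (band-limited input, full-band target) using a randomly chosen filter."""
    if cutoff_hz is None:
        cutoff_hz = sample_rate / 4       # assumed cutoff, not taken from the paper
    window, num_taps = rng.choice(FILTER_BANK)
    taps = windowed_sinc_lowpass(cutoff_hz, sample_rate, num_taps, window)
    return fir_filter(full_band, taps), full_band


if __name__ == "__main__":
    sr = 16000
    rng = random.Random(0)
    # Toy "music": one tone below the cutoff and one above it.
    audio = [math.sin(2 * math.pi * 500 * i / sr) + math.sin(2 * math.pi * 7000 * i / sr)
             for i in range(1024)]
    net_input, target = make_training_pair(audio, sr, rng)
    print("SNR of band-limited input (dB):", round(snr_db(target, net_input), 2))
```

Testing on a filter held out of `FILTER_BANK` would mimic the unseen-filter condition the abstract describes, with `snr_db` of the band-limited input serving as the baseline that a trained network's output should exceed.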
