Paper Title

BEHM-GAN: Bandwidth Extension of Historical Music using Generative Adversarial Networks

Authors

Eloi Moliner, Vesa Välimäki

Abstract

Audio bandwidth extension aims to expand the spectrum of narrow-band audio signals. Although this topic has been broadly studied during recent years, the particular problem of extending the bandwidth of historical music recordings remains an open challenge. This paper proposes BEHM-GAN, a model based on generative adversarial networks, as a practical solution to this problem. The proposed method works with the complex spectrogram representation of audio and, thanks to a dedicated regularization strategy, can effectively extend the bandwidth of out-of-distribution real historical recordings. The BEHM-GAN is designed to be applied as a second step after denoising the recording to suppress any additive disturbances, such as clicks and background noise. We train and evaluate the method using solo piano classical music. The proposed method outperforms the compared baselines in both objective and subjective experiments. The results of a formal blind listening test show that BEHM-GAN significantly increases the perceptual sound quality in early-20th-century gramophone recordings. For several items, there is a substantial improvement in the mean opinion score after enhancing historical recordings with the proposed bandwidth-extension algorithm. This study represents a relevant step toward data-driven music restoration in real-world scenarios.
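As a rough illustration of the problem setting (not the authors' implementation), the sketch below simulates a narrow-band signal from a wideband one and computes a complex spectrogram for each. The sample rate, the 3 kHz cutoff, the frame sizes, and the helper names `lowpass_fft`, `complex_stft`, and `high_band_energy_ratio` are all assumptions chosen for the example and are not taken from the paper. A bandwidth-extension model would aim to restore the high-frequency content that the narrow-band spectrogram lacks.

```python
import numpy as np


def lowpass_fft(x, sr, cutoff_hz):
    """Crude brick-wall low-pass: zero every FFT bin above cutoff_hz."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def complex_stft(x, n_fft=512, hop=128):
    """Complex spectrogram of shape (frames, n_fft // 2 + 1), Hann window."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def high_band_energy_ratio(spec, sr, n_fft, cutoff_hz):
    """Fraction of spectrogram energy located above cutoff_hz."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    power = np.abs(spec) ** 2
    return power[:, freqs > cutoff_hz].sum() / power.sum()


# Hypothetical test signal: one tone below and one above the assumed cutoff.
sr = 16000
cutoff_hz = 3000.0
t = np.arange(sr) / sr
wideband = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 5000 * t)

# Simulated "historical" narrow-band version of the same signal.
narrowband = lowpass_fft(wideband, sr, cutoff_hz)

spec_wide = complex_stft(wideband)
spec_narrow = complex_stft(narrowband)

ratio_wide = high_band_energy_ratio(spec_wide, sr, 512, cutoff_hz)
ratio_narrow = high_band_energy_ratio(spec_narrow, sr, 512, cutoff_hz)
print("high-band energy ratio, wideband:  ", ratio_wide)
print("high-band energy ratio, narrowband:", ratio_narrow)
```

The complex spectrogram keeps both magnitude and phase in each time-frequency bin, which is the kind of representation the abstract says the method operates on. Real historical recordings are degraded in more varied ways than this idealized low-pass filter, which is why the abstract stresses generalization to out-of-distribution recordings.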
