StoRIR: Stochastic Room Impulse Response Generation for Audio Data Augmentation

Paper title

StoRIR: Stochastic Room Impulse Response Generation for Audio Data Augmentation

Authors

Piotr Masztalski, Mateusz Matuszewski, Karol Piaskowski, Michał Romaniuk

Abstract

In this paper we introduce StoRIR, a stochastic room impulse response generation method dedicated to audio data augmentation in machine learning applications. In contrast to geometrical methods such as image-source or ray tracing, this technique does not require prior definition of room geometry, absorption coefficients, or microphone and source placement, and depends solely on the acoustic parameters of the room. The method is intuitive, easy to implement, and allows generating RIRs of very complicated enclosures. We show that StoRIR, when used for audio data augmentation in a speech enhancement task, allows deep learning models to achieve better results on a wide range of metrics than the conventional image-source method, improving many of them by more than 5%. We publish a Python implementation of StoRIR online.
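
The abstract does not spell out the algorithm, so the following is not the StoRIR method itself. It is a minimal sketch of the general idea of drawing a random impulse response from an acoustic parameter alone, with no room geometry. It assumes the parameter is the reverberation time RT60 and uses a textbook model: Gaussian noise shaped by an exponential envelope whose amplitude falls by 60 dB after RT60 seconds. The function names `decaying_noise_rir` and `apply_rir`, their parameters, and the default response length of 1.5 times RT60 are all hypothetical choices made for illustration.

```python
import math
import random


def decaying_noise_rir(rt60, sample_rate=16000, length_s=None, seed=None):
    """Return a synthetic impulse response as a list of floats.

    Assumption: Gaussian noise times an exponential envelope that drops
    by 60 dB (a factor of 1000 in amplitude) after rt60 seconds. This is
    a generic model, not the algorithm from the StoRIR paper.
    """
    if rt60 <= 0:
        raise ValueError("rt60 must be positive")
    rng = random.Random(seed)
    if length_s is None:
        length_s = 1.5 * rt60  # hypothetical default length
    n = max(1, int(round(length_s * sample_rate)))

    rir = []
    for i in range(n):
        t = i / sample_rate
        envelope = 10.0 ** (-3.0 * t / rt60)  # -60 dB at t == rt60
        rir.append(rng.gauss(0.0, 1.0) * envelope)

    # Peak-normalize so the largest absolute sample is 1.0.
    peak = max(abs(x) for x in rir)
    return [x / peak for x in rir]


def apply_rir(dry, rir):
    """Convolve a dry signal with an impulse response.

    The output is truncated to the length of the dry signal so that
    augmented and clean examples stay aligned.
    """
    wet = [0.0] * len(dry)
    for i, d in enumerate(dry):
        for j, h in enumerate(rir):
            if i + j >= len(dry):
                break
            wet[i + j] += d * h
    return wet
```

For augmentation, one would draw a fresh `rt60` (and seed) per training example and pass each clean utterance through `apply_rir`. The pure-Python convolution keeps the sketch dependency-free; for real audio lengths an FFT-based convolution from a numerical library would be the practical choice.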
