Problems using deep generative models for probabilistic audio source separation

Paper title

Problems using deep generative models for probabilistic audio source separation

Authors

Frank, Maurice; Ilse, Maximilian

Abstract

Recent advancements in deep generative modeling make it possible to learn prior distributions from complex data that can subsequently be used for Bayesian inference. However, we find that distributions learned by deep generative models for audio signals do not exhibit the properties necessary for tasks like audio source separation using a probabilistic approach. We observe that the learned prior distributions are either discriminative and extremely peaked, or smooth and non-discriminative. We quantify this behavior for two types of deep generative models on two audio datasets.
