Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance

Paper title

Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance

Authors

Hao Hao Tan, Yin-Jyun Luo, Dorien Herremans

Abstract

We present a controllable neural audio synthesizer based on Gaussian Mixture Variational Autoencoders (GM-VAE), which can generate realistic piano performances in the audio domain that closely follow temporal conditions of two essential style features for piano performances: articulation and dynamics. We demonstrate how the model is able to apply fine-grained style morphing over the course of synthesizing the audio. This morphing is driven by conditions, which are latent variables that can be sampled from the prior or inferred from other pieces. One of the envisioned use cases is to inspire creative and brand-new interpretations of existing pieces of piano music.
