Paper Title

Semi-supervised Neural Chord Estimation Based on a Variational Autoencoder with Latent Chord Labels and Features

Authors

Yiming Wu, Tristan Carsault, Eita Nakamura, Kazuyoshi Yoshii

Abstract

This paper describes a statistically principled semi-supervised method for automatic chord estimation (ACE) that can make effective use of music signals regardless of the availability of chord annotations. The typical approach to ACE is to train a deep classification model (neural chord estimator) in a supervised manner by using only annotated music signals. In this discriminative approach, prior knowledge about chord label sequences (the model output) has scarcely been taken into account. In contrast, we propose a unified generative and discriminative approach in the framework of amortized variational inference. More specifically, we formulate a deep generative model that represents the generative process of chroma vectors (observed variables) from discrete labels and continuous features (latent variables), which are assumed to follow a Markov model favoring self-transitions and a standard Gaussian distribution, respectively. Given chroma vectors as observed data, the posterior distributions of the latent labels and features are computed approximately by using deep classification and recognition models, respectively. These three models form a variational autoencoder and can be trained jointly in a semi-supervised manner. The experimental results show that regularizing the classification model with the Markov prior of chord labels and the generative model of chroma vectors improved the performance of ACE even under the supervised condition. Semi-supervised learning with additional non-annotated data can further improve the performance.
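
To make the "Markov model favoring self-transitions" concrete, the following is a minimal sketch of such a prior over chord label sequences. It is an illustration only, not the authors' implementation: the function names, the number of chord classes (25), the self-transition probability (0.9), and the uniform initial and off-diagonal probabilities are all assumptions chosen for the example.

```python
import numpy as np


def self_transition_matrix(num_labels: int, stay_prob: float) -> np.ndarray:
    """Build a row-stochastic transition matrix that favors staying on the same label.

    Hypothetical helper: the probability mass not used for staying is spread
    uniformly over all other labels (an assumption for illustration).
    """
    if num_labels < 2:
        raise ValueError("num_labels must be at least 2")
    if not 0.0 < stay_prob < 1.0:
        raise ValueError("stay_prob must lie strictly between 0 and 1")
    move_prob = (1.0 - stay_prob) / (num_labels - 1)
    trans = np.full((num_labels, num_labels), move_prob)
    np.fill_diagonal(trans, stay_prob)
    return trans


def markov_log_prior(labels, trans: np.ndarray, initial=None) -> float:
    """Log-probability of a non-empty label sequence under a first-order Markov chain."""
    labels = np.asarray(labels, dtype=int)
    num_labels = trans.shape[0]
    if initial is None:
        # Assumed uniform initial distribution over chord labels.
        initial = np.full(num_labels, 1.0 / num_labels)
    log_p = np.log(initial[labels[0]])
    # Sum the log transition probabilities over consecutive label pairs.
    log_p += np.log(trans[labels[:-1], labels[1:]]).sum()
    return float(log_p)


# Illustrative values only (assumed, not taken from the paper).
trans = self_transition_matrix(num_labels=25, stay_prob=0.9)
steady = markov_log_prior([3, 3, 3, 3, 3, 3], trans)
jumpy = markov_log_prior([3, 7, 3, 7, 3, 7], trans)
```

Under this prior, a sequence that holds one chord (`steady`) scores a higher log-probability than one that changes label at every frame (`jumpy`), which is the kind of temporal smoothness such a prior would encourage in a classifier's output.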
