Data Cleansing with Contrastive Learning for Vocal Note Event Annotations

Paper Title

Data Cleansing with Contrastive Learning for Vocal Note Event Annotations

Authors

Gabriel Meseguer-Brocal, Rachel Bittner, Simon Durand, Brian Brost

Abstract

Data cleansing is a well-studied strategy for cleaning erroneous labels in datasets, but it has not yet been widely adopted in Music Information Retrieval. Previously proposed data cleansing models do not consider structured (e.g. time-varying) labels, such as those common to music data. We propose a novel data cleansing model for time-varying, structured labels which exploits the local structure of the labels, and demonstrate its usefulness for vocal note event annotations in music. Our model is trained in a contrastive learning manner by automatically contrasting likely correct label pairs against local deformations of them. We demonstrate that the accuracy of a transcription model improves greatly when trained using our proposed strategy compared with the accuracy when trained using the original dataset. Additionally, we use our model to estimate the annotation error rates in the DALI dataset, and highlight other potential uses for this type of model.
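
The contrastive idea in the abstract (pairing a likely correct label with a locally deformed copy of it) can be illustrated with a small sketch. The abstract does not say which deformations or which loss the authors use, so everything below is an assumption made for illustration: the `NoteEvent` fields, the time-shift and transposition deformations, their ranges, and the margin ranking loss standing in for the contrastive objective are all hypothetical.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NoteEvent:
    """Hypothetical note annotation: times in seconds, pitch as a MIDI number."""
    onset: float
    offset: float
    pitch: int


def deform(note, rng, max_shift=0.2, max_semitones=2):
    """Return a locally deformed copy of `note` (assumed deformation scheme)."""
    if rng.choice(["shift", "pitch"]) == "shift":
        # Shift the note in time by a non-zero amount, keeping its duration.
        delta = rng.choice([-1, 1]) * rng.uniform(0.05, max_shift)
        if note.onset + delta < 0:
            delta = -delta  # never move a note before time zero
        new_onset = note.onset + delta
        return replace(note, onset=new_onset,
                       offset=new_onset + (note.offset - note.onset))
    # Otherwise transpose by a non-zero number of semitones.
    steps = [s for s in range(-max_semitones, max_semitones + 1) if s != 0]
    return replace(note, pitch=note.pitch + rng.choice(steps))


def make_contrastive_pairs(notes, seed=0):
    """Pair each likely correct note with one deformed (likely incorrect) copy."""
    rng = random.Random(seed)
    return [(note, deform(note, rng)) for note in notes]


def margin_ranking_loss(score_clean, score_deformed, margin=1.0):
    """Zero once the clean label outscores its deformed copy by `margin`."""
    return max(0.0, margin - (score_clean - score_deformed))
```

In a sketch like this, a scoring model (not shown) would take the audio together with a candidate label and be trained so that `margin_ranking_loss` goes to zero on every pair. Labels that the trained model scores poorly would then be candidates for removal from the training set.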

