Paper title
DEEPCHORUS: A Hybrid Model of Multi-scale Convolution and Self-attention for Chorus Detection
Paper authors
Abstract
Chorus detection is a challenging problem in music signal processing because the chorus often repeats more than once in popular songs, usually with rich instrumentation and complex rhythmic forms. Most existing works focus on the receptiveness of chorus sections based on explicit features such as loudness and occurrence frequency. These prior assumptions about the chorus limit the generalization capacity of such methods and cause misdetection of other repeated sections such as the verse. To solve this problem, we propose DeepChorus, an end-to-end chorus detection model that reduces the engineering effort and the need for prior knowledge. The proposed model comprises two main structures: i) a Multi-Scale Network that derives preliminary representations of chorus segments, and ii) a Self-Attention Convolution Network that further processes the features into probability curves representing chorus presence. To obtain the final results, we apply an adaptive threshold to binarize the original curve. Experimental results show that DeepChorus outperforms existing state-of-the-art methods in most cases.
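The final binarization step can be illustrated with a minimal sketch. The abstract does not state how the adaptive threshold is computed, so the rule used here (a threshold equal to a scaled mean of each song's own probability curve) is only an assumption for illustration, and the names `binarize_curve`, `frames_to_segments`, `scale`, and `frame_seconds` are hypothetical rather than taken from DeepChorus.

```python
from statistics import mean


def binarize_curve(probs, scale=1.0):
    """Binarize a per-frame chorus probability curve.

    Assumption: the threshold adapts to each song by being `scale` times
    the mean of that song's own curve. The paper's actual rule may differ.
    """
    if not probs:
        return []
    threshold = scale * mean(probs)
    return [1 if p > threshold else 0 for p in probs]


def frames_to_segments(labels, frame_seconds=1.0):
    """Convert a 0/1 frame sequence into (start, end) segments in seconds."""
    segments = []
    start = None
    for i, label in enumerate(labels):
        if label and start is None:
            # A chorus run begins at this frame.
            start = i
        elif not label and start is not None:
            # The run ended at the previous frame.
            segments.append((start * frame_seconds, i * frame_seconds))
            start = None
    if start is not None:
        # The curve ends while still inside a chorus run.
        segments.append((start * frame_seconds, len(labels) * frame_seconds))
    return segments


# Hypothetical probability curve, one value per frame.
curve = [0.1, 0.2, 0.8, 0.9, 0.7, 0.2, 0.1, 0.6, 0.9, 0.8]
labels = binarize_curve(curve)
segments = frames_to_segments(labels)
```

Because the threshold is derived from each curve rather than fixed globally, a song whose predicted probabilities are uniformly low can still yield chorus segments. That is one plausible motivation for an adaptive rule, though the source does not spell out its reasoning.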