Paper Title
ByteCover: Cover Song Identification via Multi-Loss Training
Paper Authors
Paper Abstract
In this paper we present ByteCover, a new feature learning method for cover song identification (CSI). ByteCover builds on the classical ResNet model and adds two major improvements that further strengthen the model for CSI. First, we integrate instance normalization (IN) and batch normalization (BN) into IBN blocks, which are the main components of our ResNet-IBN model. With the help of the IBN blocks, our CSI model can learn features that are invariant to changes in musical attributes such as key, tempo, timbre, and genre, while preserving the version information. Second, we employ the BNNeck method to enable multi-loss training, so that a classification loss and a triplet loss are optimized jointly; in this way, both the inter-class discrimination and the intra-class compactness of cover songs are ensured at the same time. A set of experiments demonstrated the effectiveness and efficiency of ByteCover on multiple datasets, and on the Da-TACOS dataset ByteCover outperformed the best competing system by 20.9%.
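The two ideas named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the nested-list data layout, the channel split passed as `in_channels`, the Euclidean distance, the margin of 0.3, and the unweighted sum of the two losses are all assumptions made for illustration. Learned affine parameters and the network itself are omitted.

```python
import math


def _normalize(values, eps=1e-5):
    """Shift and scale a flat list of floats to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def ibn(x, in_channels):
    """Hypothetical IBN-style normalization (no learned scale or shift).

    x[n][c] is a flat list of activations for sample n and channel c.
    The first `in_channels` channels are instance-normalized (statistics
    taken per sample and per channel); the remaining channels are
    batch-normalized (statistics taken per channel over the whole batch).
    """
    n_samples, n_channels = len(x), len(x[0])
    out = [[None] * n_channels for _ in range(n_samples)]
    for c in range(n_channels):
        if c < in_channels:
            # Instance norm: each sample is normalized on its own.
            for n in range(n_samples):
                out[n][c] = _normalize(x[n][c])
        else:
            # Batch norm: pool the channel over all samples, then split back.
            size = len(x[0][c])
            pooled = _normalize([v for n in range(n_samples) for v in x[n][c]])
            for n in range(n_samples):
                out[n][c] = pooled[n * size:(n + 1) * size]
    return out


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on the gap between anchor-positive and anchor-negative distance."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def joint_loss(logits, label, anchor, positive, negative, margin=0.3):
    """Unweighted sum of the classification loss and the triplet loss (assumed)."""
    return cross_entropy(logits, label) + triplet_loss(
        anchor, positive, negative, margin
    )


# Two samples, two channels; channel 0 uses IN, channel 1 uses BN.
batch = [[[1.0, 3.0], [0.0, 2.0]], [[10.0, 30.0], [4.0, 6.0]]]
normalized = ibn(batch, in_channels=1)
```

In this toy batch, channel 0 of both samples normalizes to roughly the same values even though the second sample is ten times larger, which is the kind of per-sample invariance the abstract attributes to the IN part. Channel 1 keeps the relative offset between samples, since its statistics are shared across the batch. In BNNeck as it is commonly described, the triplet loss would take the embedding before a batch normalization layer and the classification logits would be computed from the embedding after it; the sketch leaves that wiring to the caller.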