Paper Title
Towards Linking the Lakh and IMSLP Datasets
Paper Authors
Abstract
This paper investigates the problem of matching a MIDI file against a large database of piano sheet music images. Previous sheet-audio and sheet-MIDI alignment approaches have primarily focused on a 1-to-1 alignment task, which is not a scalable solution for retrieval from large databases. We propose a method for scalable cross-modal retrieval that might be used to link the Lakh MIDI dataset with IMSLP sheet music data. Our approach is to modify a previously proposed feature representation called a symbolic bootleg score to be suitable for hashing. On a database of 5,000 piano scores containing 55,000 individual sheet music images, our system achieves a mean reciprocal rank of 0.84 and an average retrieval time of 25.4 seconds.
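For readers unfamiliar with the reported metric, mean reciprocal rank (MRR) averages, over all queries, the reciprocal of the rank at which the correct item appears in the returned list. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `mean_reciprocal_rank`, the convention that a query whose correct score is never retrieved contributes 0, and the example ranks are all illustrative assumptions.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Compute MRR from the 1-based rank of the correct item for each query.

    A rank of None means the correct item was not retrieved at all; under the
    assumption made here, such a query contributes 0 to the sum.
    """
    if not ranks:
        raise ValueError("at least one query is required")
    total = 0.0
    for rank in ranks:
        if rank is None:
            continue  # correct item missing from the results: reciprocal rank 0
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    return total / len(ranks)


# Hypothetical ranks for four MIDI queries (illustrative values only):
# two hits at rank 1, one at rank 2, and one query with no match returned.
example_ranks = [1, 1, 2, None]
example_mrr = mean_reciprocal_rank(example_ranks)
print(f"MRR = {example_mrr:.3f}")
```

Under this definition an MRR of 1.0 means the correct score is ranked first for every query, so values close to 1 indicate that the matching score usually appears at or near the top of the ranked list.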