Paper title
Ultra-light deep MIR by trimming lottery tickets
Paper authors
Paper abstract
Current state-of-the-art results in Music Information Retrieval are largely dominated by deep learning approaches. These provide unprecedented accuracy across all tasks. However, the consistently overlooked downside of these models is their stunningly massive complexity, which seems concomitantly crucial to their success. In this paper, we address this issue by proposing a model pruning method based on the lottery ticket hypothesis. We modify the original approach to allow for explicitly removing parameters, through structured trimming of entire units, instead of simply masking individual weights. This leads to models which are effectively lighter in terms of size, memory and number of operations. We show that our proposal can remove up to 90% of the model parameters without loss of accuracy, leading to ultra-light deep MIR models. We confirm the surprising result that, at smaller compression ratios (removing up to 85% of a network), lighter models consistently outperform their heavier counterparts. We exhibit these results on a large array of MIR tasks including audio classification, pitch recognition, chord extraction, drum transcription and onset estimation. The resulting ultra-light deep learning models for MIR can run on CPU, and can even fit on embedded devices with minimal degradation of accuracy.
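The following is a minimal, illustrative sketch (in PyTorch, and not the paper's implementation) of the distinction the abstract draws between masking individual weights, as in the original lottery ticket procedure, and structured trimming of entire units, which yields a genuinely smaller layer. All function names and the keep_ratio parameter are illustrative assumptions.

```python
# Sketch only: contrasts unstructured weight masking with structured unit trimming.
import torch
import torch.nn as nn


def mask_individual_weights(layer: nn.Linear, keep_ratio: float) -> torch.Tensor:
    """Unstructured pruning: zero the smallest-magnitude weights.
    The layer keeps its original shape; the zeroed entries still cost memory and FLOPs."""
    w = layer.weight.data
    k = int(keep_ratio * w.numel())
    threshold = w.abs().flatten().kthvalue(w.numel() - k).values
    mask = (w.abs() > threshold).float()
    layer.weight.data *= mask
    return mask


def trim_units(layer: nn.Linear, keep_ratio: float) -> nn.Linear:
    """Structured trimming: drop whole output units (rows of the weight matrix),
    producing a smaller layer with fewer parameters and operations."""
    w = layer.weight.data
    unit_scores = w.abs().sum(dim=1)                    # L1 norm per output unit
    n_keep = max(1, int(keep_ratio * w.size(0)))
    keep_idx = unit_scores.topk(n_keep).indices.sort().values
    trimmed = nn.Linear(layer.in_features, n_keep, bias=layer.bias is not None)
    trimmed.weight.data = w[keep_idx].clone()
    if layer.bias is not None:
        trimmed.bias.data = layer.bias.data[keep_idx].clone()
    return trimmed


if __name__ == "__main__":
    masked_layer = nn.Linear(128, 64)
    mask = mask_individual_weights(masked_layer, keep_ratio=0.1)
    smaller = trim_units(nn.Linear(128, 64), keep_ratio=0.1)
    print(mask.shape)              # torch.Size([64, 128]) -- same shape, mostly zeros
    print(smaller.weight.shape)    # torch.Size([6, 128])  -- actually smaller
```

Note that in a full network, removing output units from one layer also requires trimming the corresponding input columns of the following layer so that shapes remain consistent; the sketch above only shows a single layer.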