Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding

Paper Title

Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding

Authors

Michela Paganini, Jessica Zosa Forde

Abstract

The observation of sparse trainable sub-networks within over-parametrized networks - also known as Lottery Tickets (LTs) - has prompted inquiries around their trainability, scaling, uniqueness, and generalization properties. Across 28 combinations of image classification tasks and architectures, we discover differences in the connectivity structure of LTs found through different iterative pruning techniques, thus disproving their uniqueness and connecting emergent mask structure to the choice of pruning. In addition, we propose a consensus-based method for generating refined lottery tickets. This lottery ticket denoising procedure, based on the principle that parameters that always go unpruned across different tasks more reliably identify important sub-networks, is capable of selecting a meaningful portion of the architecture in an embarrassingly parallel way, while quickly discarding extra parameters without the need for further pruning iterations. We successfully train these sub-networks to performance comparable to that of ordinary lottery tickets.
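
The consensus idea in the abstract can be illustrated with a minimal sketch. It assumes each lottery ticket is represented as a flat binary mask (1 = weight kept, 0 = weight pruned) over the same architecture, and that the refined ticket keeps only the weights that survive in every mask. The function names, the toy masks, and the use of Jaccard overlap as a similarity measure are illustrative assumptions, not details taken from the paper.

```python
from typing import List

# Assumption: a ticket is a flat binary mask, 1 = weight kept, 0 = weight pruned.
Mask = List[int]


def consensus_mask(masks: List[Mask]) -> Mask:
    """Keep a weight only if it is unpruned in every input mask."""
    if not masks:
        raise ValueError("need at least one mask")
    length = len(masks[0])
    if any(len(m) != length for m in masks):
        raise ValueError("all masks must have the same length")
    return [int(all(m[i] for m in masks)) for i in range(length)]


def jaccard_similarity(a: Mask, b: Mask) -> float:
    """Overlap of kept weights: |a AND b| / |a OR b| (illustrative metric)."""
    if len(a) != len(b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return intersection / union if union else 1.0


# Hypothetical masks for the same 8-weight layer, one per task or pruning run.
mask_a = [1, 1, 0, 1, 0, 1, 1, 0]
mask_b = [1, 0, 0, 1, 1, 1, 1, 0]
mask_c = [1, 1, 0, 1, 0, 0, 1, 1]

refined = consensus_mask([mask_a, mask_b, mask_c])
similarity_ab = jaccard_similarity(mask_a, mask_b)
```

Because each input mask can be produced independently, the only joint step in this sketch is the final element-wise intersection, which is consistent with the abstract's description of the procedure as embarrassingly parallel. The refined mask is never denser than the sparsest input mask.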
