Learning to Weight Samples for Dynamic Early-exiting Networks

Paper Title

Learning to Weight Samples for Dynamic Early-exiting Networks

Authors

Yizeng Han, Yifan Pu, Zihang Lai, Chaofei Wang, Shiji Song, Junfeng Cao, Wenhui Huang, Chao Deng, Gao Huang

Abstract

Early exiting is an effective paradigm for improving the inference efficiency of deep networks. By constructing classifiers with varying resource demands (the exits), such networks allow easy samples to be output at early exits, removing the need to execute deeper layers. While existing works mainly focus on the architectural design of multi-exit networks, the training strategies for such models remain largely unexplored. Current state-of-the-art models treat all samples equally during training and ignore the early-exiting behavior at test time, which leaves a gap between training and testing. In this paper, we propose to bridge this gap by sample weighting. Intuitively, easy samples, which generally exit early in the network during inference, should contribute more to training the early classifiers. Hard samples, which mostly exit from deeper layers, should instead be emphasized when training the late classifiers. Our work proposes to adopt a weight prediction network to weight the loss of different training samples at each exit. This weight prediction network and the backbone model are jointly optimized under a meta-learning framework with a novel optimization objective. By bringing the adaptive behavior during inference into the training phase, we show that the proposed weighting mechanism consistently improves the trade-off between classification accuracy and inference efficiency. Code is available at https://github.com/LeapLabTHU/L2W-DEN.
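The idea of weighting each sample's loss separately at each exit can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy logits, and the hand-set weights are all hypothetical. In the paper the weights come from a meta-learned weight prediction network, which is not modeled here. The confidence-threshold exit rule is likewise an assumption, since the abstract does not state the exit criterion.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability of the true class.
    return -math.log(softmax(logits)[label])

def weighted_multi_exit_loss(exit_logits, labels, weights):
    # exit_logits[k][i]: logits of sample i at exit k.
    # weights[k][i]: weight of sample i's loss at exit k
    # (hand-set here; an assumption, not the learned weights).
    total = 0.0
    for k, logits_k in enumerate(exit_logits):
        for i, label in enumerate(labels):
            total += weights[k][i] * cross_entropy(logits_k[i], label)
    return total / len(labels)

def early_exit_predict(logits_per_exit, threshold):
    # Assumed exit rule: leave at the first exit whose top softmax
    # probability reaches the threshold; the last exit always answers.
    last = len(logits_per_exit) - 1
    for k, logits in enumerate(logits_per_exit):
        probs = softmax(logits)
        conf = max(probs)
        if conf >= threshold or k == last:
            return probs.index(conf), k

# Toy data: 2 exits, 2 samples, 2 classes.
# Sample 0 is "easy" (confident at exit 0); sample 1 is "hard".
exit_logits = [
    [[4.0, 0.0], [0.2, 0.0]],  # exit 0
    [[5.0, 0.0], [0.0, 3.0]],  # exit 1
]
labels = [0, 1]

uniform = [[1.0, 1.0], [1.0, 1.0]]  # every sample treated the same
skewed = [[1.5, 0.5], [0.5, 1.5]]   # easy sample up-weighted early, hard one late

loss_uniform = weighted_multi_exit_loss(exit_logits, labels, uniform)
loss_skewed = weighted_multi_exit_loss(exit_logits, labels, skewed)

# Which exit each sample would leave from at inference time.
exits = [
    early_exit_predict([exit_logits[k][i] for k in range(2)], threshold=0.9)
    for i in range(len(labels))
]
print(loss_uniform, loss_skewed, exits)
```

With the skewed weights, the hard sample's large loss at the first exit counts for less, so the first classifier is pushed mainly by the samples it would actually answer at test time. That is the training/testing alignment the abstract describes.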
