Paper title
Heterogeneous Knowledge Distillation using Information Flow Modeling
Paper authors
Paper abstract
Knowledge Distillation (KD) methods transfer the knowledge encoded in a large, complex teacher model into a smaller and faster student model. Early methods were usually limited to transferring knowledge only between the last layers of the networks, while later approaches perform multi-layer KD, further increasing the accuracy of the student. However, despite their improved performance, these methods still suffer from several limitations that restrict both their efficiency and flexibility. First, existing KD methods typically ignore that neural networks go through different learning phases during training, each of which often requires a different type of supervision. Furthermore, existing multi-layer KD methods are usually unable to effectively handle networks with significantly different architectures (heterogeneous KD). In this paper we propose a novel KD method that works by modeling the information flow through the various layers of the teacher model and then training a student model to mimic this information flow. The proposed method overcomes the aforementioned limitations by using an appropriate supervision scheme during the different phases of the training process, and by designing and training an appropriate auxiliary teacher model that acts as a proxy capable of "explaining" the way the teacher works to the student. The effectiveness of the proposed method is demonstrated using four image datasets and several different evaluation setups.
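For orientation, the last-layer transfer that the abstract attributes to early KD methods can be sketched as a divergence between softened teacher and student output distributions. This is a minimal illustration of that classic baseline, not the information-flow method proposed in the paper. The function names `log_softmax` and `last_layer_kd_loss` and the default `temperature` are assumptions chosen for this example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def last_layer_kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between softened last-layer distributions.

    Hypothetical helper for illustration only: it covers the last-layer
    baseline and none of the multi-layer or phase-dependent supervision
    described in the abstract.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
```

The loss is zero when the student reproduces the teacher's output distribution and grows as the two diverge. Because it compares only the final outputs, it says nothing about how information moves through the intermediate layers, which is the gap the abstract highlights.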