Paper Title

Teacher-Class Network: A Neural Network Compression Mechanism

Authors

Shaiq Munir Malik, Muhammad Umair Haider, Mohbat Tharani, Musab Rasheed, Murtaza Taj

Abstract

To reduce the overwhelming size of Deep Neural Networks (DNNs), the teacher-student methodology tries to transfer knowledge from a complex teacher network to a simple student network. We instead propose a novel method called the teacher-class network, consisting of a single teacher and multiple student networks (i.e., a class of students). Instead of transferring knowledge to one student only, the proposed method transfers a chunk of knowledge to each student. Our students are not trained on problem-specific logits; they are trained to mimic the knowledge (dense representation) learned by the teacher network, so the combined knowledge learned by the class of students can also be used to solve other problems. The proposed teacher-class architecture is evaluated on several benchmark datasets such as MNIST, Fashion MNIST, IMDB Movie Reviews, CamVid, CIFAR-10 and ImageNet, on multiple tasks including image classification, sentiment classification and segmentation. Our approach outperforms the state-of-the-art single-student approach in terms of accuracy as well as computational cost, while achieving a 10-30x reduction in parameters.
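The abstract describes the mechanism only at a high level. Below is a minimal sketch of the idea, assuming the teacher exposes a dense feature vector that is split evenly into one chunk per student, and that each student is trained with an MSE mimicry loss against its chunk; all module names, dimensions, and the training loop here are illustrative assumptions, not the paper's actual implementation:

```python
import torch
import torch.nn as nn

# Hypothetical teacher: a backbone producing a dense representation,
# followed by a task head (e.g. a classifier).
class Teacher(nn.Module):
    def __init__(self, in_dim=784, feat_dim=512, n_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 1024), nn.ReLU(),
                                      nn.Linear(1024, feat_dim))
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        feat = self.backbone(x)  # dense representation to be mimicked
        return feat, self.head(feat)

# Hypothetical student: a much smaller network that predicts
# one chunk of the teacher's dense representation.
class Student(nn.Module):
    def __init__(self, in_dim=784, chunk_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, chunk_dim))

    def forward(self, x):
        return self.net(x)

n_students, feat_dim = 4, 512
chunk_dim = feat_dim // n_students

teacher = Teacher(feat_dim=feat_dim).eval()  # assumed pre-trained and frozen
students = [Student(chunk_dim=chunk_dim) for _ in range(n_students)]
opts = [torch.optim.Adam(s.parameters(), lr=1e-3) for s in students]
mse = nn.MSELoss()

x = torch.randn(32, 784)  # stand-in batch of inputs
with torch.no_grad():
    feat, _ = teacher(x)
chunks = feat.chunk(n_students, dim=1)  # one chunk of knowledge per student

# One training step: each student mimics its assigned chunk.
for s, opt, target in zip(students, opts, chunks):
    opt.zero_grad()
    loss = mse(s(x), target)
    loss.backward()
    opt.step()

# At inference, the students' outputs are concatenated to rebuild the
# dense representation, which can then feed the teacher's head (or any
# other task-specific head, since the representation is not tied to logits).
with torch.no_grad():
    rebuilt = torch.cat([s(x) for s in students], dim=1)
    logits = teacher.head(rebuilt)
```

Because each student only regresses a slice of the dense representation rather than task-specific logits, the students are independent of one another and of the final task head, which is what allows them to be trained in parallel and reused for other downstream problems.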
