Paper Title
Evaluation-oriented Knowledge Distillation for Deep Face Recognition
Paper Authors
Paper Abstract
Knowledge distillation (KD) is a widely used technique that leverages a large network to improve the performance of a compact model. Previous KD approaches usually aim to guide the student to mimic the teacher's behavior completely in the representation space. However, such one-to-one corresponding constraints may lead to inflexible knowledge transfer from the teacher to the student, especially for students with low model capacity. Inspired by the ultimate goal of KD methods, we propose a novel Evaluation-oriented KD method (EKD) for deep face recognition that directly reduces the performance gap between the teacher and student models during training. Specifically, we adopt the commonly used evaluation metrics in face recognition, i.e., False Positive Rate (FPR) and True Positive Rate (TPR), as the performance indicators. According to the evaluation protocol, the critical pair relations that cause the TPR and FPR differences between the teacher and student models are selected. Then, the critical relations in the student are constrained to approximate the corresponding ones in the teacher by a novel rank-based loss function, giving more flexibility to the student with low capacity. Extensive experimental results on popular benchmarks demonstrate the superiority of our EKD over state-of-the-art competitors.
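To make the abstract's pipeline concrete, below is a minimal PyTorch sketch of the general idea: estimate the teacher's decision threshold at a target FPR over negative-pair similarities, select the "critical" pairs around that threshold, and pull the student's similarities for those pairs toward the teacher's. The helper names (`pair_similarities`, `ekd_loss`), the FPR target, and the hinge-style relaxation used here in place of the paper's rank-based loss are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch only; not the authors' official implementation.
import torch
import torch.nn.functional as F


def pair_similarities(embeddings, labels):
    """Cosine similarities of all unique sample pairs, split into
    positive (same identity) and negative (different identity) pairs."""
    emb = F.normalize(embeddings, dim=1)
    sim = emb @ emb.t()                                    # pairwise cosine similarity
    same = labels.unsqueeze(0) == labels.unsqueeze(1)      # identity-match mask
    n = labels.numel()
    triu = torch.triu(torch.ones(n, n, device=sim.device), diagonal=1).bool()
    return sim[same & triu], sim[(~same) & triu]           # positive sims, negative sims


def ekd_loss(student_emb, teacher_emb, labels, target_fpr=1e-3):
    """Evaluation-oriented distillation sketch: take the teacher's similarity
    threshold at a target FPR, select pairs the teacher resolves correctly at
    that operating point, and encourage the student's similarities for those
    pairs to approximate the teacher's (one-sided hinge as a relaxation)."""
    s_pos, s_neg = pair_similarities(student_emb, labels)
    with torch.no_grad():
        t_pos, t_neg = pair_similarities(teacher_emb, labels)
        # Threshold at the target FPR quantile of the teacher's negative pairs.
        thr = torch.quantile(t_neg, 1.0 - target_fpr)
        crit_pos = t_pos > thr    # positives the teacher accepts at this FPR
        crit_neg = t_neg <= thr   # negatives the teacher rejects at this FPR
    # Push critical positive sims up toward the teacher, critical negative sims down.
    loss_pos = F.relu(t_pos[crit_pos] - s_pos[crit_pos]).mean() if crit_pos.any() else s_pos.sum() * 0
    loss_neg = F.relu(s_neg[crit_neg] - t_neg[crit_neg]).mean() if crit_neg.any() else s_neg.sum() * 0
    return loss_pos + loss_neg
```

The one-sided hinge only penalizes the student when it is worse than the teacher at the chosen operating point, which reflects the flexibility the abstract attributes to the rank-based constraint: pairs the low-capacity student already handles well are left unconstrained.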