Data-Efficient Ranking Distillation for Image Retrieval

Paper title

Data-Efficient Ranking Distillation for Image Retrieval

Authors

Zakaria Laskar, Juho Kannala

Abstract

Recent advances in deep learning have led to rapid developments in the field of image retrieval. However, the best-performing architectures incur significant computational cost. Recent approaches tackle this issue using knowledge distillation to transfer knowledge from a deeper and heavier architecture to a much smaller network. In this paper we address knowledge distillation for metric learning problems. Unlike previous approaches, our proposed method jointly addresses the following constraints: i) limited queries to the teacher model, ii) a black-box teacher model with access only to the final output representation, and iii) a small fraction of the original training data without any ground-truth labels. In addition, the distillation method does not require the student and teacher to have the same dimensionality. Addressing these constraints reduces computation requirements and dependency on large-scale training datasets, and covers practical scenarios of limited or partial access to private data such as teacher models or the corresponding training data/labels. The key idea is to augment the original training set with additional samples by performing linear interpolation in the final output representation space. Distillation is then performed in the joint space of original and augmented teacher-student sample representations. Results demonstrate that our approach can match baseline models trained with full supervision. In low training sample settings, our approach outperforms the fully supervised approach on two challenging image retrieval datasets, ROxford5k and RParis6k \cite{Roxf}, with the least possible teacher supervision.
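The key idea in the abstract (interpolating in the output representation space, then distilling over the original and augmented samples together) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the loss or the sampling scheme, so the uniform interpolation weights, the L2 normalization, the squared-difference loss on cosine-similarity matrices, and all function names below are assumptions chosen for illustration. Random arrays stand in for real teacher and student embeddings.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against zero vectors)."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def interpolate_pairs(emb, idx_a, idx_b, lam):
    """Linearly interpolate between rows idx_a and idx_b of emb.

    Hypothetical helper: the same (idx_a, idx_b, lam) is applied to teacher
    and student embeddings so the augmented samples stay paired.
    """
    mixed = lam[:, None] * emb[idx_a] + (1.0 - lam[:, None]) * emb[idx_b]
    return l2_normalize(mixed)


def similarity_distillation_loss(teacher, student):
    """Mean squared difference between pairwise cosine-similarity matrices.

    Assumed loss. It compares N x N similarity matrices, so teacher and
    student may have different embedding dimensions.
    """
    sim_teacher = teacher @ teacher.T
    sim_student = student @ student.T
    return float(np.mean((sim_teacher - sim_student) ** 2))


rng = np.random.default_rng(0)
n, d_teacher, d_student, n_aug = 8, 16, 4, 12

# Stand-ins for embeddings of the same n unlabeled images.
teacher = l2_normalize(rng.normal(size=(n, d_teacher)))
student = l2_normalize(rng.normal(size=(n, d_student)))

# Sample pairs and interpolation weights once, reuse for both models.
idx_a = rng.integers(0, n, size=n_aug)
idx_b = rng.integers(0, n, size=n_aug)
lam = rng.uniform(0.0, 1.0, size=n_aug)

# Joint space: original samples followed by interpolated samples.
teacher_joint = np.vstack([teacher, interpolate_pairs(teacher, idx_a, idx_b, lam)])
student_joint = np.vstack([student, interpolate_pairs(student, idx_a, idx_b, lam)])

loss = similarity_distillation_loss(teacher_joint, student_joint)
print(teacher_joint.shape, student_joint.shape, loss)
```

In this sketch the augmented teacher representations are computed from embeddings that were already obtained, so no additional teacher queries are needed, and the loss only compares similarity structure, which is why the 16-dimensional teacher and 4-dimensional student can be matched directly.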
