Paper Title

Cross-modal Center Loss

Authors

Longlong Jing, Elahe Vahdani, Jiaxing Tan, Yingli Tian

Abstract

Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities. Unlike the existing methods which usually learn from the features extracted by offline networks, in this paper, we propose an approach to jointly train the components of the cross-modal retrieval framework with metadata, and enable the network to find optimal features. The proposed end-to-end framework is updated with three loss functions: 1) a novel cross-modal center loss to eliminate cross-modal discrepancy, 2) cross-entropy loss to maximize inter-class variations, and 3) mean-square-error loss to reduce modality variations. In particular, our proposed cross-modal center loss minimizes the distances of features from objects belonging to the same class across all modalities. Extensive experiments have been conducted on retrieval tasks across multiple modalities, including 2D image, 3D point cloud, and mesh data. The proposed framework significantly outperforms the state-of-the-art methods on the ModelNet40 dataset.
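The abstract only summarizes the cross-modal center loss and does not give its exact formulation. The following is a minimal PyTorch-style sketch under the assumption of a center-loss-like objective in which one set of learnable class centers is shared by all modalities, so that same-class features from every modality (e.g. image, point cloud, mesh) are pulled toward a common center. The class name CrossModalCenterLoss, the tensor shapes, and the averaging over modalities are illustrative assumptions, not the paper's exact definition.

import torch
import torch.nn as nn

class CrossModalCenterLoss(nn.Module):
    """Sketch of a cross-modal center loss (assumed formulation).

    A single set of learnable class centers is shared across all
    modalities; features from every modality are pulled toward the
    center of their ground-truth class, shrinking both intra-class
    and cross-modal feature distances.
    """

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        # Shared class centers, one per class, common to all modalities.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features_per_modality, labels):
        # features_per_modality: list of (batch, feat_dim) tensors,
        #   one tensor per modality for the same batch of objects.
        # labels: (batch,) class indices shared across modalities.
        loss = 0.0
        for feats in features_per_modality:
            # Squared distance of each feature to its class center.
            loss = loss + ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()
        return loss / len(features_per_modality)

# Usage sketch: three modalities producing 512-d features for the same objects
# (40 classes, as in ModelNet40).
if __name__ == "__main__":
    criterion = CrossModalCenterLoss(num_classes=40, feat_dim=512)
    labels = torch.randint(0, 40, (8,))
    img_feat, pc_feat, mesh_feat = (torch.randn(8, 512) for _ in range(3))
    loss = criterion([img_feat, pc_feat, mesh_feat], labels)
    loss.backward()

In the full framework described by the abstract, this term would be combined with a cross-entropy loss per modality and a mean-square-error loss between modalities; the weighting of the three terms is not specified in this excerpt.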
