Paper Title
Geometric Multimodal Contrastive Representation Learning
Authors
Abstract
Learning representations of multimodal data that are both informative and robust to missing modalities at test time remains a challenging problem, due to the inherent heterogeneity of data obtained from different channels. To address this, we present a novel Geometric Multimodal Contrastive (GMC) representation learning method consisting of two main components: i) a two-level architecture with modality-specific base encoders, which map an arbitrary number of modalities to intermediate representations of fixed dimensionality, and a shared projection head, which maps the intermediate representations to a latent representation space; ii) a multimodal contrastive loss function that encourages the geometric alignment of the learned representations. We experimentally demonstrate that GMC representations are semantically rich and achieve state-of-the-art performance with missing modality information on three different learning problems, including prediction and reinforcement learning tasks.
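To make the second component concrete, the sketch below illustrates one common way such a multimodal contrastive alignment objective can be realized: an NT-Xent-style loss that pulls the modality-specific embedding of each sample toward the joint (complete-modality) embedding of the same sample, while pushing it away from other samples in the batch. This is a generic, hedged sketch in NumPy, not the exact GMC objective from the paper; the function names (`contrastive_alignment_loss`), temperature value, and batch setup are illustrative assumptions.

```python
import numpy as np

def l2_normalize(x):
    # L2-normalize rows so similarities are cosine similarities
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def contrastive_alignment_loss(z_joint, z_mod, temperature=0.1):
    """NT-Xent-style alignment loss (illustrative sketch, not the
    paper's exact objective): each modality-specific embedding is
    attracted to the joint embedding of the same sample (diagonal)
    and repelled from other samples' joint embeddings (off-diagonal)."""
    z_joint = l2_normalize(z_joint)
    z_mod = l2_normalize(z_mod)
    logits = (z_joint @ z_mod.T) / temperature   # pairwise similarities
    idx = np.arange(len(z_joint))                 # positives on the diagonal
    # row-wise log-softmax, then negative log-likelihood of the positives
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[idx, idx].mean()

# usage: batch of 4 samples with 8-dimensional embeddings
rng = np.random.default_rng(0)
z_joint = rng.normal(size=(4, 8))
# a modality embedding nearly aligned with the joint one ...
z_aligned = z_joint + 0.01 * rng.normal(size=(4, 8))
# ... versus an unrelated one
z_random = rng.normal(size=(4, 8))
loss_aligned = contrastive_alignment_loss(z_joint, z_aligned)
loss_random = contrastive_alignment_loss(z_joint, z_random)
```

Geometrically aligned representations yield a much lower loss than unrelated ones, which is the property that lets a single modality stand in for the full set at test time.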