Paper Title

Unsupervised Domain Adaptation through Inter-modal Rotation for RGB-D Object Recognition

Paper Authors

Mohammad Reza Loghmani, Luca Robbiano, Mirco Planamente, Kiru Park, Barbara Caputo, Markus Vincze

Paper Abstract

Unsupervised Domain Adaptation (DA) exploits the supervision of a label-rich source dataset to make predictions on an unlabeled target dataset by aligning the two data distributions. In robotics, DA is used to take advantage of automatically generated synthetic data, which come with "free" annotation, to make effective predictions on real data. However, existing DA methods are not designed to cope with the multi-modal nature of RGB-D data, which are widely used in robotic vision. We propose a novel RGB-D DA method that reduces the synthetic-to-real domain shift by exploiting the inter-modal relation between the RGB and depth image. Our method consists of training a convolutional neural network to solve, in addition to the main recognition task, the pretext task of predicting the relative rotation between the RGB and depth image. To evaluate our method and encourage further research in this area, we define two benchmark datasets for object categorization and instance recognition. With extensive experiments, we show the benefits of leveraging the inter-modal relations for RGB-D DA.
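
The two-task setup described in the abstract can be sketched in code. Below is a minimal sketch, assuming PyTorch with torchvision, ResNet-18 backbones for each modality, and four discrete relative rotations (0, 90, 180, 270 degrees); the names RGBDNet and make_rotation_batch, the single-layer heads, and the class count are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
from torchvision import models

class RGBDNet(nn.Module):
    def __init__(self, num_classes, num_rotations=4):
        super().__init__()
        # Two independent feature extractors, one per modality
        # (ResNet-18 truncated before its final fc layer).
        self.rgb_net = nn.Sequential(*list(models.resnet18().children())[:-1])
        self.depth_net = nn.Sequential(*list(models.resnet18().children())[:-1])
        feat_dim = 512 * 2  # concatenated RGB + depth features
        # Main head: object recognition, trained on labeled source data.
        self.cls_head = nn.Linear(feat_dim, num_classes)
        # Pretext head: relative RGB-depth rotation, self-supervised,
        # so it can be trained on unlabeled target data as well.
        self.rot_head = nn.Linear(feat_dim, num_rotations)

    def forward(self, rgb, depth):
        f_rgb = self.rgb_net(rgb).flatten(1)
        f_depth = self.depth_net(depth).flatten(1)
        feats = torch.cat([f_rgb, f_depth], dim=1)
        return self.cls_head(feats), self.rot_head(feats)

def make_rotation_batch(rgb, depth):
    # rgb, depth: (B, C, H, W). Rotate each modality independently by a
    # random multiple of 90 degrees; the pretext label is the relative
    # rotation (difference of the two multiples, mod 4).
    k_rgb = torch.randint(0, 4, (rgb.size(0),))
    k_depth = torch.randint(0, 4, (depth.size(0),))
    rgb_rot = torch.stack([torch.rot90(x, int(k), dims=(1, 2))
                           for x, k in zip(rgb, k_rgb)])
    depth_rot = torch.stack([torch.rot90(x, int(k), dims=(1, 2))
                             for x, k in zip(depth, k_depth)])
    labels = (k_depth - k_rgb) % 4
    return rgb_rot, depth_rot, labels

# One illustrative training step: supervised classification loss on labeled
# source pairs plus the self-supervised rotation loss. Random tensors stand
# in for real data loaders; depth is assumed encoded as 3 channels.
model = RGBDNet(num_classes=10)
ce = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

src_rgb = torch.randn(4, 3, 224, 224)
src_depth = torch.randn(4, 3, 224, 224)
src_y = torch.randint(0, 10, (4,))

logits, _ = model(src_rgb, src_depth)
rgb_r, depth_r, rot_y = make_rotation_batch(src_rgb, src_depth)
_, rot_logits = model(rgb_r, depth_r)
loss = ce(logits, src_y) + ce(rot_logits, rot_y)
loss.backward()
opt.step()

Because the rotation labels are generated automatically from the data, the rotation loss can also be computed on unlabeled target images; in the paper's setting this is what lets the network receive a training signal from the real (target) domain and reduce the synthetic-to-real shift.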
