A Multisensory Learning Architecture for Rotation-invariant Object Recognition

Paper Title

A Multisensory Learning Architecture for Rotation-invariant Object Recognition

Authors

Kirtay, Murat, Schillaci, Guido, Hafner, Verena V.

Abstract

This study presents a multisensory machine learning architecture for object recognition, using a novel dataset constructed with the iCub robot, which is equipped with three cameras and a depth sensor. The proposed architecture combines convolutional neural networks, which form representations (i.e., features) of grayscale-converted color images, with a multi-layer perceptron that processes depth data. The aim is to learn joint representations of different modalities (e.g., color and depth) and use them to recognize objects. We evaluate the proposed architecture by benchmarking it against models trained separately on the input of each sensor and against a state-of-the-art data fusion technique, namely decision-level fusion. The results show that our architecture improves recognition accuracy compared with models that use input from a single modality and with the decision-level multimodal fusion method.
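The two fusion strategies contrasted in the abstract can be illustrated with a minimal sketch. The abstract does not specify how either one is implemented, so everything here is an assumption made for illustration: decision-level fusion is shown as averaging the class probabilities of separately trained models, a joint representation is shown as simple concatenation of per-modality feature vectors, and all function names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decision_level_fusion(per_modality_scores):
    """Assumed rule: average the class probabilities of independent models."""
    probs = [softmax(s) for s in per_modality_scores]
    n_classes = len(probs[0])
    fused = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    # Return the index of the class with the highest averaged probability.
    return max(range(n_classes), key=fused.__getitem__)


def joint_representation(color_features, depth_features):
    """Assumed rule: concatenate modality features for one shared classifier."""
    return list(color_features) + list(depth_features)


# Hypothetical scores for three object classes from two single-modality models.
color_scores = [2.0, 1.0, 0.1]
depth_scores = [0.2, 2.5, 0.3]
fused_label = decision_level_fusion([color_scores, depth_scores])

# Hypothetical feature vectors standing in for CNN and MLP outputs.
joint = joint_representation([0.4, 0.9], [0.1, 0.7, 0.3])
```

In the decision-level case each model commits to its own class scores before the modalities are combined, whereas a joint representation lets a single classifier see features from both modalities at once.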
