Paper Title
MANet: Multimodal Attention Network based Point-View fusion for 3D Shape Recognition
Paper Authors
Paper Abstract
3D shape recognition has attracted increasing attention as a task in 3D vision research. The proliferation of 3D data has encouraged a variety of deep learning methods built on it. Many deep learning models now operate on point-cloud data or multi-view data alone. However, in the era of big data, integrating data from two different modalities into a unified 3D shape descriptor is bound to improve recognition accuracy. This paper therefore proposes a fusion network based on a multimodal attention mechanism for 3D shape recognition. Considering the limitations of multi-view data, we introduce a soft attention scheme that uses the global point-cloud feature to filter the multi-view features, enabling an effective fusion of the two. More specifically, we obtain enhanced multi-view features by mining the contribution of each view image to the overall shape recognition, and then fuse the point-cloud feature with the enhanced multi-view features to obtain a more discriminative 3D shape descriptor. We have performed experiments on the ModelNet40 dataset, and the results verify the effectiveness of our method.
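Below is a minimal sketch of the point-view soft attention fusion described in the abstract, not the authors' released implementation. The module name, feature dimensions, hidden sizes, and the number of rendered views are illustrative assumptions; only the overall scheme (a global point-cloud feature weighting per-view features, followed by concatenation and classification) follows the text.

```python
# Hypothetical sketch of point-view soft attention fusion (not the authors' code).
import torch
import torch.nn as nn


class PointViewAttentionFusion(nn.Module):
    """Fuse a global point-cloud feature with per-view image features via soft attention."""

    def __init__(self, point_dim=1024, view_dim=1024, num_classes=40):
        super().__init__()
        # Scores each view's contribution, conditioned on the global point-cloud feature.
        self.attn_fc = nn.Sequential(
            nn.Linear(point_dim + view_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )
        # Classifies the fused (point + enhanced multi-view) descriptor.
        self.classifier = nn.Sequential(
            nn.Linear(point_dim + view_dim, 512),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes),
        )

    def forward(self, point_feat, view_feats):
        # point_feat:  (B, point_dim) global point-cloud descriptor
        # view_feats:  (B, V, view_dim) per-view image descriptors
        B, V, _ = view_feats.shape
        # Pair the point-cloud feature with every view feature and score each pair.
        point_expanded = point_feat.unsqueeze(1).expand(-1, V, -1)               # (B, V, point_dim)
        scores = self.attn_fc(torch.cat([point_expanded, view_feats], dim=-1))   # (B, V, 1)
        weights = torch.softmax(scores, dim=1)                                   # soft attention over views
        # Weighted sum yields the "enhanced" multi-view feature.
        enhanced_view_feat = (weights * view_feats).sum(dim=1)                   # (B, view_dim)
        # Concatenate the two modalities into a unified 3D shape descriptor.
        fused = torch.cat([point_feat, enhanced_view_feat], dim=-1)
        return self.classifier(fused)


if __name__ == "__main__":
    model = PointViewAttentionFusion()
    logits = model(torch.randn(2, 1024), torch.randn(2, 12, 1024))  # e.g. 12 rendered views per shape
    print(logits.shape)  # torch.Size([2, 40]) for the 40 ModelNet40 classes
```

The softmax over the view dimension implements the "soft attention" filtering: views that contribute little to recognizing the shape receive small weights, so the enhanced multi-view feature is dominated by the most informative views before fusion with the point-cloud feature.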