Paper Title

Two-Level Attention-based Fusion Learning for RGB-D Face Recognition

Paper Authors

Hardik Uppal, Alireza Sepas-Moghaddam, Michael Greenspan, Ali Etemad

Paper Abstract

With recent advances in RGB-D sensing technologies as well as improvements in machine learning and fusion techniques, RGB-D facial recognition has become an active area of research. A novel attention aware method is proposed to fuse two image modalities, RGB and depth, for enhanced RGB-D facial recognition. The proposed method first extracts features from both modalities using a convolutional feature extractor. These features are then fused using a two-layer attention mechanism. The first layer focuses on the fused feature maps generated by the feature extractor, exploiting the relationship between feature maps using LSTM recurrent learning. The second layer focuses on the spatial features of those maps using convolution. The training database is preprocessed and augmented through a set of geometric transformations, and the learning process is further aided using transfer learning from a pure 2D RGB image training process. Comparative evaluations demonstrate that the proposed method outperforms other state-of-the-art approaches, including both traditional and deep neural network-based methods, on the challenging CurtinFaces and IIIT-D RGB-D benchmark databases, achieving classification accuracies over 98.2% and 99.3% respectively. The proposed attention mechanism is also compared with other attention mechanisms, demonstrating more accurate results.
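The fusion pipeline described in the abstract (a shared convolutional feature extractor, a first attention layer that relates the fused feature maps via LSTM recurrence, and a second attention layer that weights spatial locations via convolution) can be illustrated with a minimal PyTorch sketch. The tensor shapes, layer sizes, pooling step, and the exact attention formulations below are assumptions for illustration only, not the authors' published implementation.

```python
# Minimal sketch of two-level attention fusion for RGB-D features.
# All module names, shapes, and hyper-parameters are illustrative assumptions.
import torch
import torch.nn as nn


class TwoLevelAttentionFusion(nn.Module):
    """Hypothetical sketch: LSTM-based feature-map attention followed by
    convolutional spatial attention over fused RGB and depth features."""

    def __init__(self, pooled: int = 4, hidden: int = 128):
        super().__init__()
        # Level 1: an LSTM scans the fused feature maps one channel at a time,
        # modelling relationships between maps; each hidden state is scored
        # into a per-channel attention weight (assumed formulation).
        self.pool = nn.AdaptiveAvgPool2d(pooled)
        self.lstm = nn.LSTM(input_size=pooled * pooled, hidden_size=hidden,
                            batch_first=True)
        self.channel_score = nn.Linear(hidden, 1)
        # Level 2: a convolution over per-location channel statistics yields
        # a spatial attention map (assumed 7x7 kernel).
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # rgb_feat, depth_feat: (B, C, H, W) maps from a shared CNN extractor.
        fused = torch.cat([rgb_feat, depth_feat], dim=1)            # (B, 2C, H, W)
        b, c, h, w = fused.shape

        # Level 1: feature-map (channel) attention via recurrent learning.
        seq = self.pool(fused).flatten(2)                           # (B, 2C, pooled^2)
        states, _ = self.lstm(seq)                                  # (B, 2C, hidden)
        ch_attn = torch.sigmoid(self.channel_score(states))         # (B, 2C, 1)
        fused = fused * ch_attn.view(b, c, 1, 1)

        # Level 2: spatial attention via convolution.
        avg_map = fused.mean(dim=1, keepdim=True)                   # (B, 1, H, W)
        max_map = fused.amax(dim=1, keepdim=True)                   # (B, 1, H, W)
        sp_attn = torch.sigmoid(
            self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))  # (B, 1, H, W)
        return fused * sp_attn                                      # attended fused features


# Example usage with dummy feature maps from the two modalities.
rgb = torch.randn(2, 128, 14, 14)
depth = torch.randn(2, 128, 14, 14)
out = TwoLevelAttentionFusion()(rgb, depth)
print(out.shape)  # torch.Size([2, 256, 14, 14])
```

The attended fused features would then feed a classification head for identity recognition; the channel-then-spatial ordering mirrors the abstract's description of attending first to the fused feature maps and then to their spatial content.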
