Deep Learning-based Single Image Face Depth Data Enhancement

Paper Title

Deep Learning-based Single Image Face Depth Data Enhancement

Authors

Torsten Schlett, Christian Rathgeb, Christoph Busch

Abstract

Face recognition can benefit from the utilization of depth data captured using low-cost cameras, in particular for presentation attack detection (PAD) purposes. Depth video output from these capture devices can, however, contain defects such as holes or general depth inaccuracies. This work proposes a deep learning face depth enhancement method in this context of facial biometrics, which adds a security aspect to the topic. U-Net-like architectures are utilized, and the networks are compared against hand-crafted enhancer types, as well as a similar depth enhancer network from related work trained for an adjacent application scenario. All tested enhancer types exclusively use depth data as input, which differs from methods that enhance depth based on additional input data such as visible light color images. Synthetic face depth ground truth images and degraded forms thereof are created with the help of PRNet, to train multiple deep learning enhancer models with different network sizes and training configurations. Evaluations are carried out on the synthetic data, on Kinect v1 images from the KinectFaceDB, and on in-house RealSense D435 images. These evaluations include an assessment of the falsification for occluded face depth input, which is relevant to biometric security. The proposed deep learning enhancers yield noticeably better results than the tested preexisting enhancers, without overly falsifying depth data when non-face input is provided, and are shown to reduce the error of a simple landmark-based PAD method.
