Paper Title
Unsupervised Single-shot Depth Estimation using Perceptual Reconstruction
Authors
Abstract
Real-time estimation of actual object depth is an essential module for various autonomous system tasks such as 3D reconstruction, scene understanding and condition assessment. During the last decade of machine learning, extensive deployment of deep learning methods to computer vision tasks has yielded approaches that succeed in achieving realistic depth synthesis out of a simple RGB modality. Most of these models are based on paired RGB-depth data and/or the availability of video sequences and stereo images. The lack of sequences, stereo data and RGB-depth pairs makes depth estimation a fully unsupervised single-image transfer problem that has barely been explored so far. This study builds on recent advances in the field of generative neural networks in order to establish fully unsupervised single-shot depth estimation. Two generators for RGB-to-depth and depth-to-RGB transfer are implemented and simultaneously optimized using the Wasserstein-1 distance, a novel perceptual reconstruction term and hand-crafted image filters. We comprehensively evaluate the models using industrial surface depth data as well as the Texas 3D Face Recognition Database, the CelebAMask-HQ database of human portraits and the SURREAL dataset that records body depth. For each evaluation dataset the proposed method shows a significant increase in depth accuracy compared to state-of-the-art single-image transfer methods.
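The training objective outlined in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes a cycle-style setup in which each generator's output is mapped back by the other generator and compared with the input in a feature space, and it omits the hand-crafted image filters entirely. All names (`critic_loss`, `perceptual_reconstruction_loss`, `total_generator_loss`, the weight `lam`) and the toy list-of-floats "images" are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]  # toy stand-in for an image (assumption)


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def critic_loss(critic: Callable[[Vector], float],
                real_batch: List[Vector],
                fake_batch: List[Vector]) -> float:
    """Wasserstein-1 critic objective to minimise: E[D(fake)] - E[D(real)]."""
    return (mean([critic(x) for x in fake_batch])
            - mean([critic(x) for x in real_batch]))


def generator_adv_loss(critic: Callable[[Vector], float],
                       fake_batch: List[Vector]) -> float:
    """Generator side of the Wasserstein-1 game: -E[D(fake)]."""
    return -mean([critic(x) for x in fake_batch])


def perceptual_reconstruction_loss(features: Callable[[Vector], Vector],
                                   originals: List[Vector],
                                   reconstructions: List[Vector]) -> float:
    """Mean L1 distance between features of inputs and their reconstructions."""
    total = 0.0
    for x, x_rec in zip(originals, reconstructions):
        fx, fr = features(x), features(x_rec)
        total += sum(abs(a - b) for a, b in zip(fx, fr)) / len(fx)
    return total / len(originals)


def total_generator_loss(rgb_batch: List[Vector],
                         depth_batch: List[Vector],
                         g_rgb2depth: Callable[[Vector], Vector],
                         g_depth2rgb: Callable[[Vector], Vector],
                         critic_depth: Callable[[Vector], float],
                         critic_rgb: Callable[[Vector], float],
                         features: Callable[[Vector], Vector],
                         lam: float = 10.0) -> float:
    """Joint loss for both generators (hypothetical weighting `lam`)."""
    fake_depth = [g_rgb2depth(x) for x in rgb_batch]
    fake_rgb = [g_depth2rgb(y) for y in depth_batch]
    # Assumed cycle: map each translated sample back to its source domain.
    rec_rgb = [g_depth2rgb(d) for d in fake_depth]
    rec_depth = [g_rgb2depth(r) for r in fake_rgb]
    adv = (generator_adv_loss(critic_depth, fake_depth)
           + generator_adv_loss(critic_rgb, fake_rgb))
    rec = (perceptual_reconstruction_loss(features, rgb_batch, rec_rgb)
           + perceptual_reconstruction_loss(features, depth_batch, rec_depth))
    return adv + lam * rec
```

In a real system the generators, critics and feature extractor would be neural networks trained by gradient descent, and the critics would need a Lipschitz constraint (for example a gradient penalty) for the Wasserstein-1 estimate to be meaningful. The sketch only shows how the adversarial and reconstruction terms might be combined.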