Paper title
Increased-Range Unsupervised Monocular Depth Estimation
Paper authors
Paper abstract
Unsupervised deep learning methods have shown promising performance for single-image depth estimation. Since most of these methods use binocular stereo pairs for self-supervision, the depth range is generally limited. Small-baseline stereo pairs provide a small depth range but handle occlusions well. In contrast, stereo images acquired with a wide-baseline rig cause occlusion-related errors in the near range but estimate depth well in the far range. In this work, we propose to integrate the advantages of the small and wide baselines. By training the network using three horizontally aligned views, we obtain accurate depth predictions for both close and far ranges. Our strategy allows multi-baseline depth to be inferred from a single image. This is unlike previous multi-baseline systems, which employ more than two cameras. The qualitative and quantitative results show the superior performance of the multi-baseline approach over previous stereo-based monocular methods. For a depth range of 0.1 to 80 meters, our approach decreases the absolute relative error of depth by 24% compared to Monodepth2. Our approach runs at 21 frames per second on a single Nvidia 1080 GPU, making it useful for practical applications.
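
The baseline trade-off and the error metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard pinhole stereo relation (depth = focal length × baseline / disparity) and the commonly used definition of absolute relative error; the abstract does not give the authors' actual formulation. The focal length, baselines, and disparity interval below are hypothetical values chosen only for illustration, and the function names are not taken from the paper.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo relation: depth = focal * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_range(focal_px, baseline_m, min_disp_px, max_disp_px):
    """Nearest and farthest depth reachable for a given disparity interval."""
    near = depth_from_disparity(max_disp_px, focal_px, baseline_m)
    far = depth_from_disparity(min_disp_px, focal_px, baseline_m)
    return near, far


def abs_rel_error(predicted, ground_truth):
    """Mean of |pred - gt| / gt over samples with positive ground truth."""
    pairs = [(p, g) for p, g in zip(predicted, ground_truth) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth samples")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Hypothetical camera parameters (assumptions, not values from the paper).
FOCAL_PX = 720.0
SMALL_BASELINE_M = 0.25
WIDE_BASELINE_M = 0.5
MIN_DISP_PX, MAX_DISP_PX = 1.0, 100.0

# With the same disparity interval, the wide baseline reaches farther,
# while the small baseline covers nearer depths.
small_near, small_far = depth_range(FOCAL_PX, SMALL_BASELINE_M, MIN_DISP_PX, MAX_DISP_PX)
wide_near, wide_far = depth_range(FOCAL_PX, WIDE_BASELINE_M, MIN_DISP_PX, MAX_DISP_PX)

# Toy predictions against toy ground truth (illustrative numbers only).
example_error = abs_rel_error([9.0, 22.0], [10.0, 20.0])
```

Under these assumed values, doubling the baseline doubles both the nearest and the farthest representable depth, which is one simple way to see why combining a small and a wide baseline can extend the usable range at both ends.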