Paper Title

SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion

Paper Authors

Danish Nazir, Marcus Liwicki, Didier Stricker, Muhammad Zeshan Afzal

Paper Abstract

Depth completion involves recovering a dense depth map from a sparse map and an RGB image. Recent approaches focus on utilizing color images as guidance images to recover depth at invalid pixels. However, color images alone are not enough to provide the necessary semantic understanding of the scene. Consequently, the depth completion task suffers from sudden illumination changes in RGB images (e.g., shadows). In this paper, we propose a novel three-branch backbone comprising color-guided, semantic-guided, and depth-guided branches. Specifically, the color-guided branch takes a sparse depth map and an RGB image as input and generates a color depth map, which includes color cues (e.g., object boundaries) of the scene. The predicted dense depth map of the color-guided branch, along with the semantic image and sparse depth map, is passed as input to the semantic-guided branch for estimating semantic depth. The depth-guided branch takes the sparse, color, and semantic depths to generate a dense depth map. The color depth, semantic depth, and guided depth are adaptively fused to produce the output of our proposed three-branch backbone. In addition, we propose a semantic-aware multi-modal attention-based fusion block (SAMMAFB) to fuse features between all three branches. We further use CSPN++ with Atrous convolutions to refine the dense depth map produced by our three-branch backbone. Extensive experiments show that our model achieves state-of-the-art performance on the KITTI depth completion benchmark at the time of submission.
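To make the described data flow concrete, below is a minimal PyTorch sketch of the three-branch backbone, written from the abstract alone and not the authors' implementation. The class ThreeBranchBackboneSketch, the branch_stub conv stacks, and the softmax confidence fusion are hypothetical simplifications; the SAMMAFB fusion blocks and the CSPN++ refinement with Atrous convolutions are omitted.

```python
# Illustrative sketch only: each guidance branch is reduced to a tiny conv stack,
# and "adaptive fusion" is approximated by confidence-weighted averaging.
import torch
import torch.nn as nn


def branch_stub(in_ch):
    """Tiny conv stack standing in for a full encoder-decoder guidance branch.
    Outputs 2 channels: a depth prediction and a confidence map."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(32, 2, kernel_size=3, padding=1),
    )


class ThreeBranchBackboneSketch(nn.Module):
    """Hypothetical sketch of the color-/semantic-/depth-guided data flow."""

    def __init__(self):
        super().__init__()
        self.color_branch = branch_stub(3 + 1)         # RGB + sparse depth
        self.semantic_branch = branch_stub(3 + 1 + 1)  # semantic + sparse + color depth
        self.depth_branch = branch_stub(1 + 1 + 1)     # sparse + color + semantic depth

    def forward(self, rgb, semantic, sparse_depth):
        # 1) Color-guided branch predicts a color depth map from RGB + sparse depth.
        color_depth, c_conf = self.color_branch(
            torch.cat([rgb, sparse_depth], dim=1)).chunk(2, dim=1)
        # 2) Semantic-guided branch takes the semantic image, sparse depth,
        #    and the predicted color depth to estimate a semantic depth map.
        semantic_depth, s_conf = self.semantic_branch(
            torch.cat([semantic, sparse_depth, color_depth], dim=1)).chunk(2, dim=1)
        # 3) Depth-guided branch consumes sparse, color, and semantic depths.
        guided_depth, d_conf = self.depth_branch(
            torch.cat([sparse_depth, color_depth, semantic_depth], dim=1)).chunk(2, dim=1)
        # Adaptive fusion: softmax over per-branch confidences weights the three depths.
        weights = torch.softmax(torch.cat([c_conf, s_conf, d_conf], dim=1), dim=1)
        fused = (weights[:, 0:1] * color_depth
                 + weights[:, 1:2] * semantic_depth
                 + weights[:, 2:3] * guided_depth)
        return fused


if __name__ == "__main__":
    model = ThreeBranchBackboneSketch()
    rgb = torch.rand(1, 3, 64, 64)
    semantic = torch.rand(1, 3, 64, 64)
    sparse = torch.rand(1, 1, 64, 64)
    print(model(rgb, semantic, sparse).shape)  # torch.Size([1, 1, 64, 64])
```

The confidence-weighted softmax here merely stands in for the paper's adaptive fusion of color, semantic, and guided depths; in the full model each branch is a deep encoder-decoder connected through SAMMAFB, and the fused output is further refined by CSPN++.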
