Paper Title


Cross-Modal Weighting Network for RGB-D Salient Object Detection

Paper Authors

Gongyang Li, Zhi Liu, Linwei Ye, Yang Wang, Haibin Ling

Paper Abstract


Depth maps contain geometric clues for assisting Salient Object Detection (SOD). In this paper, we propose a novel Cross-Modal Weighting (CMW) strategy to encourage comprehensive interactions between RGB and depth channels for RGB-D SOD. Specifically, three RGB-depth interaction modules, named CMW-L, CMW-M and CMW-H, are developed to deal with low-, middle-, and high-level cross-modal information fusion, respectively. These modules use Depth-to-RGB Weighting (DW) and RGB-to-RGB Weighting (RW) to allow rich cross-modal and cross-scale interactions among feature layers generated by different network blocks. To effectively train the proposed Cross-Modal Weighting Network (CMWNet), we design a composite loss function that sums the errors between intermediate predictions and ground truth over different scales. With all these novel components working together, CMWNet effectively fuses information from RGB and depth channels, and meanwhile explores object localization and details across scales. Thorough evaluations demonstrate that CMWNet consistently outperforms 15 state-of-the-art RGB-D SOD methods on seven popular benchmarks.
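
The abstract describes Depth-to-RGB Weighting (DW) as using depth features to re-weight RGB features before fusion. The following is a minimal PyTorch sketch of that idea, assuming a simple multiplicative gating form with a residual connection; the layer configuration, channel counts, and module name are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a Depth-to-RGB Weighting (DW) style module: depth features are
# turned into a gate in [0, 1] that modulates RGB features of the same scale.
# Hypothetical form for illustration only; not the authors' released code.
import torch
import torch.nn as nn


class DepthToRGBWeighting(nn.Module):
    """Gate RGB features with weights predicted from depth features (assumed DW form)."""

    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-pixel, per-channel weighting map from the depth branch.
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # Depth-derived weights modulate the RGB features (cross-modal weighting);
        # the residual term keeps the original RGB information intact.
        return rgb_feat * self.gate(depth_feat) + rgb_feat


if __name__ == "__main__":
    rgb = torch.randn(2, 64, 56, 56)    # RGB features from one backbone block
    depth = torch.randn(2, 64, 56, 56)  # depth features at the same scale
    fused = DepthToRGBWeighting(64)(rgb, depth)
    print(fused.shape)  # torch.Size([2, 64, 56, 56])
```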
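
The composite loss is described as summing errors between intermediate predictions at different scales and the ground truth. Below is a small sketch of such a multi-scale supervision loss; the choice of binary cross-entropy and equal per-scale weights is an assumption for illustration.

```python
# Sketch of a composite multi-scale loss: resize the ground-truth saliency map to each
# intermediate prediction's resolution and sum the per-scale BCE losses.
# Loss type and weighting are assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F


def composite_loss(predictions, ground_truth):
    """predictions: list of logit maps at different scales; ground_truth: (N, 1, H, W) in {0, 1}."""
    total = 0.0
    for pred in predictions:
        gt = F.interpolate(ground_truth, size=pred.shape[-2:], mode="nearest")
        total = total + F.binary_cross_entropy_with_logits(pred, gt)
    return total


if __name__ == "__main__":
    preds = [torch.randn(2, 1, s, s) for s in (28, 56, 112)]  # intermediate predictions
    gt = (torch.rand(2, 1, 112, 112) > 0.5).float()           # ground-truth saliency mask
    print(composite_loss(preds, gt))
```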
