Paper title
RGBT Salient Object Detection: A Large-scale Dataset and Benchmark
Authors
Abstract
Salient object detection in complex scenes and environments is a challenging research topic. Most existing work focuses on RGB-based salient object detection, whose performance in real-life applications is limited under adverse conditions such as dark environments and complex backgrounds. Exploiting both RGB and thermal infrared images has recently become a new research direction for detecting salient objects in complex scenes, as thermal infrared imaging provides complementary information and has been applied to many computer vision tasks. However, current research on RGBT salient object detection is limited by the lack of a large-scale dataset and a comprehensive benchmark. This work contributes such an RGBT image dataset, named VT5000, comprising 5000 spatially aligned RGBT image pairs with ground-truth annotations. VT5000 covers 11 challenges, collected across different scenes and environments, for exploring the robustness of algorithms. With this dataset, we propose a powerful baseline approach that extracts multi-level features within each modality and aggregates the features of all modalities with an attention mechanism for accurate RGBT salient object detection. Extensive experiments show that the proposed baseline approach outperforms state-of-the-art methods on the VT5000 dataset and two other public datasets. In addition, we carry out a comprehensive analysis of different RGBT salient object detection algorithms on the VT5000 dataset, draw several valuable conclusions, and suggest some potential research directions for RGBT salient object detection.