Paper Title

Semantic Segmentation of Underwater Imagery: Dataset and Benchmark

Authors

Md Jahidul Islam, Chelsey Edge, Yuyang Xiao, Peigen Luo, Muntaqim Mehtaz, Christopher Morse, Sadman Sakib Enan, Junaed Sattar

Abstract

In this paper, we present the first large-scale dataset for semantic Segmentation of Underwater IMagery (SUIM). It contains over 1500 images with pixel annotations for eight object categories: fish (vertebrates), reefs (invertebrates), aquatic plants, wrecks/ruins, human divers, robots, and sea-floor. The images have been rigorously collected during oceanic explorations and human-robot collaborative experiments, and annotated by human participants. We also present a benchmark evaluation of state-of-the-art semantic segmentation approaches based on standard performance metrics. In addition, we present SUIM-Net, a fully-convolutional encoder-decoder model that balances the trade-off between performance and computational efficiency. It offers competitive performance while ensuring fast end-to-end inference, which is essential for its use in the autonomy pipeline of visually-guided underwater robots. In particular, we demonstrate its usability benefits for visual servoing, saliency prediction, and detailed scene understanding. With a variety of use cases, the proposed model and benchmark dataset open up promising opportunities for future research in underwater robot vision.
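The benchmark evaluation mentioned in the abstract relies on standard semantic segmentation metrics such as per-class intersection-over-union (IoU) and its mean (mIoU). As a minimal illustrative sketch (the function name, mask shapes, and toy labels below are assumptions for illustration, not taken from the paper), per-class IoU over integer label masks can be computed as:

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Per-class intersection-over-union between two integer
    label masks of identical shape (illustrative helper)."""
    ious = []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        union = np.logical_or(p, g).sum()
        if union == 0:
            # Class absent from both masks: conventionally skipped
            ious.append(float("nan"))
            continue
        inter = np.logical_and(p, g).sum()
        ious.append(inter / union)
    return ious

# Toy 2x2 masks with two classes (hypothetical example data)
pred = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 0], [1, 1]])
ious = per_class_iou(pred, gt, num_classes=2)
mean_iou = np.nanmean(ious)  # mIoU averages IoU over present classes
```

In practice, benchmark suites accumulate intersections and unions over the whole test set before dividing, rather than averaging per-image IoUs, so that small objects are not over-weighted.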
