Paper Title

Unseen Object Instance Segmentation with Fully Test-time RGB-D Embeddings Adaptation

Paper Authors

Lu Zhang, Siqi Zhang, Xu Yang, Hong Qiao, Zhiyong Liu

Paper Abstract

Segmenting unseen objects is a crucial ability for robots, since they may encounter new environments during operation. Recently, a popular solution is to leverage RGB-D features learned from large-scale synthetic data and directly apply the model to unseen real-world scenarios. However, the domain shift caused by the sim2real gap is inevitable, posing a significant challenge to the segmentation model. In this paper, we emphasize the adaptation process across the sim2real domains and model it as a learning problem on the BatchNorm parameters of a simulation-trained model. Specifically, we propose a novel non-parametric entropy objective, which formulates the learning goal of test-time adaptation in an open-world manner. A cross-modality knowledge distillation objective is then designed to encourage test-time knowledge transfer for feature enhancement. Our approach can be implemented efficiently with only test images, without requiring annotations or revisiting the large-scale synthetic training data. Besides significant time savings, the proposed method consistently improves segmentation results on both overlap and boundary metrics, achieving state-of-the-art performance on unseen object instance segmentation.
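
To make the core idea concrete, below is a minimal sketch (not the authors' released code) of fully test-time adaptation restricted to BatchNorm parameters: all network weights are frozen, only the BN affine parameters (gamma/beta) are updated on unlabeled test batches, and the loss is a standard per-pixel softmax entropy used here as a stand-in for the paper's non-parametric entropy objective. `MySegNet` and the hyperparameters are hypothetical.

```python
# Sketch: test-time adaptation of BatchNorm parameters only, driven by
# entropy minimization on unlabeled test images. Assumptions are noted inline.
import torch
import torch.nn as nn

_BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)

def collect_bn_params(model: nn.Module):
    """Enable gradients only for BatchNorm affine parameters (gamma/beta)."""
    params = []
    for module in model.modules():
        if isinstance(module, _BN_TYPES):
            module.train()  # use test-batch statistics during adaptation
            for p in module.parameters(recurse=False):
                p.requires_grad_(True)
                params.append(p)
    return params

def entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    """Mean per-pixel Shannon entropy of the predicted class distribution.
    (A simplified stand-in for the paper's non-parametric entropy objective.)"""
    probs = logits.softmax(dim=1)
    return -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()

def adapt_on_batch(model, rgbd_batch, optimizer):
    logits = model(rgbd_batch)  # (B, C, H, W) segmentation logits
    loss = entropy_loss(logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage: freeze everything, then adapt only BN parameters on test batches.
# model = MySegNet(...)            # hypothetical simulation-trained model
# for p in model.parameters():
#     p.requires_grad_(False)
# bn_params = collect_bn_params(model)
# optimizer = torch.optim.Adam(bn_params, lr=1e-4)
# for rgbd_batch in test_loader:   # unlabeled test images only
#     adapt_on_batch(model, rgbd_batch, optimizer)
```

Because only the BN affine parameters receive gradients, the adaptation touches a tiny fraction of the weights and needs no labels or synthetic training data, which is what makes the procedure cheap at test time.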
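The cross-modality knowledge distillation objective could be sketched along the following lines; this is an illustrative feature-matching formulation under assumed names (`rgb_feat`, `depth_feat`, `lambda_kd`), not the paper's exact loss. The RGB branch's detached features act as the teacher for the depth branch, so knowledge transfers across modalities during test-time adaptation.

```python
# Sketch: a cross-modality distillation term via cosine feature matching.
import torch
import torch.nn.functional as F

def cross_modality_kd_loss(rgb_feat: torch.Tensor,
                           depth_feat: torch.Tensor) -> torch.Tensor:
    """Cosine-distance matching; the detached RGB features are the teacher."""
    teacher = F.normalize(rgb_feat.detach(), dim=1)  # stop-gradient teacher
    student = F.normalize(depth_feat, dim=1)
    return (1.0 - (teacher * student).sum(dim=1)).mean()

# The total test-time objective could then combine both terms, e.g.:
# loss = entropy_loss(logits) + lambda_kd * cross_modality_kd_loss(f_rgb, f_d)
```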
