Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

Paper Title

Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

Authors

Bansal, Nitin; Ji, Pan; Yuan, Junsong; Xu, Yi

Abstract

The multi-task learning (MTL) paradigm focuses on jointly learning two or more tasks, aiming for significant improvement with respect to a model's generalizability, performance, and training/inference memory footprint. These benefits become indispensable in the case of joint training for vision-related dense prediction tasks. In this work, we tackle the MTL problem of two dense tasks, namely semantic segmentation and depth estimation, and present a novel attention module called the Cross-Channel Attention Module (CCAM), which facilitates effective feature sharing along each channel between the two tasks, leading to mutual performance gain with a negligible increase in trainable parameters. In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth, called AffineMix, and a simple depth augmentation using predicted semantics, called ColorAug. Finally, we validate the performance gain of the proposed method on the Cityscapes and ScanNet datasets, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantic segmentation.
