Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation

Paper title

Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation

Authors

Lin-Zhuo Chen, Zheng Lin, Ziqin Wang, Yong-Liang Yang, Ming-Ming Cheng

Abstract

3D spatial information is known to benefit semantic segmentation. Most existing methods take 3D spatial data as an additional input, leading to a two-stream segmentation network that processes RGB and 3D spatial information separately. This design greatly increases inference time and severely limits its use in real-time applications. To solve this problem, we propose Spatial information guided Convolution (S-Conv), which integrates RGB features and 3D spatial information efficiently. S-Conv infers the sampling offsets of the convolution kernel under the guidance of 3D spatial information, helping the convolutional layer adjust its receptive field and adapt to geometric transformations. S-Conv also incorporates geometric information into feature learning by generating spatially adaptive convolutional weights. The ability to perceive geometry is greatly enhanced with little effect on the number of parameters or the computational cost. We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet), which achieves real-time inference and state-of-the-art performance on the NYUDv2 and SUNRGBD datasets.
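The two mechanisms named in the abstract (sampling offsets guided by spatial information, and spatially adaptive weights) can be illustrated with a minimal single-channel sketch in plain Python. This is not the authors' implementation: the names `s_conv_sketch`, `offset_fn`, and `weight_fn` are hypothetical, and the two callables stand in for the learned generators that a real network would train. A depth map is assumed as the 3D spatial input.

```python
import math

def bilinear(img, y, x):
    """Bilinearly sample a 2D list at fractional (y, x); outside the map reads as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < h and 0 <= xx < w:
                total += wy * wx * img[yy][xx]
    return total

def s_conv_sketch(feat, depth, kernel, offset_fn, weight_fn):
    """Single-channel 3x3 convolution guided by a spatial (depth) map.

    offset_fn(depth, y, x) -> nine (dy, dx) pairs; hypothetical stand-in
        for a learned offset generator.
    weight_fn(d_center, d_sample) -> scalar; hypothetical stand-in for a
        learned generator of spatially adaptive weights.
    """
    h, w = len(feat), len(feat[0])
    grid = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            offsets = offset_fn(depth, y, x)
            acc = 0.0
            for k, (ky, kx) in enumerate(grid):
                # Shift each kernel tap by its geometry-dependent offset.
                sy = y + ky + offsets[k][0]
                sx = x + kx + offsets[k][1]
                f = bilinear(feat, sy, sx)   # feature at the shifted tap
                d = bilinear(depth, sy, sx)  # geometry at the shifted tap
                # Fixed kernel weight modulated by a geometry-dependent weight.
                acc += kernel[k] * weight_fn(depth[y][x], d) * f
            out[y][x] = acc
    return out

def zero_offsets(depth, y, x):
    """No shift: the sampling grid stays a regular 3x3 window."""
    return [(0.0, 0.0)] * 9

def same_surface(d_center, d_sample):
    """Keep a tap only if it lies at roughly the same depth as the center."""
    return 1.0 if abs(d_center - d_sample) < 1.0 else 0.0

feat = [[1.0] * 3 for _ in range(3)]
depth = [[0.0, 0.0, 10.0] for _ in range(3)]  # right column is a far surface
result = s_conv_sketch(feat, depth, [1.0] * 9, zero_offsets, same_surface)
```

With zero offsets and a constant weight of 1, the sketch reduces to an ordinary zero-padded 3x3 convolution. In the usage above, the taps that fall on the far surface are suppressed, so the center output sums only the six taps at the same depth as the center pixel.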
