Paper Title


EPSNet: Efficient Panoptic Segmentation Network with Cross-layer Attention Fusion

Authors

Chia-Yuan Chang, Shuo-En Chang, Pei-Yung Hsiao, Li-Chen Fu

Abstract


Panoptic segmentation is a scene parsing task that unifies semantic segmentation and instance segmentation into a single task. However, current state-of-the-art studies have paid little attention to inference time. In this work, we propose an Efficient Panoptic Segmentation Network (EPSNet) to tackle the panoptic segmentation task with fast inference speed. EPSNet generates masks from a simple linear combination of prototype masks and mask coefficients: the light-weight network branches for instance segmentation and semantic segmentation only need to predict mask coefficients, and produce masks using the shared prototypes predicted by the prototype network branch. Furthermore, to enhance the quality of the shared prototypes, we adopt a cross-layer attention fusion module, which aggregates multi-scale features with an attention mechanism that helps them capture long-range dependencies between each other. To validate the proposed work, we have conducted various experiments on the challenging COCO panoptic dataset, achieving highly promising performance with significantly faster inference speed (53 ms on GPU).
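The prototype/coefficient mechanism described in the abstract (shared prototype masks weighted by per-instance mask coefficients, in the style of YOLACT) can be sketched as below. The tensor shapes, the `assemble_masks` helper name, and the sigmoid activation are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def assemble_masks(prototypes, coefficients):
    """Combine shared prototype masks with per-instance coefficients.

    prototypes:   (H, W, k) array holding k shared prototype masks.
    coefficients: (n, k) array, one k-dim coefficient vector per instance.
    Returns:      (n, H, W) array of instance masks with values in [0, 1].
    """
    # Linear combination over the prototype dimension k, then a sigmoid
    # to squash the resulting logits into mask probabilities.
    logits = np.einsum('hwk,nk->nhw', prototypes, coefficients)
    return 1.0 / (1.0 + np.exp(-logits))

# Tiny example: 2 prototypes on a 4x4 grid, 3 predicted instances.
rng = np.random.default_rng(0)
protos = rng.standard_normal((4, 4, 2))
coeffs = rng.standard_normal((3, 2))
masks = assemble_masks(protos, coeffs)
print(masks.shape)  # (3, 4, 4)
```

Because every branch only emits the small coefficient vectors, the expensive full-resolution mask tensor is computed once and shared, which is where the speed advantage comes from.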
