Paper Title
Robust Object Detection under Occlusion with Context-Aware CompositionalNets
Paper Authors
Paper Abstract
Detecting partially occluded objects is a difficult task. Our experimental results show that deep learning approaches, such as Faster R-CNN, are not robust at object detection under occlusion. Compositional convolutional neural networks (CompositionalNets) have been shown to be robust at classifying occluded objects by explicitly representing the object as a composition of parts. In this work, we propose to overcome two limitations of CompositionalNets which will enable them to detect partially occluded objects: 1) CompositionalNets, as well as other DCNN architectures, do not explicitly separate the representation of the context from the object itself. Under strong object occlusion, the influence of the context is amplified which can have severe negative effects for detection at test time. In order to overcome this, we propose to segment the context during training via bounding box annotations. We then use the segmentation to learn a context-aware CompositionalNet that disentangles the representation of the context and the object. 2) We extend the part-based voting scheme in CompositionalNets to vote for the corners of the object's bounding box, which enables the model to reliably estimate bounding boxes for partially occluded objects. Our extensive experiments show that our proposed model can detect objects robustly, increasing the detection performance of strongly occluded vehicles from PASCAL3D+ and MS-COCO by 41% and 35% respectively in absolute performance relative to Faster R-CNN.
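The abstract's second contribution extends part-based voting so that parts vote for the corners of the object's bounding box. Below is a minimal illustrative sketch of that general idea, not the authors' implementation: each part casts votes for one corner location on a feature-map grid via a learned spatial offset, and the corner estimate is taken as the vote-map maximum. All names and shapes (num_parts, part_scores, corner_offsets) are hypothetical placeholders.

```python
# Illustrative sketch only: accumulating part-based votes for a bounding box
# corner on a feature-map grid. Random arrays stand in for learned quantities.
import numpy as np

H, W, num_parts = 32, 48, 8                      # feature-map size, number of parts
part_scores = np.random.rand(H, W, num_parts)    # stand-in for per-part activation maps
corner_offsets = np.random.randint(-6, 7,        # stand-in for each part's expected
                                   size=(num_parts, 2))  # offset to the top-left corner

vote_map = np.zeros((H, W))                      # vote accumulator for one corner
for p in range(num_parts):
    dy, dx = corner_offsets[p]
    for y in range(H):
        for x in range(W):
            cy, cx = y + dy, x + dx              # location this part votes for
            if 0 <= cy < H and 0 <= cx < W:
                vote_map[cy, cx] += part_scores[y, x, p]

# The corner estimate is the location receiving the strongest accumulated vote.
corner_y, corner_x = np.unravel_index(vote_map.argmax(), vote_map.shape)
print("estimated corner (feature-grid coords):", corner_y, corner_x)
```

Because votes come from individual parts, occluding a subset of parts only removes some of the votes rather than corrupting a single holistic box regression, which is the intuition behind the robustness claim in the abstract.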