Multi-Class 3D Object Detection Within Volumetric 3D Computed Tomography Baggage Security Screening Imagery

Paper Title

Multi-Class 3D Object Detection Within Volumetric 3D Computed Tomography Baggage Security Screening Imagery

Authors

Qian Wang, Neelanjan Bhowmik, Toby P. Breckon

Abstract

Automatic detection of prohibited objects within passenger baggage is important for aviation security. X-ray Computed Tomography (CT) based 3D imaging is widely used in airports for aviation security screening, whilst prior work on automatic prohibited item detection focuses primarily on 2D X-ray imagery. These works have demonstrated the possibility of extending deep convolutional neural network (CNN) based automatic prohibited item detection from 2D X-ray imagery to volumetric 3D CT baggage security screening imagery. However, previous work on 3D object detection in baggage security screening imagery focused on detecting one specific type of object (e.g., either bottles or handguns). As a result, multiple models are needed if more than one type of prohibited item must be detected in practice. In this paper, we consider the detection of multiple object categories of interest using one unified framework. To this end, we formulate a more challenging multi-class 3D object detection problem within 3D CT imagery and propose a viable solution (3D RetinaNet) to tackle this problem. To enhance detection performance, we investigate a variety of strategies, including data augmentation and varying backbone networks. Experiments are carried out to provide both quantitative and qualitative evaluations of the proposed approach to multi-class 3D object detection within 3D CT baggage security screening imagery. Experimental results demonstrate that the combination of 3D RetinaNet and a series of favorable strategies can achieve a mean Average Precision (mAP) of 65.3% over five object classes (i.e., bottles, handguns, binoculars, glock frames, iPods). The overall performance is affected by the poor performance on glock frames and iPods, due to the lack of data and their resemblance to baggage clutter.
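
For readers unfamiliar with the metric named in the abstract, the sketch below illustrates two ingredients commonly used when scoring 3D detections: the overlap (intersection over union) between two axis-aligned 3D boxes, and mAP as the unweighted mean of per-class Average Precision values. The box layout (x1, y1, z1, x2, y2, z2), the function names, and the example numbers are assumptions made for illustration only. They are not taken from the paper, whose exact evaluation protocol is not described in the abstract.

```python
from typing import Dict, Sequence

# Hypothetical box layout: (x1, y1, z1, x2, y2, z2), axis-aligned, with x1 < x2 etc.
Box3D = Sequence[float]


def iou_3d(a: Box3D, b: Box3D) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])          # start of the overlap on this axis
        hi = min(a[axis + 3], b[axis + 3])  # end of the overlap on this axis
        if hi <= lo:
            return 0.0                      # no overlap on this axis, so no intersection
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def mean_average_precision(ap_per_class: Dict[str, float]) -> float:
    """mAP as the unweighted mean of per-class Average Precision values."""
    if not ap_per_class:
        raise ValueError("at least one class is required")
    return sum(ap_per_class.values()) / len(ap_per_class)


if __name__ == "__main__":
    # Two 2x2x2 cubes offset by one unit along every axis.
    print(iou_3d((0, 0, 0, 2, 2, 2), (1, 1, 1, 3, 3, 3)))
    # Placeholder AP values, not results from the paper.
    print(mean_average_precision({"class_a": 0.5, "class_b": 1.0}))
```

Because the mean is unweighted, a class with low AP lowers the overall figure as much as any other class, which is consistent with the abstract's remark that weak results on two classes affect overall performance.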
