Rotated Object Detection in Aerial Images

Paper Title

Rotated Object Detection via Scale-invariant Mahalanobis Distance in Aerial Images

Authors

Wen, Siyang; Guo, Wei; Liu, Yi; Wu, Ruijie

Abstract

Rotated object detection in aerial images is a meaningful yet challenging task, as objects are densely arranged and have arbitrary orientations. Eight-parameter methods in rotated object detection, which regress the coordinates of the four box vertices, usually use Ln-norm losses (L1 loss, L2 loss, and smooth L1 loss) as loss functions. Because Ln-norm losses are mainly based on the Minkowski distance, which is not scale-invariant, using them leads to inconsistency with the detection metric, rotational Intersection-over-Union (IoU), and to training instability. To address these problems, we use the Mahalanobis distance to calculate the loss between the predicted and target box vertex vectors, and propose a new loss function called Mahalanobis Distance Loss (MDL) for eight-parameter rotated object detection. As the Mahalanobis distance is scale-invariant, MDL is more consistent with the detection metric and more stable during training than Ln-norm losses. To alleviate the boundary discontinuity problem, as in all other eight-parameter methods, we further take the minimum loss value to make MDL continuous at boundary cases. We achieve state-of-the-art performance on DOTA-v1.0 with the proposed MDL. Furthermore, compared with the experiment that uses smooth L1 loss, we find that MDL performs better in rotated object detection.
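The scale-invariance argument in the abstract can be illustrated with a small sketch. The abstract does not say how the covariance matrix in MDL is built, so the code below assumes, purely for illustration, that it is the covariance of the target box's four vertices. The function names (`mahalanobis_vertex_loss`, `l1_vertex_loss`, `min_over_orderings`) are hypothetical and are not taken from the paper. The sketch shows that scaling both boxes by the same factor leaves the Mahalanobis-based loss unchanged while the L1 loss grows with the scale, and that taking the minimum over cyclic vertex orderings removes the dependence on which vertex is listed first.

```python
import math


def covariance_2d(points):
    """Population covariance (sxx, sxy, syy) of a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return sxx, sxy, syy


def mahalanobis_vertex_loss(pred, target):
    """Sum of Mahalanobis distances between matched vertices.

    Assumption (not from the paper): the covariance is estimated from the
    target box's own vertices. The target must be non-degenerate, i.e. its
    vertices must not be collinear, otherwise the determinant is zero.
    """
    sxx, sxy, syy = covariance_2d(target)
    det = sxx * syy - sxy * sxy
    total = 0.0
    for (px, py), (tx, ty) in zip(pred, target):
        dx, dy = px - tx, py - ty
        # d^T * inverse(covariance) * d, with the 2x2 inverse written out.
        q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        total += math.sqrt(max(q, 0.0))
    return total


def l1_vertex_loss(pred, target):
    """Plain L1 loss over the eight vertex coordinates."""
    return sum(abs(px - tx) + abs(py - ty)
               for (px, py), (tx, ty) in zip(pred, target))


def min_over_orderings(loss_fn, pred, target):
    """Minimum loss over all cyclic shifts of the predicted vertex order."""
    n = len(pred)
    return min(loss_fn(pred[k:] + pred[:k], target) for k in range(n))


def scale(points, s):
    """Scale every vertex by the factor s about the origin."""
    return [(s * x, s * y) for x, y in points]


if __name__ == "__main__":
    target = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
    pred = [(0.5, 0.2), (4.3, -0.1), (3.8, 2.2), (-0.2, 1.9)]
    for s in (1.0, 10.0):
        p, t = scale(pred, s), scale(target, s)
        print(s, l1_vertex_loss(p, t), mahalanobis_vertex_loss(p, t))
```

Under this assumption the covariance scales with the square of the box size, which cancels the scaling of the vertex offsets inside the distance. This is one simple way to obtain the scale invariance the abstract describes; the paper's actual formulation may differ.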
