Paper Title
Rotated Object Detection via Scale-invariant Mahalanobis Distance in Aerial Images
Paper Authors
Abstract
Rotated object detection in aerial images is a meaningful yet challenging task because objects are densely arranged and arbitrarily oriented. Eight-parameter methods (which regress the coordinates of the box vertices) usually use Ln-norm losses (L1 loss, L2 loss, and smooth L1 loss) as loss functions. Because Ln-norm losses are based mainly on the Minkowski distance, which is not scale-invariant, they are inconsistent with the detection metric, rotational Intersection-over-Union (IoU), and they make training unstable. To address these problems, we use the Mahalanobis distance to compute the loss between the predicted and target box vertex vectors, and propose a new loss function, Mahalanobis Distance Loss (MDL), for eight-parameter rotated object detection. Because the Mahalanobis distance is scale-invariant, MDL is more consistent with the detection metric and more stable during training than Ln-norm losses. To alleviate the boundary discontinuity problem shared by all eight-parameter methods, we further take the minimum loss value so that MDL is continuous at boundary cases. With the proposed MDL, we achieve state-of-the-art performance on DOTA-v1.0. Furthermore, in experiments comparing it against smooth L1 loss, MDL performs better in rotated object detection.
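The scale-invariance argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the covariance matrix is built or which candidates the minimum is taken over. The sketch assumes the covariance is estimated from the four target vertices and that the minimum runs over cyclic re-orderings of the vertices. The function names `covariance_2x2`, `mahalanobis`, `md_loss`, and `l1_loss` are hypothetical.

```python
import math

def covariance_2x2(points):
    """Population covariance (sxx, sxy, syy) of a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return sxx, sxy, syy

def mahalanobis(dx, dy, cov):
    """Mahalanobis length of the offset (dx, dy) under a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # assumed non-zero (non-degenerate box)
    # Quadratic form d^T * inverse(cov) * d, written out for the 2x2 case.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))

def md_loss(pred, target):
    """Sum of per-vertex Mahalanobis distances, minimised over cyclic
    vertex orderings (assumption: this is the 'minimum loss value')."""
    cov = covariance_2x2(target)  # assumption: covariance from the target box
    n = len(target)
    best = math.inf
    for shift in range(n):
        total = 0.0
        for i in range(n):
            tx, ty = target[(i + shift) % n]
            total += mahalanobis(pred[i][0] - tx, pred[i][1] - ty, cov)
        best = min(best, total)
    return best

def l1_loss(pred, target):
    """Plain L1 loss over the eight vertex coordinates, for comparison."""
    return sum(abs(px - tx) + abs(py - ty)
               for (px, py), (tx, ty) in zip(pred, target))

def scale(box, s):
    return [(s * x, s * y) for x, y in box]

target = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
pred = [(0.5, 0.0), (4.5, 0.2), (4.0, 2.3), (0.0, 2.0)]

# Same geometric error on a box ten times larger.
print("L1 :", l1_loss(pred, target), l1_loss(scale(pred, 10), scale(target, 10)))
print("MDL:", md_loss(pred, target), md_loss(scale(pred, 10), scale(target, 10)))
```

Under these assumptions, scaling both boxes by a factor s multiplies every coordinate offset by s and the covariance by s squared, so the Mahalanobis term is unchanged while the L1 loss grows by s. Taking the minimum over cyclic shifts also makes the value independent of which vertex the prediction lists first, which is one plausible reading of how the boundary discontinuity is removed.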