Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments

Paper Title

Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments

Authors

Ming Dai, Enhui Zheng, Zhenhua Feng, Jiedong Zhuang, Wankou Yang

Abstract

Unmanned Aerial Vehicles (UAVs) rely on satellite systems for stable positioning. However, due to limited satellite coverage or communication disruptions, UAVs may lose signals from satellite-based positioning systems. In such situations, vision-based techniques can serve as an alternative, ensuring the self-positioning capability of UAVs. However, most of the existing datasets are developed for the geo-localization tasks of the objects identified by UAVs, rather than the self-positioning task of UAVs. Furthermore, the current UAV datasets use discrete sampling on synthetic data, such as Google Maps, thereby neglecting the crucial aspects of dense sampling and the uncertainties commonly experienced in real-world scenarios. To address these issues, this paper presents a new dataset, DenseUAV, which is the first publicly available dataset designed for the UAV self-positioning task. DenseUAV adopts dense sampling on UAV images obtained in low-altitude urban settings. In total, over 27K UAV-view and satellite-view images of 14 university campuses are collected and annotated, establishing a new benchmark. In terms of model development, we first verify the superiority of Transformers over CNNs in this task. Then, we incorporate metric learning into representation learning to enhance the discriminative capacity of the model and to lessen the modality discrepancy. Besides, to facilitate joint learning from both perspectives, we propose a mutually supervised learning approach. Last, we enhance the Recall@K metric and introduce a new measurement, SDM@K, to evaluate the performance of a trained model from both the retrieval and localization perspectives simultaneously. As a result, the proposed baseline method achieves a remarkable Recall@1 score of 83.05% and an SDM@1 score of 86.24% on DenseUAV. The dataset and code will be made publicly available on https://github.com/Dmmm1997/DenseUAV.
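The Recall@K metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes each UAV-view query and each satellite-view gallery image is represented by a feature vector and a location label, and that ranking uses cosine similarity; the function names, the toy vectors, and the labels are all hypothetical and are not taken from the DenseUAV code. The paper's enhanced Recall@K and its SDM@K measure are not defined in the abstract, so they are not reproduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def recall_at_k(queries, gallery, k):
    """Fraction of queries whose top-k gallery matches include the true label.

    queries and gallery are lists of (feature_vector, label) pairs.
    This is the plain textbook Recall@K, not the enhanced variant
    proposed in the paper.
    """
    hits = 0
    for q_feat, q_label in queries:
        # Rank gallery items from most to least similar to the query.
        ranked = sorted(gallery, key=lambda item: cosine(q_feat, item[0]), reverse=True)
        if any(label == q_label for _, label in ranked[:k]):
            hits += 1
    return hits / len(queries)


# Hypothetical toy data: 2-D features, integer location labels.
gallery = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([-1.0, 0.0], 2)]
queries = [([0.9, 0.1], 0), ([0.6, 0.8], 0), ([0.1, 0.9], 1)]

r1 = recall_at_k(queries, gallery, k=1)
r2 = recall_at_k(queries, gallery, k=2)
print(f"Recall@1 = {r1:.4f}, Recall@2 = {r2:.4f}")
```

In this toy case the second query ranks the wrong location first and the correct one second, so it counts as a miss at K=1 and a hit at K=2.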
