City-Scale Visual Place Recognition with Deep Local Features Based on Multi-Scale Ordered VLAD Pooling

Paper Title

City-Scale Visual Place Recognition with Deep Local Features Based on Multi-Scale Ordered VLAD Pooling

Authors

Le, Duc Canh; Youn, Chan Hyun

Abstract

Visual place recognition is the task of recognizing the place depicted in an image based purely on its visual appearance, without metadata. The challenges lie not only in changes in lighting conditions, camera viewpoint, and scale, but also in the characteristics of scene-level images and the distinct features of the area. To resolve these challenges, one must consider both the local discriminativeness and the global semantic context of images. In addition, the diversity of datasets is particularly important for developing more general models and advancing the field. In this paper, we present a fully automated system for city-scale place recognition based on content-based image retrieval. Our main contributions to the community lie in three aspects. First, we conduct a comprehensive analysis of visual place recognition and sketch out the unique challenges of the task compared with general image retrieval tasks. Next, we propose a simple pooling approach on top of convolutional neural network activations that embeds spatial information into the image representation vector. Finally, we introduce new datasets for place recognition, which are particularly essential for application-based research. Furthermore, through extensive experiments, various issues in both image retrieval and place recognition are analyzed and discussed to give some insights into improving the performance of retrieval models in practice. The dataset used in this paper can be found at https://github.com/canhld94/Daejeon520
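
To make the idea of pooling CNN activations while keeping spatial information more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a feature map given as nested lists of shape H x W x D and a set of VLAD cluster centers that has already been learned (for example by k-means). It applies standard VLAD pooling inside each cell of a regular grid, concatenates the cells in a fixed order, and repeats this for several grid sizes. The function names (`vlad`, `grid_vlad`, `multi_scale_vlad`) and the grid sizes are hypothetical choices made for illustration only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def nearest_center(desc, centers):
    """Index of the cluster center closest to desc (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(desc, centers[k])),
    )


def vlad(descriptors, centers):
    """Standard VLAD: sum residuals to the nearest center, flatten, L2-normalize."""
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for desc in descriptors:
        k = nearest_center(desc, centers)
        for j in range(dim):
            residuals[k][j] += desc[j] - centers[k][j]
    return l2_normalize([x for row in residuals for x in row])


def grid_vlad(feature_map, centers, grid):
    """Pool VLAD per grid cell and concatenate cells in row-major order.

    Keeping a fixed cell order is what lets the final vector carry coarse
    spatial layout (an assumption of this sketch, not the paper's exact scheme).
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * height // grid, (gy + 1) * height // grid)
            cols = range(gx * width // grid, (gx + 1) * width // grid)
            cell = [feature_map[y][x] for y in rows for x in cols]
            pooled.extend(vlad(cell, centers))
    return pooled


def multi_scale_vlad(feature_map, centers, grids=(1, 2)):
    """Concatenate grid-pooled VLAD vectors over several grid sizes."""
    pooled = []
    for grid in grids:
        pooled.extend(grid_vlad(feature_map, centers, grid))
    return l2_normalize(pooled)
```

With `grids=(1,)` this reduces to ordinary VLAD, which ignores where a local feature occurs in the image. Finer grids make the descriptor sensitive to layout, at the cost of a longer vector: (sum of grid² over all grid sizes) x K x D values for K centers of dimension D.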
