GeoGraph: Learning graph-based multi-view object detection with geometric cues end-to-end

Paper title

GeoGraph: Learning graph-based multi-view object detection with geometric cues end-to-end

Authors

Ahmed Samy Nassar, Stefano D'Aronco, Sébastien Lefèvre, Jan D. Wegner

Abstract

In this paper we propose an end-to-end learnable approach that detects static urban objects from multiple views, re-identifies instances, and finally assigns a geographic position to each object. Given images and approximate camera poses as input, our method relies on a Graph Neural Network (GNN) to detect all objects and output their geographic positions. Our GNN simultaneously models relative pose and image evidence, and can handle an arbitrary number of input views. By jointly reasoning about visual image appearance and relative pose, our method is robust to occlusion, to similar appearance of neighboring objects, and to severe changes in viewpoint. Experimental evaluation on two challenging, large-scale datasets and comparison with state-of-the-art methods show significant and systematic improvements in both accuracy and efficiency, with a 2-6% gain in detection and re-ID average precision as well as an 8x reduction in training time.
