Paper Title

DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation Using Monocular Camera and Sparse LiDAR

Authors

Rishav, Ramy Battrawy, René Schuster, Oliver Wasenmüller, Didier Stricker

Abstract

Scene flow is the dense 3D reconstruction of the motion and geometry of a scene. Most state-of-the-art methods use a pair of stereo images as input for full scene reconstruction. These methods depend heavily on the quality of the RGB images and perform poorly in regions with reflective objects, shadows, poorly conditioned lighting, and so on. LiDAR measurements are much less sensitive to the aforementioned conditions, but LiDAR features are generally unsuitable for matching tasks due to their sparse nature. Hence, using both LiDAR and RGB can potentially overcome the individual disadvantages of each sensor by mutual improvement and yield robust features which can improve the matching process. In this paper, we present DeepLiDARFlow, a novel deep learning architecture that fuses high-level RGB and LiDAR features at multiple scales in a monocular setup to predict dense scene flow. Its performance is much better in the critical regions where image-only and LiDAR-only methods are inaccurate. We verify our DeepLiDARFlow using the established datasets KITTI and FlyingThings3D, and we show strong robustness compared to several state-of-the-art methods that use other input modalities. The code of our paper is available at https://github.com/dfki-av/DeepLiDARFlow.
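
To make the multi-scale fusion idea concrete, below is a minimal PyTorch sketch that fuses per-scale RGB and LiDAR feature maps by concatenation followed by a 1x1 convolution. This is an illustrative assumption, not the paper's actual fusion scheme: the class name MultiScaleFusion, the channel counts, and the concatenate-then-convolve design are all made up for the example; the authors' real architecture is defined in the linked repository.

```python
# A minimal, illustrative sketch of multi-scale RGB + LiDAR feature
# fusion. NOT the authors' implementation: the class name, channel
# counts, and concatenate-then-1x1-conv design are assumptions made
# for this example; the real architecture lives in the linked repo.
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    def __init__(self, channels=(32, 64, 96)):
        super().__init__()
        # One fusion block per pyramid level: concatenate the RGB and
        # LiDAR feature maps, then mix channels with a 1x1 convolution.
        self.fuse = nn.ModuleList(
            [nn.Conv2d(2 * c, c, kernel_size=1) for c in channels]
        )

    def forward(self, rgb_feats, lidar_feats):
        # rgb_feats / lidar_feats: lists of [B, C_l, H_l, W_l] tensors,
        # one per scale. The sparse LiDAR depth is assumed to have been
        # encoded into dense feature maps by an upstream network.
        return [
            torch.relu(conv(torch.cat([rgb, lidar], dim=1)))
            for conv, rgb, lidar in zip(self.fuse, rgb_feats, lidar_feats)
        ]

if __name__ == "__main__":
    fusion = MultiScaleFusion()
    sizes = [(32, 64), (64, 32), (96, 16)]  # (channels, spatial) per scale
    rgb = [torch.randn(1, c, s, s) for c, s in sizes]
    lidar = [torch.randn_like(r) for r in rgb]
    print([t.shape for t in fusion(rgb, lidar)])
```

Fusing at every pyramid scale (rather than only at the input or the final layer) lets the dense RGB features compensate for LiDAR sparsity coarsely and finely at once, which is the intuition the abstract describes; the exact per-scale operations used by DeepLiDARFlow should be taken from the repository.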
