Paper Title
Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection
Paper Authors
Paper Abstract
In recent years, deep learning-based approaches to visual-inertial odometry (VIO) have shown remarkable performance, outperforming traditional geometric methods. Yet all existing methods use both the visual and inertial measurements for every pose estimation, incurring potential computational redundancy. While visual data processing is much more expensive than that of the inertial measurement unit (IMU), it may not always contribute to improving pose estimation accuracy. In this paper, we propose an adaptive deep-learning-based VIO method that reduces computational redundancy by opportunistically disabling the visual modality. Specifically, we train a policy network that learns to deactivate the visual feature extractor on the fly based on the current motion state and IMU readings. The Gumbel-Softmax trick is adopted to make the decision process differentiable, so the policy network can be trained end to end with the rest of the system. The learned strategy is interpretable, and it shows scenario-dependent decision patterns for adaptive complexity reduction. Experimental results show that our method achieves similar or even better performance than the full-modality baseline, with up to 78.8% computational complexity reduction in the KITTI dataset evaluation. The code is available at https://github.com/mingyuyng/Visual-Selective-VIO.
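As a rough illustration of the Gumbel-Softmax gating idea described in the abstract, the following PyTorch sketch shows how a policy network might decide, at each time step, whether to run the expensive visual feature extractor. This is not the authors' implementation; the class name `PolicyNet`, the feature dimensions, and the inputs `imu_feat` and `motion_state` are illustrative assumptions.

```python
# A minimal sketch of Gumbel-Softmax-based visual modality selection,
# assuming hypothetical feature dimensions and module names.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyNet(nn.Module):
    """Decides per time step whether to run the visual feature extractor."""

    def __init__(self, imu_dim=256, state_dim=1024, hidden_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(imu_dim + state_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),  # logits: [use visual, skip visual]
        )

    def forward(self, imu_feat, motion_state, tau=1.0):
        logits = self.mlp(torch.cat([imu_feat, motion_state], dim=-1))
        if self.training:
            # Straight-through Gumbel-Softmax: a hard one-hot decision in
            # the forward pass, with gradients flowing through the soft
            # sample in the backward pass, keeping training end to end.
            decision = F.gumbel_softmax(logits, tau=tau, hard=True)
        else:
            # At inference, take the deterministic argmax decision.
            decision = F.one_hot(logits.argmax(dim=-1), 2).float()
        return decision  # decision[..., 0] == 1 -> process the image
```

In a full pipeline, the visual encoder would simply be skipped whenever the policy selects the "skip" action at inference time, which is where the computational savings come from; multiplying its output by the decision is only needed during training to keep the computation graph differentiable.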