Paper Title

Analysis & Computational Complexity Reduction of Monocular and Stereo Depth Estimation Techniques

Authors

Patwari, Rajeev; Ly, Varo

Abstract

Accurate depth estimation at minimal compute and energy cost is a crucial requirement for unmanned and battery-operated autonomous systems. Robotic applications require real-time depth estimation for navigation and decision making in rapidly changing 3D surroundings. A high-accuracy algorithm may provide the best depth estimate but can consume tremendous compute and energy resources. A common trade-off is to use a less accurate method for the initial depth estimate and a more accurate but compute-intensive method when needed. Previous work has shown that this trade-off can be improved with a state-of-the-art stereo depth estimation method (AnyNet). We studied both monocular and stereo vision depth estimation methods, which served as our baseline, and investigated ways to reduce their computational complexity. Our experiments show that reducing the size of the monocular depth estimation model by ~75% lowers accuracy by less than 2% (SSIM metric). Our experiments with the novel stereo vision method (AnyNet) show that depth estimation accuracy degrades by no more than 3% (three-pixel error metric) when the model size is reduced by ~20%. We have shown that smaller models can indeed perform competitively.
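
The three-pixel error metric named in the abstract is not defined there. As commonly used in stereo benchmarks, it is the fraction of pixels whose predicted disparity differs from the ground truth by more than three pixels. The sketch below is a minimal illustration under that assumption, not the authors' evaluation code. The function name `three_pixel_error`, the treatment of non-positive ground-truth values as missing, and the optional relative threshold `rel_thresh` (in the style of KITTI 2015 scoring) are all assumptions introduced here.

```python
from typing import Optional, Sequence


def three_pixel_error(
    pred: Sequence[float],
    truth: Sequence[float],
    abs_thresh: float = 3.0,
    rel_thresh: Optional[float] = None,
) -> float:
    """Return the fraction of valid pixels whose disparity error is too large.

    pred, truth: flattened disparity maps of equal length, in pixels.
    Pixels with truth <= 0 are treated as having no ground truth (assumption).
    rel_thresh: if set (e.g. 0.05), a pixel counts as wrong only when its
    error also exceeds rel_thresh * truth.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    valid = 0  # pixels that have ground truth
    bad = 0    # valid pixels that exceed the error threshold(s)
    for p, t in zip(pred, truth):
        if t <= 0:
            continue  # skip pixels without ground truth
        valid += 1
        err = abs(p - t)
        is_bad = err > abs_thresh
        if rel_thresh is not None:
            is_bad = is_bad and err > rel_thresh * t
        if is_bad:
            bad += 1

    if valid == 0:
        raise ValueError("no valid ground-truth pixels")
    return bad / valid


if __name__ == "__main__":
    # Toy disparities: the last pixel has no ground truth and is skipped.
    pred = [10.0, 20.0, 35.0, 50.0]
    truth = [10.5, 24.0, 34.0, 0.0]
    print(three_pixel_error(pred, truth))
```

Under this reading, a statement such as "accuracy degrades by no more than 3%" would be a comparison of this error rate between the original and the reduced model on the same test set.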
