Paper Title
Exploring the Impacts from Datasets to Monocular Depth Estimation (MDE) Models with MineNavi
Authors
Abstract
Current deep-learning-based computer vision tasks require large amounts of annotated data for model training and testing, especially dense estimation tasks such as optical flow, segmentation, and depth estimation. In practice, manual labeling for dense estimation tasks is very difficult or even impossible, and the scenes covered by existing datasets are often restricted to a narrow range, which severely limits the community's progress. To overcome this limitation, we propose a synthetic dataset generation method that yields an expandable dataset without burdensome manual labor. Using this method, we construct a dataset called MineNavi, which contains video footage from the first-person view of an aircraft, paired with accurate ground truth, for depth estimation in aircraft navigation applications. We also provide quantitative experiments showing that pre-training on the MineNavi dataset improves the performance of depth estimation models and speeds up their convergence on real-scene data. Because the synthetic dataset has an effect similar to that of real-world datasets when training deep models, we provide additional experiments with monocular depth estimation methods to demonstrate the impact of various factors in our dataset, such as lighting conditions and motion mode.