Title

Learning Object Depth from Camera Motion and Video Object Segmentation

Authors

Griffin, Brent A., Corso, Jason J.

Abstract

Video object segmentation, i.e., the separation of a target object from background in video, has made significant progress on real and challenging videos in recent years. To leverage this progress in 3D applications, this paper addresses the problem of learning to estimate the depth of segmented objects given some measurement of camera motion (e.g., from robot kinematics or vehicle odometry). We achieve this by, first, introducing a diverse, extensible dataset and, second, designing a novel deep network that estimates the depth of objects using only segmentation masks and uncalibrated camera movement. Our data-generation framework creates artificial object segmentations that are scaled for changes in distance between the camera and object, and our network learns to estimate object depth even with segmentation errors. We demonstrate our approach across domains using a robot camera to locate objects from the YCB dataset and a vehicle camera to locate obstacles while driving.
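The abstract's core idea, estimating object depth from how a segmentation mask's apparent size changes under known camera motion, follows from pinhole-camera geometry: apparent object scale is inversely proportional to depth. The sketch below is a minimal, hand-derived illustration of that relation for the simplified case of a camera translating straight back along its optical axis; it is not the paper's learned network (which handles uncalibrated motion and segmentation errors), and all function names are hypothetical.

```python
import numpy as np

def mask_scale(mask: np.ndarray) -> float:
    """Apparent object scale from a binary segmentation mask.

    Under a pinhole camera, sqrt(mask area) varies roughly as 1/depth.
    """
    return float(np.sqrt(mask.sum()))

def depth_from_motion(mask_near: np.ndarray, mask_far: np.ndarray,
                      camera_displacement: float) -> float:
    """Closed-form depth at the nearer view, assuming the camera moved
    `camera_displacement` metres straight back along the optical axis.

    From s_near * z_near = s_far * (z_near + camera_displacement),
    solving for z_near gives d * s_far / (s_near - s_far).
    """
    s_near = mask_scale(mask_near)
    s_far = mask_scale(mask_far)
    if s_near <= s_far:
        raise ValueError("Object should appear larger in the nearer view.")
    return camera_displacement * s_far / (s_near - s_far)

# Toy check: a 40x40-pixel object seen at 2 m shrinks to 20x20 pixels
# after the camera retreats 2 m (scale halves when depth doubles).
near = np.zeros((480, 640)); near[100:140, 100:140] = 1
far = np.zeros((480, 640)); far[100:120, 100:120] = 1
print(depth_from_motion(near, far, camera_displacement=2.0))  # ~2.0
```

This closed-form version breaks down exactly where the paper focuses: noisy masks, general (uncalibrated) camera motion, and segmentation errors, which is what motivates learning the depth estimate with a deep network instead.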
