Accidental Turntables: Learning 3D Pose by Watching Objects Turn

Paper Title

Accidental Turntables: Learning 3D Pose by Watching Objects Turn

Authors

Zezhou Cheng, Matheus Gadelha, Subhransu Maji

Abstract

We propose a technique for learning single-view 3D object pose estimation models from a new source of data: in-the-wild videos in which objects turn. Such videos are prevalent in practice (e.g., cars in roundabouts, airplanes near runways) and easy to collect. We show that classical structure-from-motion algorithms, coupled with recent advances in instance detection and feature matching, provide surprisingly accurate relative 3D pose estimates on such videos. We propose a multi-stage training scheme that first learns a canonical pose across a collection of videos and then supervises a model for single-view pose estimation. The proposed technique achieves performance competitive with the existing state of the art on standard benchmarks for 3D pose estimation, without requiring any pose labels during training. We also contribute the Accidental Turntables Dataset, a challenging set of 41,212 images of cars with cluttered backgrounds, motion blur, and illumination changes, which serves as a benchmark for 3D pose estimation.
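
To make the notion of relative 3D pose between two video frames concrete, the following is a minimal sketch, not the authors' implementation. It assumes each frame's object orientation is available as a 3x3 rotation matrix (for instance, from a structure-from-motion reconstruction) and shows how a relative rotation and the geodesic angle between two rotations can be computed. The function names and the choice of the y axis as the turning axis are illustrative assumptions.

```python
import numpy as np


def rotation_about_y(angle_deg):
    """Rotation matrix for a turn of angle_deg degrees about the y axis.

    The y axis is an assumed turning axis, chosen only for illustration.
    """
    t = np.deg2rad(angle_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def relative_rotation(R_i, R_j):
    """Rotation that maps the orientation in frame i to the one in frame j."""
    return R_j @ R_i.T


def geodesic_angle_deg(R_a, R_b):
    """Angle in degrees of the rotation separating R_a and R_b."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.rad2deg(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


if __name__ == "__main__":
    # Two hypothetical frames of a car turning on a roundabout.
    R_frame0 = rotation_about_y(10.0)
    R_frame1 = rotation_about_y(40.0)
    R_rel = relative_rotation(R_frame0, R_frame1)
    # The object turned by approximately 30 degrees between the two frames.
    print(geodesic_angle_deg(np.eye(3), R_rel))
```

The same geodesic angle is a common way to score a predicted rotation against a reference rotation when evaluating a pose estimator.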
