Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble

Paper Title

Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble

Authors

Yao, Chun-Han; Hung, Wei-Chih; Li, Yuanzhen; Rubinstein, Michael; Yang, Ming-Hsuan; Jampani, Varun

Abstract

Automatically estimating 3D skeleton, shape, camera viewpoints, and part articulation from sparse in-the-wild image ensembles is a severely under-constrained and challenging problem. Most prior methods rely on large-scale image datasets, dense temporal correspondence, or human annotations such as camera pose, 2D keypoints, and shape templates. We propose Hi-LASSIE, which performs 3D articulated reconstruction from only 20-30 online images in the wild without any user-defined shape or skeleton templates. We follow the recent work of LASSIE, which tackles a similar problem setting, and make two significant advances. First, instead of relying on a manually annotated 3D skeleton, we automatically estimate a class-specific skeleton from a selected reference image. Second, we improve the shape reconstructions with novel instance-specific optimization strategies that allow the reconstructions to faithfully fit each instance while preserving the class-specific priors learned across all images. Experiments on in-the-wild image ensembles show that Hi-LASSIE obtains higher-fidelity, state-of-the-art 3D reconstructions despite requiring minimal user input.
