Paper Title

The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose

Paper Authors

Yizhak Ben-Shabat, Xin Yu, Fatemeh Sadat Saleh, Dylan Campbell, Cristian Rodriguez-Opazo, Hongdong Li, Stephen Gould

Paper Abstract

The availability of a large labeled dataset is a key requirement for applying deep learning methods to solve various computer vision tasks. In the context of understanding human activities, existing public datasets, while large in size, are often limited to a single RGB camera and provide only per-frame or per-clip action annotations. To enable richer analysis and understanding of human activities, we introduce IKEA ASM -- a three million frame, multi-view, furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose. Additionally, we benchmark prominent methods for video action recognition, object segmentation and human pose estimation tasks on this challenging dataset. The dataset enables the development of holistic methods, which integrate multi-modal and multi-view data to better perform on these tasks.
