Paper Title
Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction
Authors
Abstract
Despite notable progress in action recognition, little work has addressed action recognition specifically for human-robot interaction. In this paper, we examine the characteristics of the action recognition task in interaction scenarios and propose an attention-oriented multi-level network framework that meets the demands of real-time interaction. Specifically, a Pre-Attention network first coarsely localizes the interactor in the scene at low resolution and then performs fine-grained pose estimation at high resolution. A second, compact CNN takes the extracted skeleton sequence as input for action recognition, using attention-like mechanisms to capture local spatial-temporal patterns and global semantic information effectively. To evaluate our approach, we construct a new action dataset designed for recognition in interaction scenarios. Experimental results on our dataset, together with high efficiency (112 fps on 640 x 480 RGBD input) on a mobile computing platform (Nvidia Jetson AGX Xavier), demonstrate that our method is well suited to action recognition in real-time human-robot interaction.
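
The coarse-to-fine step described in the abstract (locate the interactor at low resolution, then run pose estimation on the matching high-resolution region) can be illustrated with a small sketch. This is not the authors' implementation: the function name `scale_box`, the 160 x 120 low-resolution size, and the `margin` parameter are all assumptions made for illustration. Only the 640 x 480 input size comes from the abstract. The sketch shows just the coordinate mapping from a low-resolution bounding box to a full-resolution crop.

```python
from typing import Tuple

# (x0, y0, x1, y1) in pixels; x1 and y1 are exclusive.
Box = Tuple[int, int, int, int]


def scale_box(box_lo: Box, lo_size: Tuple[int, int], hi_size: Tuple[int, int],
              margin: float = 0.1) -> Box:
    """Map a box found on a low-resolution frame onto the full-resolution frame.

    Sizes are (width, height). `margin` (hypothetical parameter) pads the box
    by a fraction of its own width/height so limbs near the edge are kept.
    """
    lo_w, lo_h = lo_size
    hi_w, hi_h = hi_size
    sx, sy = hi_w / lo_w, hi_h / lo_h
    x0, y0, x1, y1 = box_lo
    pad_x = (x1 - x0) * margin
    pad_y = (y1 - y0) * margin
    # Pad in low-resolution coordinates, scale up, then clamp to the image.
    hx0 = max(0, int((x0 - pad_x) * sx))
    hy0 = max(0, int((y0 - pad_y) * sy))
    hx1 = min(hi_w, int(round((x1 + pad_x) * sx)))
    hy1 = min(hi_h, int(round((y1 + pad_y) * sy)))
    return hx0, hy0, hx1, hy1


# Assumed sizes: a 160 x 120 attention frame and the 640 x 480 input frame.
crop = scale_box((10, 20, 30, 60), (160, 120), (640, 480))
print(crop)
```

In a pipeline of this shape, only the returned region would be passed to the high-resolution pose estimator, which is one plausible way such a design saves computation on an embedded platform.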