Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition

Paper Title

Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition

Authors

Tu, Zhigang; Zhang, Jiaxu; Li, Hongyan; Chen, Yujin; Yuan, Junsong

Abstract

In recent years, graph convolutional networks (GCNs) have played an increasingly critical role in skeleton-based human action recognition. However, most GCN-based methods still have two main limitations: 1) They consider only the motion information of the joints, or process the joints and bones separately, and therefore cannot fully explore the latent functional correlation between joints and bones for action recognition. 2) Most of these works are trained in a supervised manner, which relies heavily on massive amounts of labeled training data. To address these issues, we propose a semi-supervised skeleton-based action recognition method, a setting that has rarely been explored before. We design a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder to achieve semi-supervised learning. Specifically, the CD-JBF-GCN can explore the motion transmission between the joint stream and the bone stream, thereby prompting both streams to learn more discriminative feature representations. The pose-prediction-based auto-encoder in the self-supervised training stage allows the network to learn motion representations from unlabeled data, which is essential for action recognition. Extensive experiments on two popular datasets, i.e., NTU-RGB+D and Kinetics-Skeleton, demonstrate that our model achieves state-of-the-art performance for semi-supervised skeleton-based action recognition and is also useful for fully-supervised methods.
