Paper Title
Skeleton Based Action Recognition using a Stacked Denoising Autoencoder with Constraints of Privileged Information
Paper Authors
Paper Abstract
Recently, with the availability of cost-effective depth cameras coupled with real-time skeleton estimation, interest in skeleton-based human action recognition has been renewed. Most existing skeletal representation approaches use either joint locations or a dynamics model. Differing from previous studies, we propose a new method, called Denoising Autoencoder with Temporal and Categorical Constraints (DAE_CTC), to study skeletal representation from the viewpoint of skeleton reconstruction. Based on the concept of learning under privileged information, we integrate action categories and temporal coordinates into a stacked denoising autoencoder in the training phase, so as to preserve categorical and temporal features while learning the hidden representation from a skeleton. We are thus able to improve the discriminative validity of the hidden representation. To mitigate the variation resulting from temporal misalignment, a new temporal registration method, called Locally-Warped Sequence Registration (LWSR), is proposed for registering the sequences of both inter- and intra-class actions. We finally represent the sequences using a Fourier Temporal Pyramid (FTP) representation and perform classification using a combination of LWSR registration, FTP representation, and a linear Support Vector Machine (SVM). Experimental results on three action data sets, namely MSR-Action3D, UTKinect-Action, and Florence3D-Action, show that our proposal performs better than many existing methods and comparably to the state of the art.
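The training objective sketched in the abstract — learning a hidden representation by reconstructing the skeleton while auxiliary constraints preserve the action category and temporal coordinate, both available only at training time as privileged information — can be illustrated roughly as follows. This is a minimal single-layer NumPy sketch, not the paper's implementation: all dimensions, the forms of the auxiliary heads, and the loss weights are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy dimensions (hypothetical): a frame's skeleton flattened to d
# coordinates, h hidden units, c action categories.
d, h, c = 60, 32, 10

# One denoising-autoencoder layer: corrupt the input, encode, decode.
W_enc = rng.normal(0, 0.1, (h, d)); b_enc = np.zeros(h)
W_dec = rng.normal(0, 0.1, (d, h)); b_dec = np.zeros(d)
# Auxiliary heads enforcing the privileged-information constraints:
# a category classifier and a temporal-coordinate regressor on the code.
W_cat = rng.normal(0, 0.1, (c, h)); b_cat = np.zeros(c)
w_tmp = rng.normal(0, 0.1, h);      b_tmp = 0.0

def losses(x, y_cat, t_coord, noise_std=0.1):
    """Return the three training loss terms for one skeleton frame.

    x       : (d,) clean skeleton vector
    y_cat   : int, action category (privileged, training phase only)
    t_coord : float in [0, 1], normalized temporal coordinate
    """
    x_noisy = x + rng.normal(0, noise_std, d)   # denoising corruption
    code = sigmoid(W_enc @ x_noisy + b_enc)     # hidden representation
    x_rec = W_dec @ code + b_dec                # reconstruction
    rec_loss = np.mean((x_rec - x) ** 2)

    logits = W_cat @ code + b_cat               # categorical constraint
    logp = logits - np.log(np.sum(np.exp(logits)))
    cat_loss = -logp[y_cat]                     # cross-entropy

    t_pred = float(w_tmp @ code + b_tmp)        # temporal constraint
    tmp_loss = (t_pred - t_coord) ** 2
    return rec_loss, cat_loss, tmp_loss

x = rng.normal(0, 1, d)
rec, cat, tmp = losses(x, y_cat=3, t_coord=0.5)
total = rec + 0.1 * cat + 0.1 * tmp  # weighting is an assumption
print(total)
```

At test time the privileged heads are discarded and only the encoder's hidden representation is used; stacking more such layers gives the "stacked" variant the abstract refers to.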