Effective Action Recognition with Embedded Key Point Shifts

Paper Title

Effective Action Recognition with Embedded Key Point Shifts

Authors

Haozhi Cao, Yuecong Xu, Jianfei Yang, Kezhi Mao, Jianxiong Yin, Simon See

Abstract

Temporal feature extraction is an essential technique in video-based action recognition. Key points have been utilized in skeleton-based action recognition methods but they require costly key point annotation. In this paper, we propose a novel temporal feature extraction module, named Key Point Shifts Embedding Module ($KPSEM$), to adaptively extract channel-wise key point shifts across video frames without key point annotation for temporal feature extraction. Key points are adaptively extracted as feature points with maximum feature values at split regions, while key point shifts are the spatial displacements of corresponding key points. The key point shifts are encoded as the overall temporal features via linear embedding layers in a multi-set manner. Our method achieves competitive performance through embedding key point shifts with trivial computational cost, achieving the state-of-the-art performance of 82.05% on Mini-Kinetics and competitive performance on UCF101, Something-Something-v1, and HMDB51 datasets.
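The extraction steps named in the abstract (key points as maximum feature values in split regions, shifts as spatial displacements, a linear embedding of the shifts) can be illustrated with a small sketch. This is not the authors' implementation: the function names `extract_key_points`, `key_point_shifts`, and `linear_embed`, the square grid split, the single-channel input, and the toy weights are all assumptions made for illustration, and the multi-set scheme is not covered because the abstract does not specify it.

```python
def extract_key_points(feature_map, splits):
    """Return one (row, col) key point per region of a splits x splits grid.

    feature_map is one channel given as a list of equal-length rows.
    Each key point is the location of the maximum value in its region.
    Assumption: height and width divide evenly by splits.
    """
    height, width = len(feature_map), len(feature_map[0])
    region_h, region_w = height // splits, width // splits
    key_points = []
    for gi in range(splits):
        for gj in range(splits):
            cells = [
                (r, c)
                for r in range(gi * region_h, (gi + 1) * region_h)
                for c in range(gj * region_w, (gj + 1) * region_w)
            ]
            key_points.append(max(cells, key=lambda rc: feature_map[rc[0]][rc[1]]))
    return key_points


def key_point_shifts(frame_a, frame_b, splits):
    """Displacement (d_row, d_col) of each region's key point between two frames."""
    points_a = extract_key_points(frame_a, splits)
    points_b = extract_key_points(frame_b, splits)
    return [(rb - ra, cb - ca) for (ra, ca), (rb, cb) in zip(points_a, points_b)]


def linear_embed(shifts, weights, bias):
    """Flatten the shifts and apply one linear layer: out = W x + b."""
    x = [float(v) for shift in shifts for v in shift]
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


# Toy single-channel feature maps for two consecutive frames (hypothetical data).
frame_t = [
    [9, 0, 0, 0],
    [0, 0, 0, 7],
    [0, 0, 0, 0],
    [5, 0, 0, 8],
]
frame_t1 = [
    [0, 0, 0, 7],
    [0, 9, 0, 0],
    [0, 5, 8, 0],
    [0, 0, 0, 0],
]

shifts = key_point_shifts(frame_t, frame_t1, splits=2)
print(shifts)  # [(1, 1), (-1, 0), (-1, 1), (-1, -1)]
```

No key point annotation is involved in this sketch: the key points come from the feature values themselves, which is the property the abstract emphasizes. In a real network the weights passed to `linear_embed` would be learned, and the procedure would run per channel.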

Paper Title