A Self-Supervised Gait Encoding Approach with Locality-Awareness for 3D Skeleton Based Person Re-Identification

Paper Title

A Self-Supervised Gait Encoding Approach with Locality-Awareness for 3D Skeleton Based Person Re-Identification

Authors

Rao, Haocong; Wang, Siqi; Hu, Xiping; Tan, Mingkui; Guo, Yi; Cheng, Jun; Liu, Xinwang; Hu, Bin

Abstract

Person re-identification (Re-ID) via gait features within 3D skeleton sequences is a newly emerging topic with several advantages. Existing solutions rely either on hand-crafted descriptors or on supervised gait representation learning. This paper proposes a self-supervised gait encoding approach that can leverage unlabeled skeleton data to learn gait representations for person Re-ID. Specifically, we first create self-supervision by learning to reconstruct unlabeled skeleton sequences in reverse order, which involves richer high-level semantics to obtain better gait representations. Other pretext tasks are also explored to further improve self-supervised learning. Second, inspired by the fact that the continuity of motion endows adjacent skeletons in one skeleton sequence, and temporally consecutive skeleton sequences, with higher correlations (referred to as locality in 3D skeleton data), we propose a locality-aware attention mechanism and a locality-aware contrastive learning scheme, which aim to preserve locality-awareness at the intra-sequence level and the inter-sequence level, respectively, during self-supervised learning. Last, with context vectors learned by our locality-aware attention mechanism and contrastive learning scheme, a novel feature named Contrastive Attention-based Gait Encodings (CAGEs) is designed to represent gait effectively. Empirical evaluations show that our approach significantly outperforms skeleton-based counterparts by 15-40% in Rank-1 accuracy, and it even achieves superior performance to numerous multi-modal methods that use extra RGB or depth information. Our code is available at https://github.com/Kali-Hac/Locality-Awareness-SGE.
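The two ideas named in the abstract, reverse reconstruction and locality-aware attention, can be illustrated with a toy sketch. This is not the authors' implementation, which is in the repository linked above. It is a minimal pure-Python illustration under stated assumptions: the function names `reverse_target`, `locality_prior`, and `locality_aware_attention` are hypothetical, and the Gaussian form of the locality prior and the `sigma` parameter are assumptions chosen for illustration, not details taken from the abstract.

```python
import math


def reverse_target(sequence):
    """Return the frames in reverse temporal order.

    In a reverse-reconstruction pretext task, a model that reads the
    sequence forward would be trained to output this reversed sequence.
    """
    return list(reversed(sequence))


def locality_prior(center, length, sigma=1.0):
    """Normalized Gaussian weights over frame indices, peaked at `center`.

    ASSUMPTION: the Gaussian shape is an illustrative choice only.
    """
    raw = [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2)) for i in range(length)]
    total = sum(raw)
    return [w / total for w in raw]


def locality_aware_attention(scores, center, sigma=1.0):
    """Softmax over alignment scores, reweighted toward frames near `center`."""
    peak = max(scores)  # subtract the max for numerical stability
    exp_scores = [math.exp(s - peak) for s in scores]
    prior = locality_prior(center, len(scores), sigma)
    weighted = [e * p for e, p in zip(exp_scores, prior)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Toy sequence: 5 frames, each holding one 3D joint (x, y, z).
frames = [(float(t), 0.0, 0.0) for t in range(5)]
target = reverse_target(frames)

# With uniform alignment scores, the attention weights reduce to the prior,
# so frames adjacent to the center receive more weight than distant ones.
weights = locality_aware_attention([0.0] * 5, center=2, sigma=1.0)
```

In this sketch the attention weights always sum to one, and a smaller `sigma` concentrates them more tightly around the chosen center frame.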
