Paper Title
Gimme Signals: Discriminative signal encoding for multimodal activity recognition
Paper Authors
Paper Abstract
We present a simple yet effective and flexible method for action recognition supporting multiple sensor modalities. Multivariate signal sequences are encoded in an image and then classified using the recently proposed EfficientNet CNN architecture. Our focus was to find an approach that generalizes well across different sensor modalities without modality-specific adaptations while still achieving good results. We apply our method to 4 action recognition datasets containing skeleton sequences, inertial and motion capture measurements, as well as Wi-Fi fingerprints, ranging up to 120 action classes. Our method defines the current best CNN-based approach on the NTU RGB+D 120 dataset, lifts the state of the art on the ARIL Wi-Fi dataset by +6.78%, improves the UTD-MHAD inertial baseline by +14.4% and the UTD-MHAD skeleton baseline by +1.13%, and achieves 96.11% on the Simitate motion capture data (80/20 split). We further demonstrate experiments on both modality fusion at the signal level and signal reduction to prevent the representation from overloading.
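The core idea is to turn a multivariate signal sequence into a fixed-size image that a standard image CNN can classify. The following is a minimal sketch of such an encoding, not the paper's exact representation: it assumes per-channel min-max normalization and nearest-neighbor resampling of a channels-by-timesteps array to the CNN's input resolution; the function name and parameters are illustrative.

```python
import numpy as np

def encode_signals_as_image(signals, height=224, width=224):
    """Encode a multivariate signal sequence (channels x timesteps) as a
    2-D image. Hypothetical sketch: normalize each channel to [0, 1] and
    resample the array to a fixed size with nearest-neighbor indexing."""
    signals = np.asarray(signals, dtype=np.float64)
    # Min-max normalize each channel independently to [0, 1].
    mins = signals.min(axis=1, keepdims=True)
    maxs = signals.max(axis=1, keepdims=True)
    norm = (signals - mins) / np.where(maxs > mins, maxs - mins, 1.0)
    # Nearest-neighbor resample rows (channels) and columns (time)
    # to the target image size expected by the CNN.
    rows = np.linspace(0, norm.shape[0] - 1, height).round().astype(int)
    cols = np.linspace(0, norm.shape[1] - 1, width).round().astype(int)
    return norm[rows][:, cols]

# Example: 6 inertial channels, 150 timesteps -> 224x224 image
img = encode_signals_as_image(np.random.randn(6, 150))
print(img.shape)  # (224, 224)
```

Because the encoding yields an ordinary 2-D array in [0, 1], it can be fed (replicated to three channels if needed) into any off-the-shelf image classifier such as EfficientNet, which is what makes the approach modality-agnostic.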