论文标题

ColloSSL:用于人类活动识别的协作自监督学习

ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition

论文作者

Yash Jain, Chi Ian Tang, Chulhong Min, Fahim Kawsar, Akhil Mathur

论文摘要

训练鲁棒的人类活动识别(HAR)模型的一个主要瓶颈是需要大规模带标签的传感器数据集。由于标注大量传感器数据代价高昂,无监督和半监督学习技术应运而生,它们无需任何标签即可从数据中学习良好的特征。在本文中,我们延续这一研究方向,提出了一种名为协作自监督学习(ColloSSL)的新技术,该技术利用用户佩戴的多个设备采集的未标注数据来学习数据的高质量特征。ColloSSL 设计背后的一个关键洞见是:多个设备同时采集的未标注传感器数据可以被视为彼此的自然变换,并可用于为表示学习生成监督信号。我们提出了三项技术创新,将传统的自监督学习算法扩展到多设备场景:一种设备选择(Device Selection)方法,用于选择正设备和负设备以实现对比学习;一种对比采样(Contrastive Sampling)算法,用于在多设备场景中采样正样本和负样本;以及一种称为多视图对比损失(Multi-view Contrastive Loss)的损失函数,将标准对比损失扩展到多设备场景。我们在三个多设备数据集上的实验结果表明,在大多数实验设置下,ColloSSL 的表现都优于全监督和半监督学习技术,与表现最佳的基线相比,F_1 分数的绝对提升最高达 7.9%。我们还表明,在低数据量场景下,ColloSSL 优于全监督方法,在最佳情况下仅需使用十分之一的可用标注数据。

A major bottleneck in training robust Human Activity Recognition (HAR) models is the need for large-scale labeled sensor datasets. Because labeling large amounts of sensor data is an expensive task, unsupervised and semi-supervised learning techniques have emerged that can learn good features from the data without requiring any labels. In this paper, we extend this line of research and present a novel technique called Collaborative Self-Supervised Learning (ColloSSL) which leverages unlabeled data collected from multiple devices worn by a user to learn high-quality features of the data. A key insight that underpins the design of ColloSSL is that unlabeled sensor datasets simultaneously captured by multiple devices can be viewed as natural transformations of each other, and leveraged to generate a supervisory signal for representation learning. We present three technical innovations to extend conventional self-supervised learning algorithms to a multi-device setting: a Device Selection approach which selects positive and negative devices to enable contrastive learning, a Contrastive Sampling algorithm which samples positive and negative examples in a multi-device setting, and a loss function called Multi-view Contrastive Loss which extends standard contrastive loss to a multi-device setting. Our experimental results on three multi-device datasets show that ColloSSL outperforms both fully-supervised and semi-supervised learning techniques in the majority of experimental settings, resulting in an absolute increase of up to 7.9% in F_1 score compared to the best-performing baselines. We also show that ColloSSL outperforms fully-supervised methods in a low-data regime, using just one-tenth of the available labeled data in the best case.
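The core idea described above — treating embeddings from positive devices as positives and those from negative devices as negatives for a contrastive objective — can be sketched as follows. The abstract does not give the exact formula for the Multi-view Contrastive Loss, so this is a minimal illustration following the common NT-Xent template; the function names, the temperature value, and the averaging over positives are illustrative assumptions, not the paper's definitive implementation.

```python
# Hedged sketch of a multi-device contrastive objective in the spirit of
# ColloSSL's Multi-view Contrastive Loss (details assumed, not from the paper).
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def multi_device_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """NT-Xent-style loss for one anchor embedding (from the target device).

    `positives`: embeddings from time-aligned samples of positive devices.
    `negatives`: embeddings from samples of negative devices.
    The anchor is pulled toward positives and pushed away from negatives.
    """
    pos = [np.exp(cosine_sim(anchor, p) / temperature) for p in positives]
    neg = [np.exp(cosine_sim(anchor, n) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    # Average the per-positive log-ratios (one common multi-positive variant).
    return -sum(np.log(p / denom) for p in pos) / len(pos)
```

As a sanity check, the loss should be smaller when the anchor is aligned with the positive-device embeddings than when it is aligned with the negative-device ones, which is what drives the representation toward device-invariant features.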
