Paper Title


Combining Deep Transfer Learning with Signal-image Encoding for Multi-Modal Mental Wellbeing Classification

Authors

Kieran Woodward, Eiman Kanjo, Athanasios Tsanas

Abstract


The quantification of emotional states is an important step to understanding wellbeing. Time series data from multiple modalities such as physiological and motion sensor data have proven to be integral for measuring and quantifying emotions. Monitoring emotional trajectories over long periods of time suffers from some critical limitations in relation to the size of the training data. This shortcoming may hinder the development of reliable and accurate machine learning models. To address this problem, this paper proposes a framework to tackle the limitations of performing emotional state recognition on multiple multimodal datasets: 1) encoding multivariate time series data into coloured images; 2) leveraging pre-trained object recognition models to apply a Transfer Learning (TL) approach using the images from step 1; 3) utilising a 1D Convolutional Neural Network (CNN) to perform emotion classification from physiological data; 4) concatenating the pre-trained TL model with the 1D CNN. Furthermore, the possibility of performing TL to infer stress from physiological data is explored by initially training a 1D CNN using a large physical activity dataset and then applying the learned knowledge to the target dataset. We demonstrate that model performance when inferring real-world wellbeing rated on a 5-point Likert scale can be enhanced using our framework, resulting in up to 98.5% accuracy, outperforming a conventional CNN by 4.5%. Subject-independent models using the same approach resulted in an average of 72.3% accuracy (SD 0.038). The proposed CNN-TL-based methodology may overcome problems with small training datasets, thus improving on the performance of conventional deep learning methods.
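To make step 1 of the framework concrete, the sketch below encodes a multivariate physiological time series into a coloured image suitable for a pre-trained object recognition model. This is a minimal illustration, not the authors' exact method: it assumes three signal channels, maps each normalised channel to one RGB colour plane, and resamples the sequence into a square image (the function name, image size, and resampling choice are all hypothetical).

```python
import numpy as np

def encode_signals_to_image(signals, size=224):
    """Illustrative signal-to-image encoding (not the paper's exact scheme).

    signals : array of shape (n_samples, 3), one physiological channel
              per RGB colour plane.
    Returns an RGB image of shape (size, size, 3), dtype uint8.
    """
    signals = np.asarray(signals, dtype=np.float64)
    n, c = signals.shape
    if c != 3:
        raise ValueError("expected exactly 3 channels for RGB encoding")
    # Normalise each channel independently to the 0-255 pixel range.
    mins = signals.min(axis=0)
    rng = np.ptp(signals, axis=0)
    rng[rng == 0] = 1.0  # guard against constant channels
    norm = (signals - mins) / rng * 255.0
    # Resample each channel to size*size points and lay it out row-wise.
    idx = np.linspace(0, n - 1, size * size)
    img = np.empty((size, size, 3), dtype=np.uint8)
    for ch in range(3):
        resampled = np.interp(idx, np.arange(n), norm[:, ch])
        img[..., ch] = resampled.reshape(size, size)
    return img
```

An image produced this way can be fed to a pre-trained object recognition backbone (step 2), whose features are then concatenated with those of a 1D CNN trained on the raw signals (steps 3 and 4).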
