Paper Title

Learning from Label Relationships in Human Affect

Authors

Niki Maria Foteinopoulou, Ioannis Patras

Abstract

Human affect and mental state estimation in an automated manner faces a number of difficulties, including learning from labels with poor or no temporal resolution, learning from few datasets with little data (often due to confidentiality constraints), and (very) long, in-the-wild videos. For these reasons, deep learning methodologies tend to overfit, that is, arrive at latent representations with poor generalisation performance on the final regression task. To overcome this, in this work, we introduce two complementary contributions. First, we introduce a novel relational loss for multilabel regression and ordinal problems that regularises learning and leads to better generalisation. The proposed loss uses label vector inter-relational information to learn better latent representations by aligning batch label distances to the distances in the latent feature space. Second, we utilise a two-stage attention architecture that estimates a target for each clip by using features from the neighbouring clips as temporal context. We evaluate the proposed methodology on both continuous affect and schizophrenia severity estimation problems, as there are methodological and contextual parallels between the two. Experimental results demonstrate that the proposed methodology outperforms all baselines. In the domain of schizophrenia, the proposed methodology outperforms the previous state-of-the-art by a large margin, achieving a PCC of up to 78%, performance close to that of human experts (85%) and much higher than previous works (an uplift of up to 40%). In the case of affect recognition, we outperform previous vision-based methods in terms of CCC on both the OMG and the AMIGOS datasets. Specifically for AMIGOS, we outperform the previous SoTA CCC for both arousal and valence by 9% and 13% respectively, and on the OMG dataset we outperform previous vision works by up to 5% for both arousal and valence.
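The core idea of the relational loss, aligning pairwise label distances in a batch with pairwise distances in the latent feature space, can be sketched as follows. This is a minimal illustration assuming Euclidean distances and a mean-squared disagreement penalty; the function and normalisation choices are hypothetical and not the authors' exact formulation.

```python
import torch

def relational_loss(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Hedged sketch: penalise mismatch between the pairwise distance
    structure of batch label vectors and that of latent features.
    features: (B, D) latent representations; labels: (B, K) label vectors.
    """
    # Pairwise Euclidean distances between label vectors in the batch
    label_d = torch.cdist(labels, labels)      # (B, B)
    # Pairwise Euclidean distances between latent feature vectors
    feat_d = torch.cdist(features, features)   # (B, B)
    # Normalise each matrix so the two distance scales are comparable
    # (one simple choice among several plausible ones)
    label_d = label_d / (label_d.max() + 1e-8)
    feat_d = feat_d / (feat_d.max() + 1e-8)
    # Mean-squared disagreement between the two distance structures
    return torch.mean((label_d - feat_d) ** 2)
```

In training, such a term would typically be added to the main regression loss as a regulariser, weighted by a hyperparameter.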
