Paper title
An Exploration of Active Learning for Affective Digital Phenotyping
Paper authors
Paper abstract
Some of the most severe bottlenecks preventing widespread development of machine learning models for human behavior include a dearth of labeled training data and the difficulty of acquiring high-quality labels. Active learning is a paradigm in which algorithms computationally select a useful subset of data points to label, using metrics for model uncertainty and data similarity. We explore active learning for naturalistic computer vision emotion data, a particularly heterogeneous and complex data space due to inherently subjective labels. Using frames collected from gameplay of a therapeutic smartphone game for children with autism, we run a simulation of active learning that uses gameplay prompts as metadata to aid the active learning process. We find that active learning using information generated during gameplay slightly outperforms random selection of the same number of labeled frames. We next investigate a method for conducting active learning with subjective data, as in affective computing, where multiple crowdsourced labels can be acquired for each image. Using the Child Affective Facial Expression (CAFE) dataset, we simulate an active learning process for crowdsourcing many labels and find that prioritizing frames using the entropy of the crowdsourced label distribution results in lower categorical cross-entropy loss than random frame selection. Collectively, these results demonstrate pilot evaluations of two novel active learning approaches for subjective affective data collected in noisy settings.
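The entropy-based prioritization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy frame identifiers, and the emotion labels are hypothetical, and the abstract does not state the ranking direction, so the sketch assumes that frames with higher label entropy (more annotator disagreement) are selected first.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of crowd labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_frames_by_entropy(crowd_labels, descending=True):
    """Order frame ids by the entropy of their crowdsourced labels.

    Assumption: descending=True puts the most ambiguous frames first;
    the abstract does not specify which direction was used.
    """
    return sorted(
        crowd_labels,
        key=lambda frame_id: label_entropy(crowd_labels[frame_id]),
        reverse=descending,
    )


# Hypothetical toy data: four crowd labels per frame.
crowd_labels = {
    "frame_a": ["happy", "happy", "happy", "happy"],  # full agreement
    "frame_b": ["happy", "sad", "happy", "sad"],      # two-way split
    "frame_c": ["happy", "sad", "angry", "fear"],     # full disagreement
}

ranking = rank_frames_by_entropy(crowd_labels)
print(ranking)
```

In a simulated active learning loop, the top-ranked frames from such an ordering would be added to the training set at each round, and the resulting model would be compared against one trained on the same number of randomly selected frames.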