Paper Title

A Multitask Deep Learning Approach for User Depression Detection on Sina Weibo

Authors

Wang, Yiding; Wang, Zhenyi; Li, Chenghao; Zhang, Yilin; Wang, Haizhou

Abstract

In recent years, the number of people whose lives are endangered by the mental burden of depression has been increasing rapidly. Online social networks (OSNs) give researchers another perspective for detecting individuals suffering from depression. However, existing machine-learning studies of depression detection still achieve relatively low classification performance, which suggests significant room for improvement in their feature engineering. In this paper, we manually build a large dataset on Sina Weibo (a leading OSN with the largest number of active users in the Chinese community), namely the Weibo User Depression Detection Dataset (WU3D). It includes more than 20,000 normal users and more than 10,000 depressed users, all of whom are manually labeled and rechecked by professionals. By analyzing users' text, social behavior, and posted pictures, we derive and propose ten statistical features. In addition, text-based word features are extracted using the popular pretrained model XLNet. Moreover, a novel deep neural network classification model, FusionNet (FN), is proposed and trained simultaneously on the above features, which are treated as multiple classification tasks. The experimental results show that FusionNet achieves the highest F1-Score of 0.9772 on the test dataset. Compared to existing studies, our proposed method has better classification performance and is more robust to unbalanced training samples. Our work also provides a new way to detect depression on other OSN platforms.
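The abstract reports its headline result as an F1-Score on a dataset with roughly twice as many normal users as depressed users. As a reminder of what that metric measures, the following minimal Python sketch computes F1 for the positive (depressed) class from confusion-matrix counts. The function name `f1_score` and the example counts are hypothetical illustrations; they are not taken from the paper and do not reproduce its evaluation code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class, from true positives, false positives, false negatives."""
    # Precision: share of predicted positives that are actually positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
example = f1_score(tp=90, fp=10, fn=10)
print(f"F1 = {example:.4f}")
```

Because F1 ignores true negatives, it is less easily inflated by the majority class than plain accuracy, which is one reason it is a common choice for unbalanced data.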
