Title

Human Abnormality Detection Based on Bengali Text

Authors

Mridha, M. F.; Rahman, Md. Saifur; Ohi, Abu Quwsar

Abstract

In natural language processing and human-computer interaction, human attitudes and sentiments have attracted researchers' attention. However, within human-computer interaction, human abnormality detection has not been investigated extensively, and most existing work depends on image-based information. In natural language processing, effective meaning can potentially be conveyed by all words. Each word may bring out difficult encounters because of its semantic connection with ideas or categories. This paper introduces an efficient and effective human abnormality detection model that uses only Bengali text. The proposed model can recognize whether a person is in a normal or an abnormal state by analyzing their typed Bengali text. To the best of our knowledge, this is the first attempt at developing a text-based human abnormality detection system. We created our own Bengali dataset, containing 2,000 sentences generated from voluntary conversations. We performed a comparative analysis using Naive Bayes and Support Vector Machine classifiers. Two feature extraction techniques, count vectors and TF-IDF, were used in the experiments on our constructed dataset. In our experiments, we achieved a maximum accuracy of 89% and a maximum F1-score of 92% on this dataset.
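To make the pipeline described in the abstract more concrete, the following is a minimal sketch of the two feature extraction techniques (count vectors and TF-IDF) feeding a multinomial Naive Bayes classifier. It is not the authors' implementation. The toy sentences are made-up English placeholders rather than the Bengali dataset, the helper names (`tokenize`, `count_vector`, `fit_idf`, `tfidf_vector`, `MultinomialNB`) are hypothetical, the smoothed IDF formula is one common choice among several, and the Support Vector Machine classifier is omitted for brevity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: English placeholders, NOT the authors' Bengali data.
TRAIN = [
    ("i had a nice day with my friends", "normal"),
    ("the weather is pleasant today", "normal"),
    ("we will meet for lunch tomorrow", "normal"),
    ("nobody can hear the voices except me", "abnormal"),
    ("everything is watching me through the walls", "abnormal"),
    ("the voices tell me not to sleep", "abnormal"),
]


def tokenize(text):
    """Whitespace tokenizer (an assumption; real Bengali text may need more)."""
    return text.lower().split()


def count_vector(tokens):
    """Count-vector features: raw term frequencies."""
    return Counter(tokens)


def fit_idf(docs):
    """Smoothed inverse document frequency for every training token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def tfidf_vector(tokens, idf):
    """TF-IDF features: term frequency times IDF; unseen tokens are dropped."""
    return {tok: c * idf[tok] for tok, c in Counter(tokens).items() if tok in idf}


class MultinomialNB:
    """Multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        vocab = sorted({tok for vec in vectors for tok in vec})
        totals = {c: defaultdict(float) for c in self.classes}
        for vec, label in zip(vectors, labels):
            for tok, weight in vec.items():
                totals[label][tok] += weight
        class_counts = Counter(labels)
        self.log_prior = {
            c: math.log(class_counts[c] / len(labels)) for c in self.classes
        }
        self.log_like = {}
        for c in self.classes:
            denom = sum(totals[c].values()) + self.alpha * len(vocab)
            self.log_like[c] = {
                tok: math.log((totals[c][tok] + self.alpha) / denom)
                for tok in vocab
            }
        return self

    def predict(self, vec):
        def score(c):
            # Tokens outside the training vocabulary are ignored.
            return self.log_prior[c] + sum(
                w * self.log_like[c][tok]
                for tok, w in vec.items()
                if tok in self.log_like[c]
            )
        return max(self.classes, key=score)


docs = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)

# Same classifier, two different feature representations.
nb_count = MultinomialNB().fit([count_vector(d) for d in docs], labels)
nb_tfidf = MultinomialNB().fit([tfidf_vector(d, idf) for d in docs], labels)

query = tokenize("the voices are watching me")
print("count vector:", nb_count.predict(count_vector(query)))
print("tf-idf:      ", nb_tfidf.predict(tfidf_vector(query, idf)))
```

Training one classifier per feature representation mirrors the comparative setup the abstract describes. With a real dataset, accuracy and F1-score would be computed on a held-out split rather than on a single query.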
