Paper Title

SANA : Sentiment Analysis on Newspapers comments in Algeria

Authors

Hichem Rahab, Abdelhafid Zitouni, Mahieddine Djoudi

Abstract

Tracking people's opinions through their reactions to current events has become commonplace. A common source of such opinions is the comments posted on articles published on newspaper websites that cover contemporary events. Sentiment analysis, or opinion mining, is an emerging field whose purpose is to uncover the phenomena hidden in opinionated texts. In this work we are interested in comments on Algerian newspaper websites. To this end, two corpora were used: SANA and OCA. The SANA corpus was created by collecting comments from three Algerian newspapers and was annotated by two Algerian native Arabic speakers, while OCA is a freely available corpus for sentiment analysis. For classification we adopt support vector machines, naive Bayes, and k-nearest neighbors. The results obtained are very promising and show the different effects of stemming in this domain. In addition, k-nearest neighbors gives an important improvement compared with the other classifiers, unlike similar works where SVM is dominant. From this study we observe the importance of dedicated resources and methods for newspaper comment sentiment analysis, which we plan to address in future work.
