Paper title
A novel approach to sentiment analysis in Persian using discourse and external semantic information
Paper authors
Abstract
Sentiment analysis attempts to identify, extract, and quantify affective states and subjective information from various types of data such as text, audio, and video. In recent years, many approaches have been proposed to extract the sentiment of individuals from documents written in natural languages. The majority of these approaches have focused on English, while resource-lean languages such as Persian suffer from a lack of research work and language resources. To address this gap, the current work introduces new methods for sentiment analysis and applies them to Persian. The proposed approach is two-fold: the first method is based on classifier combination, and the second is based on deep neural networks that benefit from word embedding vectors. Both methods take advantage of local discourse information and external knowledge bases, cover several language issues such as negation and intensification, and address different granularity levels, namely the word, aspect, sentence, phrase, and document levels. To evaluate the performance of the proposed approach, a Persian dataset, referred to as hotel reviews, is collected from Persian hotel reviews. The proposed approach has been compared to counterpart methods on the benchmark dataset. The experimental results confirm the effectiveness of the proposed approach when compared to related works.
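To make the negation and intensification issues mentioned in the abstract concrete, the following is a minimal illustrative sketch of word-level polarity scoring with both modifiers. It is not the method of the paper: the lexicon entries, the weights, and the function name `score_sentence` are hypothetical, and English tokens stand in for Persian ones purely for readability.

```python
# Toy resources: every entry and weight below is an illustrative assumption,
# not a value taken from the paper or from any real lexicon.
POLARITY = {"good": 1.0, "clean": 1.0, "bad": -1.0, "dirty": -1.0}
NEGATORS = {"not", "never"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}


def score_sentence(tokens):
    """Sum word polarities over a token list.

    A negator flips the sign of the next sentiment word, and an
    intensifier scales it. Pending modifiers are consumed by the next
    sentiment word and then reset.
    """
    total = 0.0
    scale = 1.0      # product of pending intensifier weights
    negate = False   # whether a negator is pending
    for token in tokens:
        word = token.lower()
        if word in NEGATORS:
            negate = True
        elif word in INTENSIFIERS:
            scale *= INTENSIFIERS[word]
        elif word in POLARITY:
            value = POLARITY[word] * scale
            total += -value if negate else value
            scale, negate = 1.0, False  # modifiers apply once
    return total


if __name__ == "__main__":
    print(score_sentence("the room was not very clean".split()))
```

A sentence-level score of this kind could serve as one input feature among several for a classifier; how the paper actually combines such signals is not specified in the abstract.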