Paper title
Improving Opinion Spam Detection by Cumulative Relative Frequency Distribution
Authors
Abstract
In recent years, online reviews have become very important because they can influence consumers' purchase decisions and the reputation of businesses. As a result, the practice of writing fake reviews can have severe consequences for customers and service providers. Various approaches have been proposed for detecting opinion spam in online reviews, especially those based on supervised classifiers. In this contribution, we start from a set of effective features used for classifying opinion spam and re-engineer them by considering the Cumulative Relative Frequency Distribution of each feature. Through an experimental evaluation carried out on real data from Yelp.com, we show that using the distributional features improves the performance of classifiers.
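To make the idea of a cumulative relative frequency concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the features are re-engineered, so this assumes the simplest reading: each raw feature value is replaced by the fraction of observations less than or equal to it. The function name `cumulative_relative_frequency` and the sample `review_lengths` values are hypothetical.

```python
from bisect import bisect_right


def cumulative_relative_frequency(values):
    """Map each value to the fraction of observations <= that value.

    Assumption: this empirical-CDF reading is one plausible interpretation
    of a "cumulative relative frequency distribution"; the abstract does
    not specify the exact transformation.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    # bisect_right counts how many sorted observations are <= v.
    return [bisect_right(ordered, v) / n for v in values]


# Hypothetical raw feature: review length in characters.
review_lengths = [120, 45, 45, 300, 80]
crf = cumulative_relative_frequency(review_lengths)
print(crf)  # [0.8, 0.4, 0.4, 1.0, 0.6]
```

Under this reading, the transformed feature always lies in (0, 1] and tied raw values receive the same score, so features measured on very different scales become directly comparable before they are passed to a classifier.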