Paper Title
Embeddings-Based Clustering for Target Specific Stances: The Case of a Polarized Turkey
Paper Authors
Paper Abstract
On June 24, 2018, Turkey conducted a highly consequential election in which the Turkish people elected their president and parliament in the first election under a new presidential system. During the election period, the Turkish people extensively shared their political opinions on Twitter. One aspect of polarization among the electorate was support for or opposition to the reelection of Recep Tayyip Erdoğan. In this paper, we present an unsupervised method for target-specific stance detection in a polarized setting, specifically Turkish politics, achieving 90% precision in identifying user stances, while maintaining more than 80% recall. The method involves representing users in an embedding space using Google's Convolutional Neural Network (CNN) based multilingual universal sentence encoder. The representations are then projected onto a lower dimensional space in a manner that reflects similarities and are consequently clustered. We show the effectiveness of our method in properly clustering users of divergent groups across multiple targets that include political figures, different groups, and parties. We perform our analysis on a large dataset of 108M Turkish election-related tweets along with the timeline tweets of 168k Turkish users, who authored 213M tweets. Given the resultant user stances, we are able to observe correlations between topics and compute topic polarization.
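The pipeline outlined in the abstract (embed users, project to a lower-dimensional space, cluster) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: synthetic vectors stand in for the output of the multilingual universal sentence encoder, a Gaussian random projection stands in for the unnamed projection step, plain k-means stands in for the unnamed clustering step, and all function names are hypothetical.

```python
import math
import random


def mean_pool(vectors):
    """Average a list of equal-length vectors (e.g. one user's tweet embeddings)."""
    dim = len(vectors[0])
    n = len(vectors)
    return [sum(vec[i] for vec in vectors) / n for i in range(dim)]


def random_projection(vectors, out_dim, seed=0):
    """Map vectors to out_dim dimensions with a fixed Gaussian random matrix.

    Stand-in for the projection step; the abstract does not name the method.
    """
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(out_dim) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return [[sum(w * x for w, x in zip(row, vec)) for row in matrix]
            for vec in vectors]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, iterations=20):
    """Plain k-means with deterministic farthest-point initialization.

    Stand-in for the clustering step; the abstract does not name the method.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster goes empty
                centers[j] = mean_pool(members)
    return labels


def toy_users(seed=1, dim=16, users_per_side=5, tweets_per_user=4):
    """Synthetic 'tweet embeddings' for two opposing camps (assumed toy data)."""
    rng = random.Random(seed)
    users = []
    for side in range(2):
        for _ in range(users_per_side):
            tweets = []
            for _ in range(tweets_per_user):
                vec = [rng.gauss(0.0, 0.005) for _ in range(dim)]
                vec[side] += 1.0  # each camp leans along its own axis
                tweets.append(vec)
            users.append(tweets)
    return users


users = toy_users()
user_vectors = [mean_pool(tweets) for tweets in users]   # one vector per user
low_dim = random_projection(user_vectors, out_dim=4)     # lower-dimensional view
labels = kmeans(low_dim, k=2)                            # cluster id per user
print(labels)
```

In a real setting, the synthetic vectors would be replaced by sentence-encoder embeddings of each user's tweets, and the resulting clusters would still need to be mapped to stances (for example, support for or opposition to a target) by inspecting a sample of users in each cluster.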