Paper title
Analyzing Regrettable Communications on Twitter: Characterizing Deleted Tweets and Their Authors
Paper authors
Paper abstract
Over 500 million tweets are posted on Twitter each day, of which about 11% are deleted by the users who posted them. This widespread deletion of tweets raises a number of questions: what kind of content do users post that they later want to delete? Are all users equally active in deleting their tweets, or are users of certain predispositions more likely to post regrettable tweets and delete them later? In this paper we provide a detailed characterization of tweets that were posted and later deleted by their authors. We collected tweets from over 200,000 Twitter users during a period of four weeks. Our characterization shows significant personality differences between users who delete their tweets and those who do not. We find that users who delete their tweets are more likely to be extroverted and neurotic while being less conscientious. We also find that deleted tweets, while containing less information and being less conversational, contain significant indications of regrettable content. Since users of online communication do not have instant social cues (like a listener's body language) to gauge the impact of their words, they are often slow to employ repair strategies. Finally, we build a classifier that uses textual, contextual, and user features to predict whether a tweet will be deleted. The classifier achieves an F1-score of 0.78, and the precision increases when we consider response features of the tweets.
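For readers unfamiliar with the metric reported above, the F1-score is the harmonic mean of precision and recall for the positive class (here, "tweet will be deleted"). The sketch below is only an illustration of that definition: the function name `f1_score` and the counts passed to it are assumptions chosen for the example, not values or code from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score for the positive ("deleted") class.

    tp: deleted tweets correctly predicted as deleted
    fp: kept tweets wrongly predicted as deleted
    fn: deleted tweets wrongly predicted as kept
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical confusion counts, for illustration only.
    print(round(f1_score(tp=80, fp=20, fn=20), 2))
```

Because F1 balances precision against recall, an increase in precision alone (as reported when response features are added) raises F1 only to the extent that recall does not fall.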