Paper Title
Early Discovery of Emerging Entities in Persian Twitter with Semantic Similarity
Authors
Abstract
Discovering emerging entities (EEs) is the problem of finding entities before they become established. These entities can be critical for individuals, companies, and governments. Many of them can be discovered on social media platforms such as Twitter, and they have been a focus of research in academia and industry in recent years. As with any machine learning problem, data availability is one of the major challenges. This paper proposes EEPT, an online clustering method that discovers EEs without any need for training on a dataset. Additionally, because a proper evaluation metric is lacking, this paper uses a new metric to evaluate the results. The results show that EEPT is promising and finds significant entities before their establishment.