Paper title
GM-CTSC at SemEval-2020 Task 1: Gaussian Mixtures Cross Temporal Similarity Clustering
Authors
Abstract
This paper describes the system we proposed for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. We focused our approach on the detection problem. Given the semantics of words captured by temporal word embeddings in different time periods, we investigate the use of unsupervised methods to detect when a target word has gained or lost senses. To this end, we defined a new algorithm based on Gaussian Mixture Models to cluster the target similarities computed over the two periods. We compared the proposed approach with a number of similarity-based thresholds. We found that, although the performance of the detection methods varies across word embedding algorithms, the combination of Gaussian Mixtures with Temporal Referencing resulted in our best system.
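The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of clustering cross-temporal similarities with a Gaussian mixture, not the authors' implementation. It assumes that each target word already has one similarity score between its two period-specific embeddings, that two mixture components are enough, and that the component with the lower mean corresponds to changed words. The function names `fit_gmm_1d` and `detect_changed`, and the toy words and scores in the usage note, are hypothetical.

```python
import math


def _pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, n_iter=100, var_floor=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM (illustrative)."""
    n = len(xs)
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    means = [min(xs), max(xs)]  # start the components at the two extremes
    variances = [max(overall_var, var_floor)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * _pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # empty component: keep its previous parameters
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, var_floor)
            weights[k] = nk / n
    return weights, means, variances


def detect_changed(similarities):
    """Map each word to True if it falls in the low-similarity component.

    `similarities` maps a target word to the similarity between its
    embeddings in the two periods (an assumed input format).
    """
    words = list(similarities)
    xs = [similarities[w] for w in words]
    weights, means, variances = fit_gmm_1d(xs)
    low = 0 if means[0] <= means[1] else 1  # assumption: low mean = changed
    high = 1 - low
    changed = {}
    for word, x in zip(words, xs):
        d_low = weights[low] * _pdf(x, means[low], variances[low])
        d_high = weights[high] * _pdf(x, means[high], variances[high])
        changed[word] = d_low > d_high
    return changed
```

Calling `detect_changed({"tree": 0.91, "plane": 0.18})` on a larger dictionary of such scores returns a boolean per word. Unlike a fixed similarity threshold, the decision boundary here is derived from the distribution of the scores themselves, which is the kind of comparison the abstract draws between the mixture approach and similarity-based thresholds.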