Paper title
GloVeInit at SemEval-2020 Task 1: Using GloVe Vector Initialization for Unsupervised Lexical Semantic Change Detection
Paper authors
Paper abstract
This paper presents a vector initialization approach for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. Given two corpora from different time periods and a set of target words, the task requires classifying whether each word gained or lost a sense over time (subtask 1) and ranking the words on the basis of the changes in their senses (subtask 2). The proposed approach uses the Vector Initialization method to align GloVe embeddings. The idea is to train GloVe embeddings on the two corpora consecutively, using the first model to initialize the second. The paper rests on the hypothesis that GloVe embeddings are better suited to the Vector Initialization method than SGNS embeddings. It presents the intuitive reasoning behind this hypothesis and discusses the impact of various factors and hyperparameters on the performance of the proposed approach. Our model ranks 13th and 10th among 33 teams in the two subtasks. The implementation has been shared publicly.
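The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names, the random range for unseen words, the use of cosine distance as the change score, and the threshold for the binary decision are all assumptions made for illustration, and toy vectors stand in for trained GloVe embeddings.

```python
import math
import random


def init_from_previous(vocab, previous, dim, seed=0):
    """Hypothetical helper: start the second model from the first one.

    Words shared with the first model copy its vectors; words unseen in
    the first corpus get small random vectors (range is an assumption).
    """
    rng = random.Random(seed)
    return {
        word: list(previous[word]) if word in previous
        else [rng.uniform(-0.5, 0.5) / dim for _ in range(dim)]
        for word in vocab
    }


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def score_targets(targets, model1, model2, threshold):
    """Rank targets by decreasing distance and label each one 0 or 1.

    The distance measure and the threshold are assumptions, not values
    taken from the paper.
    """
    dist = {w: cosine_distance(model1[w], model2[w]) for w in targets}
    ranking = sorted(targets, key=lambda w: dist[w], reverse=True)
    labels = {w: int(dist[w] > threshold) for w in targets}
    return ranking, labels


# Toy vectors standing in for embeddings trained on the first corpus.
model1 = {"plane": [1.0, 0.0], "tree": [0.0, 1.0]}

# Initialize the second model from the first; "web" is new in corpus 2.
model2 = init_from_previous(["plane", "tree", "web"], model1, dim=2)

# Pretend that training on the second corpus moved "plane" but not "tree".
model2["plane"] = [0.0, 1.0]

ranking, labels = score_targets(["plane", "tree"], model1, model2, threshold=0.5)
```

Because the second model starts from the first model's vectors, both sets of embeddings are intended to share one coordinate system, so the vectors for the same word can be compared directly without a separate alignment step.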