Paper Title

Dense Embeddings Preserving the Semantic Relationships in WordNet

Authors

Canlin Zhang, Xiuwen Liu

Abstract

In this paper, we provide a novel way to generate low-dimensional vector embeddings for the noun and verb synsets in WordNet such that the hypernym-hyponym relationship is preserved in the embeddings. We call this embedding the Sense Spectrum (plural: Sense Spectra). To create suitable labels for training sense spectra, we design a new similarity measurement for noun and verb synsets in WordNet. We call this measurement the Hypernym Intersection Similarity (HIS), since it compares the common and unique hypernyms of two synsets. Our experiments show that, on the noun and verb pairs of the SimLex-999 dataset, HIS outperforms the three similarity measurements in WordNet. Moreover, to the best of our knowledge, the sense spectra provide the first dense synset embeddings that preserve the semantic relationships in WordNet.
