Paper Title
Unsupervised Bilingual Lexicon Induction Across Writing Systems
Paper Authors
Abstract
Recent embedding-based methods in unsupervised bilingual lexicon induction have shown good results, but generally have not leveraged orthographic (spelling) information, which can be helpful for pairs of related languages. This work augments a state-of-the-art method with orthographic features, and extends prior work in this space by proposing methods that can learn and utilize orthographic correspondences even between languages with different scripts. We demonstrate this by experimenting on three language pairs with different scripts and varying degrees of lexical similarity.