Paper Title

Interactive Re-Fitting as a Technique for Improving Word Embeddings

Authors

James Powell, Kari Sentz

Abstract

Word embeddings are a fixed, distributional representation of the context of words in a corpus learned from word co-occurrences. While word embeddings have proven to have many practical uses in natural language processing tasks, they reflect the attributes of the corpus upon which they are trained. Recent work has demonstrated that post-processing of word embeddings to apply information found in lexical dictionaries can improve their quality. We build on this post-processing technique by making it interactive. Our approach makes it possible for humans to adjust portions of a word embedding space by moving sets of words closer to one another. One motivating use case for this capability is to enable users to identify and reduce the presence of bias in word embeddings. Our approach allows users to trigger selective post-processing as they interact with and assess potential bias in word embeddings.
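The idea of moving a user-selected set of words closer to one another can be illustrated with a small sketch. This is not the authors' implementation; it assumes a retrofitting-style averaging update in which each selected word is repeatedly pulled toward the other selected words while staying anchored to its original vector. The function name `pull_together`, the parameters `alpha` and `iterations`, and the toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def pull_together(embeddings, group, alpha=1.0, iterations=10):
    """Return a copy of `embeddings` with the words in `group` moved closer.

    Hypothetical helper: each selected word becomes a weighted average of
    its original vector (weight alpha) and the current vectors of the other
    selected words (total weight 1). Words outside `group` are unchanged.
    """
    original = {w: [float(x) for x in v] for w, v in embeddings.items()}
    updated = {w: list(v) for w, v in original.items()}
    # Keep only known words, dropping duplicates while preserving order.
    members = [w for w in dict.fromkeys(group) if w in original]
    if len(members) < 2:
        return updated
    beta = 1.0 / (len(members) - 1)  # equal weight for every other member
    for _ in range(iterations):
        for w in members:
            neighbors = [updated[n] for n in members if n != w]
            denom = alpha + beta * len(neighbors)
            updated[w] = [
                (alpha * original[w][k] + beta * sum(n[k] for n in neighbors)) / denom
                for k in range(len(original[w]))
            ]
    return updated


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy two-dimensional vectors, invented for this example.
toy = {"doctor": [1.0, 0.0], "nurse": [0.0, 1.0], "banana": [5.0, 5.0]}
adjusted = pull_together(toy, ["doctor", "nurse"])
```

In this sketch, `adjusted["doctor"]` and `adjusted["nurse"]` end up closer together than in `toy`, while `adjusted["banana"]` is left as it was. A larger `alpha` keeps each word nearer to its original position, which is one way to limit how far a single interaction reshapes the space.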