Evaluating Semantic Interaction on Word Embeddings via Simulation

Authors

Yali Bian, Michelle Dowling, Chris North

Abstract

Semantic interaction (SI) attempts to learn the user's cognitive intents as they directly manipulate data projections during sensemaking activities. For text analysis, prior implementations of SI have used common data features, such as bag-of-words representations, for machine learning from user interactions. Instead, we hypothesize that features derived from deep learning word embeddings will enable SI to better capture the user's subtle intents. However, evaluating these effects is difficult. SI systems are usually evaluated with a human-centered qualitative approach that observes the utility and effectiveness of the application for end users. This approach has drawbacks in replicability, scalability, and objectivity, which make it hard to run convincing comparative experiments between different SI models. To tackle this problem, we explore a quantitative algorithm-centered analysis as a complementary evaluation approach: we simulate users' interactions and calculate the accuracy of the learned model. We use these methods to compare word embeddings to bag-of-words features for SI.
