Paper Title
My Teacher Thinks The World Is Flat! Interpreting Automatic Essay Scoring Mechanism
Paper Authors
Paper Abstract
Significant progress has been made in deep-learning based Automatic Essay Scoring (AES) systems in the past two decades. However, little research has been done to understand and interpret the black-box nature of these deep-learning based scoring models. Recent work shows that automated scoring systems are vulnerable even to common-sense adversarial samples. Their lack of natural language understanding capability raises questions about models that millions of candidates actively rely on for life-changing decisions. Since scoring is a highly multi-modal task, it is imperative that scoring models be validated and tested on all of these modalities. We utilize recent advances in interpretability to find the extent to which features such as coherence, content, and relevance matter to automated scoring mechanisms and why these mechanisms are susceptible to adversarial samples. We find that the systems tested treat an essay not as a piece of prose with a natural flow of speech and grammatical structure, but as a 'word-soup' in which a few words are much more important than the rest. Removing the context surrounding those few important words causes the prose to lose its flow of speech and grammar, yet has little impact on the predicted score. We also find that, since the models are not semantically grounded in world knowledge and common sense, adding false facts such as "the world is flat" actually increases the score instead of decreasing it.
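To make the two probes described in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. The function score_essay is a hypothetical stand-in for a trained AES model (a real system would wrap a neural scorer); occlusion_importance ranks words by leave-one-out score drop as a simple, model-agnostic substitute for the gradient-based interpretability methods the paper alludes to, and false_fact_probe appends a factually wrong sentence and compares scores.

# Minimal sketch of two AES probes: occlusion-style word importance and a
# "false fact" perturbation. score_essay is a hypothetical toy scorer used
# only so the example runs end-to-end; it is NOT the paper's model.

from typing import Callable, List, Tuple


def score_essay(essay: str) -> float:
    """Toy stand-in scorer: rewards length and a handful of 'content' words."""
    content_words = {"teacher", "world", "flat", "science", "evidence"}
    tokens = essay.lower().split()
    return 0.1 * len(tokens) + sum(1.0 for t in tokens if t in content_words)


def occlusion_importance(essay: str,
                         scorer: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Rank words by the score drop caused by deleting each one (leave-one-out)."""
    tokens = essay.split()
    base = scorer(essay)
    drops = []
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        drops.append((tok, base - scorer(reduced)))
    return sorted(drops, key=lambda x: x[1], reverse=True)


def false_fact_probe(essay: str, scorer: Callable[[str], float],
                     false_fact: str = "The world is flat.") -> Tuple[float, float]:
    """Compare the score before and after appending a factually wrong sentence."""
    return scorer(essay), scorer(essay + " " + false_fact)


if __name__ == "__main__":
    essay = "My teacher says science is built on evidence gathered about the world."
    print("Top words by occlusion importance:",
          occlusion_importance(essay, score_essay)[:3])
    before, after = false_fact_probe(essay, score_essay)
    print(f"Score before: {before:.2f}, after adding a false fact: {after:.2f}")

Occlusion is chosen here only because it needs no access to model gradients; with a real neural scorer one could instead use attribution methods such as integrated gradients, which is closer in spirit to the interpretability advances the abstract mentions.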