Paper Title
A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension
Authors
Abstract
The issue of shortcut learning is widely known in NLP and has been an important research focus in recent years. Unintended correlations in the data enable models to easily solve tasks that were meant to exhibit advanced language understanding and reasoning capabilities. In this survey paper, we focus on the field of machine reading comprehension (MRC), an important task for showcasing high-level language understanding that also suffers from a range of shortcuts. We summarize the available techniques for measuring and mitigating shortcuts and conclude with suggestions for further progress in shortcut research. Importantly, we highlight two concerns for shortcut mitigation in MRC: (1) the lack of public challenge sets, a necessary component for effective and reusable evaluation, and (2) the lack of certain mitigation techniques that are prominent in other areas.