Paper Title

Explaining Question Answering Models through Text Generation

Authors

Veronica Latcinnik, Jonathan Berant

Abstract

Large pre-trained language models (LMs) have been shown to perform surprisingly well when fine-tuned on tasks that require commonsense and world knowledge. However, in end-to-end architectures, it is difficult to explain what is the knowledge in the LM that allows it to make a correct prediction. In this work, we propose a model for multi-choice question answering, where a LM-based generator generates a textual hypothesis that is later used by a classifier to answer the question. The hypothesis provides a window into the information used by the fine-tuned LM that can be inspected by humans. A key challenge in this setup is how to constrain the model to generate hypotheses that are meaningful to humans. We tackle this by (a) joint training with a simple similarity classifier that encourages meaningful hypotheses, and (b) by adding loss functions that encourage natural text without repetitions. We show on several tasks that our model reaches performance that is comparable to end-to-end architectures, while producing hypotheses that elucidate the knowledge used by the LM for answering the question.
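
The abstract describes a two-stage pipeline: an LM-based generator first writes a free-text hypothesis for the question, and a classifier then picks the answer choice using that hypothesis. The sketch below is a minimal illustration of that flow, assuming Hugging Face transformers with GPT-2 as the generator, a hypothetical prompt format, and a bag-of-words cosine similarity as a stand-in for the "simple similarity classifier"; the paper's actual model jointly fine-tunes both components and adds losses that discourage repetition, neither of which is reproduced here.

```python
# Illustrative sketch (not the authors' implementation): a GPT-2 generator
# produces a textual hypothesis, and a simple similarity classifier picks
# the answer choice closest to that hypothesis.
import math
from collections import Counter

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
generator = GPT2LMHeadModel.from_pretrained("gpt2")


def generate_hypothesis(question: str, choices: list[str]) -> str:
    """Generate a free-text hypothesis conditioned on the question and choices.

    The prompt format is an assumption for illustration; in the paper the
    generator is fine-tuned jointly with the classifier.
    """
    prompt = f"Question: {question}\nChoices: {', '.join(choices)}\nRelevant fact:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = generator.generate(
        inputs["input_ids"],
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated continuation (the hypothesis).
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


def cosine_bow(a: str, b: str) -> float:
    """Bag-of-words cosine similarity, a stand-in for the paper's simple
    similarity classifier that ties the chosen answer to the hypothesis."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def answer(question: str, choices: list[str]) -> tuple[str, str]:
    """Return (predicted choice, hypothesis). The hypothesis is the
    human-inspectable window into the knowledge the LM used."""
    hypothesis = generate_hypothesis(question, choices)
    best = max(choices, key=lambda c: cosine_bow(hypothesis, c))
    return best, hypothesis
```

Because the prediction is forced to flow through the generated hypothesis rather than an opaque end-to-end score, a human can read the hypothesis to see what information the fine-tuned LM relied on.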
