Title
An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction
Authors
Abstract
Decisions of complex language understanding models can be rationalized by limiting their inputs to a relevant subsequence of the original text. A rationale should be as concise as possible without significantly degrading task performance, but this balance can be difficult to achieve in practice. In this paper, we show that it is possible to better manage this trade-off by optimizing a bound on the Information Bottleneck (IB) objective. Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale. Using IB, we derive a learning objective that allows direct control of mask sparsity levels through a tunable sparse prior. Experiments on ERASER benchmark tasks demonstrate significant gains over norm-minimization techniques for both task performance and agreement with human rationales. Furthermore, we find that in the semi-supervised setting, a modest amount of gold rationales (25% of training examples) closes the gap with a model that uses the full input.
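
For orientation, the objective described above can be read against the standard Information Bottleneck formulation. The following is a sketch in conventional notation, not the paper's exact derivation (x denotes the input, z the binary sentence mask, y the task label, and β and π are hyperparameters; none of these symbols are defined in the abstract itself):

    max I(z; y) - β I(z; x)

A standard variational bound on this objective yields a trainable loss of the form

    E_{p(z|x)}[ -log q(y|z) ] + β KL( p(z|x) || r(z) )

where q(y|z) plays the role of the end-task predictor, p(z|x) the explainer, and r(z) a fixed prior over masks. Choosing a sparse prior, e.g. an independent Bernoulli(π) per sentence, is one way the expected mask sparsity can be controlled directly through π, matching the "tunable sparse prior" the abstract refers to.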