Paper title
Measuring Association Between Labels and Free-Text Rationales
Paper authors
Paper abstract
In interpretable NLP, we require faithful rationales that reflect the model's decision-making process for an explained instance. While prior work focuses on extractive rationales (a subset of the input words), we investigate their less-studied counterpart: free-text natural language rationales. We demonstrate that pipelines, existing models for faithful extractive rationalization on information-extraction style tasks, do not extend as reliably to "reasoning" tasks requiring free-text rationales. We turn to models that jointly predict and rationalize, a class of widely used high-performance models for free-text rationalization whose faithfulness is not yet established. We define label-rationale association as a necessary property for faithfulness: the internal mechanisms of the model producing the label and the rationale must be meaningfully correlated. We propose two measurements to test this property: robustness equivalence and feature importance agreement. We find that state-of-the-art T5-based joint models exhibit both properties for rationalizing commonsense question-answering and natural language inference, indicating their potential for producing faithful free-text rationales.