Paper Title

Aligning Faithful Interpretations with their Social Attribution

Authors

Alon Jacovi, Yoav Goldberg

Abstract

We find that the requirement of model interpretations to be faithful is vague and incomplete. With interpretation by textual highlights as a case-study, we present several failure cases. Borrowing concepts from social science, we identify that the problem is a misalignment between the causal chain of decisions (causal attribution) and the attribution of human behavior to the interpretation (social attribution). We re-formulate faithfulness as an accurate attribution of causality to the model, and introduce the concept of aligned faithfulness: faithful causal chains that are aligned with their expected social behavior. The two steps of causal attribution and social attribution together complete the process of explaining behavior. With this formalization, we characterize various failures of misaligned faithful highlight interpretations, and propose an alternative causal chain to remedy the issues. Finally, we implement highlight explanations of the proposed causal format using contrastive explanations.
