Paper Title

Enhanced Regularizers for Attributional Robustness

Paper Authors

Anindya Sarkar, Anirban Sarkar, Vineeth N Balasubramanian

Paper Abstract

Deep neural networks are the default choice of learning models for computer vision tasks. Extensive work has been carried out in recent years on explaining deep models for vision tasks such as classification. However, recent work has shown that it is possible for these models to produce substantially different attribution maps even when two very similar images are given to the network, raising serious questions about trustworthiness. To address this issue, we propose a robust attribution training strategy to improve attributional robustness of deep neural networks. Our method carefully analyzes the requirements for attributional robustness and introduces two new regularizers that preserve a model's attribution map during attacks. Our method surpasses state-of-the-art attributional robustness methods by a margin of approximately 3% to 9% in terms of attribution robustness measures on several datasets including MNIST, FMNIST, Flower and GTSRB.
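The abstract describes the overall recipe (attack the attribution map, then regularize so it stays put) without stating the objective itself. The sketch below shows what such a training loss can look like in PyTorch; the plain input-gradient saliency, the MSE drift penalty, the attack schedule (eps, alpha, steps), and the weight lam are illustrative assumptions, not the paper's two regularizers.

# Minimal sketch of attributional robustness training, assuming PyTorch.
# Smooth activations (e.g. softplus) are assumed so that saliency maps are
# themselves differentiable; with plain ReLU the inner gradient is piecewise
# constant and the attack/regularizer gradients vanish.
import torch
import torch.nn.functional as F

def saliency(model, x, y, create_graph=False):
    # Attribution map: gradient of the true-class logit w.r.t. the input.
    if not x.requires_grad:
        x = x.clone().requires_grad_(True)
    score = model(x).gather(1, y.unsqueeze(1)).sum()
    grad, = torch.autograd.grad(score, x, create_graph=create_graph)
    return grad

def attribution_attack(model, x, y, eps=8/255, alpha=2/255, steps=3):
    # Perturb x inside an L-inf ball so its saliency drifts as far as
    # possible from the clean saliency (illustrative attack, not the paper's).
    sal_clean = saliency(model, x, y).detach()
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        sal_adv = saliency(model, x_adv, y, create_graph=True)
        drift = F.mse_loss(sal_adv, sal_clean)
        step, = torch.autograd.grad(drift, x_adv)
        x_adv = x_adv.detach() + alpha * step.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv.detach()

def robust_attribution_loss(model, x, y, lam=1.0):
    # Cross-entropy plus a penalty on attributional drift under attack.
    x_adv = attribution_attack(model, x, y)
    ce = F.cross_entropy(model(x), y)
    sal_clean = saliency(model, x, y, create_graph=True)
    sal_adv = saliency(model, x_adv, y, create_graph=True)
    return ce + lam * F.mse_loss(sal_adv, sal_clean)

A training step would then be the usual loss = robust_attribution_loss(model, xb, yb); loss.backward(); optimizer.step(). Note the second-order gradients: because the regularizer is a function of input gradients, backpropagating it to the model parameters requires create_graph=True in the inner autograd.grad calls.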
