Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence

Paper title

Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence

Authors

Frederik Pahde, Maximilian Dreyer, Leander Weber, Moritz Weckbecker, Christopher J. Anders, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

Abstract

With growing interest in understanding neural network prediction strategies, Concept Activation Vectors (CAVs) have emerged as a popular tool for modeling human-understandable concepts in the latent space. Commonly, CAVs are computed by leveraging linear classifiers that optimize the separability of latent representations of samples with and without a given concept. However, in this paper we show that such a separability-oriented computation leads to solutions that may diverge from the actual goal of precisely modeling the concept direction. This discrepancy can be attributed to the significant influence of distractor directions, i.e., signals unrelated to the concept, which are picked up by the filters (i.e., weights) of linear models to optimize class separability. To address this, we introduce pattern-based CAVs, which focus solely on concept signals and thereby provide more accurate concept directions. We evaluate various CAV methods in terms of their alignment with the true concept direction and their impact on CAV applications, including concept sensitivity testing and model correction for shortcut behavior caused by data artifacts. We demonstrate the benefits of pattern-based CAVs using the Pediatric Bone Age, ISIC2019, and FunnyBirds datasets with VGG, ResNet, ReXNet, EfficientNet, and Vision Transformer as model architectures.
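The contrast between filter-based and pattern-based CAVs described in the abstract can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-dimensional "latent space", the choice of true concept direction and distractor direction, the use of least-squares regression as a stand-in for the linear classifier, and the pattern formula (covariance between activations and concept labels, divided by the label variance). The function names `filter_cav` and `pattern_cav` are hypothetical.

```python
import numpy as np


def filter_cav(acts, labels):
    """Filter-style CAV: weights of a least-squares linear model
    (an assumed stand-in for an SVM or logistic-regression classifier)."""
    x = acts - acts.mean(axis=0)
    t = labels - labels.mean()
    w, *_ = np.linalg.lstsq(x, t, rcond=None)
    return w / np.linalg.norm(w)


def pattern_cav(acts, labels):
    """Pattern-style CAV (assumed form): covariance between each latent
    dimension and the concept label, divided by the label variance."""
    x = acts - acts.mean(axis=0)
    t = labels - labels.mean()
    h = x.T @ t / (t @ t)
    return h / np.linalg.norm(h)


# Synthetic latent activations: a concept signal along true_dir plus a
# strong distractor that is unrelated to the concept label.
rng = np.random.default_rng(0)
n = 5000
labels = rng.integers(0, 2, size=n).astype(float)   # concept present or absent
true_dir = np.array([1.0, 0.0])                      # assumed ground-truth direction
distractor_dir = np.array([1.0, 1.0]) / np.sqrt(2)   # assumed distractor direction
distractor = rng.normal(0.0, 2.0, size=n)            # independent of the labels
noise = rng.normal(0.0, 0.1, size=(n, 2))
acts = np.outer(labels, true_dir) + np.outer(distractor, distractor_dir) + noise

# Cosine similarity of each unit-norm CAV with the true concept direction.
cos_filter = float(filter_cav(acts, labels) @ true_dir)
cos_pattern = float(pattern_cav(acts, labels) @ true_dir)
print(f"filter CAV alignment:  {cos_filter:.3f}")
print(f"pattern CAV alignment: {cos_pattern:.3f}")
```

In this toy setup the filter rotates away from the concept direction in order to cancel the distractor, which helps separability but lowers its alignment with the true direction, while the pattern estimate stays close to it. The sketch only illustrates the argument and does not reproduce the paper's experiments.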
