Probing Model Signal-Awareness via Prediction-Preserving Input Minimization

Paper Title

Probing Model Signal-Awareness via Prediction-Preserving Input Minimization

Authors

Sahil Suneja, Yunhui Zheng, Yufan Zhuang, Jim Laredo, Alessandro Morari

Abstract

This work explores the signal awareness of AI models for source code understanding. Using a software vulnerability detection use case, we evaluate the models' ability to capture the correct vulnerability signals to produce their predictions. Our prediction-preserving input minimization (P2IM) approach systematically reduces the original source code to the minimal snippet that a model needs to maintain its prediction. A model's reliance on incorrect signals is uncovered when the vulnerability in the original code is missing from the minimal snippet, yet the model predicts both as vulnerable. We measure the signal awareness of models using a new metric we propose, Signal-aware Recall (SAR). We apply P2IM to three different neural network architectures across multiple datasets. The results show a sharp drop in the models' Recall from the high 90s to sub-60s under the new metric, suggesting that the models are picking up a lot of noise or dataset nuances while learning their vulnerability detection logic. Although the drop in model performance may be perceived as the result of an adversarial attack, this is not P2IM's objective. The idea is rather to uncover the signal awareness of a black-box model in a data-driven manner via controlled queries. SAR's purpose is to measure the impact of task-agnostic model training, not to suggest a shortcoming in the Recall metric. The expectation, in fact, is for SAR to match Recall in the ideal scenario where the model truly captures task-specific signals.
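
The idea described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not specify the reduction algorithm, so a simple greedy one-token-at-a-time reduction is used here as a stand-in. The names `minimize`, `signal_aware_recall`, `noisy_model`, `aware_model`, and `has_vulnerability` are hypothetical, the "models" are toy keyword checks rather than neural networks, and the SAR computation assumes that a true positive only counts if the vulnerability is still present in the minimal snippet.

```python
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[List[str]], bool]


def minimize(tokens: Sequence[str], predicts_vulnerable: Predictor) -> List[str]:
    """Greedily drop one token at a time while the prediction is preserved.

    Assumption: a stand-in for the paper's reduction step, not the real one.
    """
    current = list(tokens)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            # Keep the smaller input only if the model still says "vulnerable".
            if candidate and predicts_vulnerable(candidate):
                current = candidate
                changed = True
                break
    return current


def signal_aware_recall(
    vulnerable_samples: Sequence[Sequence[str]],
    predicts_vulnerable: Predictor,
    has_vulnerability: Predictor,
) -> Tuple[float, float]:
    """Return (Recall, SAR) over samples that are all truly vulnerable."""
    if not vulnerable_samples:
        raise ValueError("need at least one vulnerable sample")
    true_pos = 0
    signal_aware_true_pos = 0
    for tokens in vulnerable_samples:
        if not predicts_vulnerable(list(tokens)):
            continue  # missed by the model: a false negative for both metrics
        true_pos += 1
        minimal = minimize(tokens, predicts_vulnerable)
        # Assumption: the hit is signal-aware only if the bug survives reduction.
        if has_vulnerability(minimal):
            signal_aware_true_pos += 1
    n = len(vulnerable_samples)
    return true_pos / n, signal_aware_true_pos / n


# Toy setup (hypothetical): the real signal is the unchecked strcpy call.
sample = ["char", "buf", "[8]", ";", "strcpy", "(", "buf", ",", "src", ")", ";"]


def has_vulnerability(tokens: List[str]) -> bool:
    return "strcpy" in tokens


def noisy_model(tokens: List[str]) -> bool:
    return "buf" in tokens  # latches onto an identifier, not the bug


def aware_model(tokens: List[str]) -> bool:
    return "strcpy" in tokens  # latches onto the actual signal


print(signal_aware_recall([sample], noisy_model, has_vulnerability))
print(signal_aware_recall([sample], aware_model, has_vulnerability))
```

In this toy setting both models reach the same Recall on the sample, but only the one keyed to the actual vulnerability keeps its score under SAR, which mirrors the abstract's expectation that SAR matches Recall when a model truly captures task-specific signals.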
