Paper Title
Single Model Uncertainty Estimation via Stochastic Data Centering
Paper Authors
Paper Abstract
We are interested in estimating the uncertainties of deep neural networks, which play an important role in many scientific and engineering problems. In this paper, we present a striking new finding: an ensemble of neural networks with the same weight initialization, trained on datasets that are shifted by a constant bias, gives rise to slightly inconsistent trained models, where the differences in predictions are a strong indicator of epistemic uncertainties. Using the neural tangent kernel (NTK), we demonstrate that this phenomenon occurs in part because the NTK is not shift-invariant. Since this behavior is induced by a trivial input transformation, we show that it can be approximated by training a single neural network, using a technique we call $\Delta$-UQ, that estimates the uncertainty around a prediction by marginalizing out the effect of the biases during inference. We show that $\Delta$-UQ's uncertainty estimates are superior to many current methods on a variety of benchmarks: outlier rejection, calibration under distribution shift, and sequential design optimization of black-box functions. Code for $\Delta$-UQ can be accessed at https://github.com/LLNL/DeltaUQ
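Below is a minimal PyTorch sketch of the anchored-training idea the abstract describes: a single network sees the input shifted by a random bias (anchor) together with the anchor itself, and at inference the bias is marginalized out by averaging predictions over several anchors, with their spread serving as the uncertainty estimate. All names here (AnchoredMLP, predict_with_uncertainty, n_anchors) and the choice of drawing anchors from the training inputs are illustrative assumptions, not the authors' API; the official implementation is at https://github.com/LLNL/DeltaUQ.

```python
# Hypothetical sketch of the delta-UQ idea from the abstract, not the official code.
import torch
import torch.nn as nn

class AnchoredMLP(nn.Module):
    """A single network that consumes [x - c, c], where c is a random anchor (bias)."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, anchors):
        # Feed the shifted input alongside the anchor so the network can, in
        # principle, undo the shift; the residual disagreement across anchors
        # is the uncertainty signal.
        return self.net(torch.cat([x - anchors, anchors], dim=-1))

def predict_with_uncertainty(model, x, train_x, n_anchors=10):
    """Marginalize the bias at inference: mean over anchors is the prediction,
    standard deviation is the epistemic uncertainty estimate."""
    preds = []
    with torch.no_grad():
        for _ in range(n_anchors):
            # Draw anchors from the training inputs (an assumption of this sketch).
            idx = torch.randint(0, train_x.shape[0], (x.shape[0],))
            preds.append(model(x, train_x[idx]))
    preds = torch.stack(preds)  # (n_anchors, batch, 1)
    return preds.mean(dim=0), preds.std(dim=0)

# Toy usage: train with a fresh random anchor per example each step, then
# query mean and uncertainty on a test grid.
model = AnchoredMLP(in_dim=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_train = torch.randn(256, 1)
y_train = torch.sin(3 * x_train)
for _ in range(2000):
    idx = torch.randint(0, x_train.shape[0], (x_train.shape[0],))
    loss = nn.functional.mse_loss(model(x_train, x_train[idx]), y_train)
    opt.zero_grad(); loss.backward(); opt.step()

mu, sigma = predict_with_uncertainty(model, torch.linspace(-4, 4, 100).unsqueeze(-1), x_train)
```

Note that this realizes the abstract's claim with one set of weights: the implicit "ensemble" over constant input shifts is recovered at test time by sampling anchors, so no separately trained models are needed.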