Paper Title

Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

Authors

Jary Pomponi, Simone Scardapane, Aurelio Uncini

Abstract

Bayesian Neural Networks (BNNs) are trained to optimize an entire distribution over their weights instead of a single set, having significant advantages in terms of, e.g., interpretability, multi-task learning, and calibration. Because of the intractability of the resulting optimization problem, most BNNs are either sampled through Monte Carlo methods, or trained by minimizing a suitable Evidence Lower BOund (ELBO) on a variational approximation. In this paper, we propose a variant of the latter, wherein we replace the Kullback-Leibler divergence in the ELBO term with a Maximum Mean Discrepancy (MMD) estimator, inspired by recent work in variational inference. After motivating our proposal based on the properties of the MMD term, we proceed to show a number of empirical advantages of the proposed formulation over the state-of-the-art. In particular, our BNNs achieve higher accuracy on multiple benchmarks, including several image classification tasks. In addition, they are more robust to the selection of a prior over the weights, and they are better calibrated. As a second contribution, we provide a new formulation for estimating the uncertainty on a given prediction, showing that it performs in a more robust fashion against adversarial attacks and the injection of noise over the inputs, compared to more classical criteria such as the differential entropy.
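
The key modification the abstract describes can be made concrete. In the standard variational treatment, the ELBO decomposes into an expected log-likelihood minus a KL divergence between the variational posterior q_\phi(w) and the prior p(w); a plausible reading of the modified objective, with the KL term swapped for a squared MMD, is

\mathcal{L}(\phi) = \mathbb{E}_{q_\phi(w)}[\log p(\mathcal{D} \mid w)] - \mathrm{MMD}^2\big(q_\phi(w), p(w)\big)

Below is a minimal PyTorch sketch of a sample-based MMD^2 estimator between weight samples from the posterior and the prior. It is not the authors' code: the Gaussian RBF kernel, the fixed bandwidth, the biased (diagonal-including) estimator, and the names rbf_kernel and mmd2 are all illustrative assumptions.

import torch

def rbf_kernel(x, y, bandwidth=1.0):
    # Gaussian RBF kernel matrix: k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 * bandwidth^2)).
    sq_dists = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd2(q_samples, p_samples, bandwidth=1.0):
    # Biased empirical MMD^2 between two sample matrices of shape
    # (num_samples, num_weights): E[k(q,q)] + E[k(p,p)] - 2 E[k(q,p)].
    k_qq = rbf_kernel(q_samples, q_samples, bandwidth).mean()
    k_pp = rbf_kernel(p_samples, p_samples, bandwidth).mean()
    k_qp = rbf_kernel(q_samples, p_samples, bandwidth).mean()
    return k_qq + k_pp - 2.0 * k_qp

# Hypothetical usage: stand-ins for reparameterized posterior samples and
# samples from an N(0, I) prior; this MMD term would play the role of the
# regularizer that the KL divergence plays in the standard ELBO.
q_w = 0.1 + 0.5 * torch.randn(64, 10)  # pretend variational posterior samples
p_w = torch.randn(64, 10)              # prior samples
print(mmd2(q_w, p_w).item())

Because the estimator only needs samples from the two distributions, it keeps the objective differentiable under the reparameterization trick and does not require a closed-form density for the prior, which is consistent with the abstract's claim of robustness to the choice of prior.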
