Paper Title

Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes

Paper Authors

Sebastian W. Ober, Laurence Aitchison

Abstract

We consider the optimal approximate posterior over the top-layer weights in a Bayesian neural network for regression, and show that it exhibits strong dependencies on the lower-layer weights. We adapt this result to develop a correlated approximate posterior over the weights at all layers in a Bayesian neural network. We extend this approach to deep Gaussian processes, unifying inference in the two model classes. Our approximate posterior uses learned "global" inducing points, which are defined only at the input layer and propagated through the network to obtain inducing inputs at subsequent layers. By contrast, standard, "local", inducing point methods from the deep Gaussian process literature optimise a separate set of inducing inputs at every layer, and thus do not model correlations across layers. Our method gives state-of-the-art performance for a variational Bayesian method, without data augmentation or tempering, on CIFAR-10 of 86.7%, which is comparable to SGMCMC without tempering but with data augmentation (88% in Wenzel et al. 2020).
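To make the propagation idea concrete, the sketch below shows one way the scheme could look in PyTorch. Everything here is illustrative: the class names (`GlobalInducingLayer`, `GlobalInducingNet`), the ridge-regression form of the posterior mean, and the isotropic weight noise are simplifying assumptions, not the parameterisation or the ELBO from the paper. The point it illustrates is the abstract's central contrast: the learned inducing inputs `z` live only at the input layer and are propagated through every layer alongside the data, rather than being optimised separately per layer.

```python
import torch
import torch.nn as nn

class GlobalInducingLayer(nn.Module):
    """One layer whose weight posterior is conditioned on the inducing
    inputs u propagated from the layer below (simplified sketch)."""

    def __init__(self, d_in, d_out, n_inducing, act=torch.relu):
        super().__init__()
        # Learned pseudo-outputs and per-point precisions: the
        # variational parameters of this layer's weight posterior.
        self.v = nn.Parameter(torch.randn(n_inducing, d_out))
        self.log_prec = nn.Parameter(torch.zeros(n_inducing))
        self.act = act

    def forward(self, x, u):
        # Posterior mean of the weights: a ridge-regression map from
        # the propagated inducing inputs u to the pseudo-outputs v.
        lam = torch.diag(self.log_prec.exp())          # (M, M)
        A = u.T @ lam @ u + torch.eye(u.shape[1])      # (d_in, d_in)
        w_mean = torch.linalg.solve(A, u.T @ lam @ self.v)
        # Isotropic noise stands in for the full posterior covariance.
        w = w_mean + 0.05 * torch.randn_like(w_mean)
        # Propagate data AND inducing points with the same weight
        # sample, so dependencies across layers are retained.
        return self.act(x @ w), self.act(u @ w)

class GlobalInducingNet(nn.Module):
    """'Global' inducing inputs are defined only at the input layer and
    propagated forward; 'local' schemes would instead learn a separate
    set of inducing inputs at every layer."""

    def __init__(self, d_in, d_hidden, d_out, n_inducing=10):
        super().__init__()
        self.z = nn.Parameter(torch.randn(n_inducing, d_in))
        self.layers = nn.ModuleList([
            GlobalInducingLayer(d_in, d_hidden, n_inducing),
            GlobalInducingLayer(d_hidden, d_out, n_inducing,
                                act=lambda t: t),  # identity at the output
        ])

    def forward(self, x):
        u = self.z
        for layer in self.layers:
            x, u = layer(x, u)
        return x

# Usage: one forward pass draws one sample from the approximate posterior.
net = GlobalInducingNet(d_in=4, d_hidden=32, d_out=1)
y = net(torch.randn(8, 4))                         # shape (8, 1)
```

Because each layer applies the same weight sample to both the data and the inducing points, the inducing inputs at any layer depend on the weight samples of all layers below, which is what allows this family of posteriors to capture correlations across layers.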
