Paper Title
Hierarchical Gaussian Process Priors for Bayesian Neural Network Weights
Paper Authors
Paper Abstract
Probabilistic neural networks are typically modeled with independent weight priors, which do not capture weight correlations in the prior and do not provide a parsimonious interface to express properties in function space. A desirable class of priors would represent weights compactly, capture correlations between weights, facilitate calibrated reasoning about uncertainty, and allow inclusion of prior knowledge about the function space such as periodicity or dependence on contexts such as inputs. To this end, this paper introduces two innovations: (i) a Gaussian process-based hierarchical model for network weights based on unit embeddings that can flexibly encode correlated weight structures, and (ii) input-dependent versions of these weight priors that can provide convenient ways to regularize the function space through the use of kernels defined on contextual inputs. We show these models provide desirable test-time uncertainty estimates on out-of-distribution data, demonstrate cases of modeling inductive biases for neural networks with kernels which help both interpolation and extrapolation from training data, and demonstrate competitive predictive performance on an active learning benchmark.
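The first innovation described above, a Gaussian process prior over network weights built on per-unit embeddings, can be sketched roughly as follows. This is a minimal illustration, not the paper's exact construction: the embedding dimension, the RBF kernel choice, and the function names are assumptions made for the example. Each input and output unit receives a latent embedding, and each weight is a draw from a GP evaluated at the corresponding pair of embeddings, so weights that share a unit become correlated a priori.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared-exponential kernel between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def sample_correlated_weights(n_in, n_out, embed_dim=2, jitter=1e-6):
    """Sample one layer's weight matrix from a GP prior on unit embeddings.

    Illustrative sketch: each unit i (input) and j (output) gets a latent
    embedding; weight w_ij is a GP evaluated at the concatenated pair
    (z_in[i], z_out[j]), inducing correlations between weights that share
    a unit. Hyperparameters here are arbitrary, not taken from the paper.
    """
    z_in = rng.normal(size=(n_in, embed_dim))    # input-unit embeddings
    z_out = rng.normal(size=(n_out, embed_dim))  # output-unit embeddings
    # Build one feature vector per weight by pairing unit embeddings:
    # row ordering matches reshape(n_in, n_out) below.
    pairs = np.concatenate(
        [np.repeat(z_in, n_out, axis=0), np.tile(z_out, (n_in, 1))], axis=1
    )  # shape: (n_in * n_out, 2 * embed_dim)
    K = rbf_kernel(pairs, pairs) + jitter * np.eye(len(pairs))
    w = rng.multivariate_normal(np.zeros(len(pairs)), K)
    return w.reshape(n_in, n_out)

W = sample_correlated_weights(n_in=4, n_out=3)
print(W.shape)  # (4, 3)
```

The second innovation, the input-dependent prior, would extend this idea by additionally conditioning the kernel on contextual inputs, so that the induced function-space prior can encode properties such as periodicity; that conditioning is omitted here for brevity.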