Bézier Gaussian Processes for Tall and Wide Data

Paper Title

Bézier Gaussian Processes for Tall and Wide Data

Authors

Martin Jørgensen, Michael A. Osborne

Abstract

Modern approximations to Gaussian processes are suitable for "tall data", with a cost that scales well in the number of observations, but under-perform on "wide data", scaling poorly in the number of input features. That is, as the number of input features grows, good predictive performance requires the number of summarising variables, and their associated cost, to grow rapidly. We introduce a kernel that allows the number of summarising variables to grow exponentially with the number of input features, but requires only linear cost in both the number of observations and the number of input features. This scaling is achieved through our introduction of the Bézier buttress, which allows approximate inference without computing matrix inverses or determinants. We show that our kernel has close similarities to some of the most used kernels in Gaussian process regression, and empirically demonstrate the kernel's ability to scale to both tall and wide datasets.
