Paper Title

Sparse Cholesky factorization by Kullback-Leibler minimization

Paper Authors

Florian Schäfer, Matthias Katzfuss, Houman Owhadi

Paper Abstract

We propose to compute a sparse approximate inverse Cholesky factor $L$ of a dense covariance matrix $\Theta$ by minimizing the Kullback-Leibler divergence between the Gaussian distributions $\mathcal{N}(0, \Theta)$ and $\mathcal{N}(0, L^{-\top} L^{-1})$, subject to a sparsity constraint. Surprisingly, this problem has a closed-form solution that can be computed efficiently, recovering the popular Vecchia approximation in spatial statistics. Based on recent results on the approximate sparsity of inverse Cholesky factors of $\Theta$ obtained from pairwise evaluation of Green's functions of elliptic boundary-value problems at points $\{x_{i}\}_{1 \leq i \leq N} \subset \mathbb{R}^{d}$, we propose an elimination ordering and sparsity pattern that allows us to compute $ε$-approximate inverse Cholesky factors of such $\Theta$ in computational complexity $\mathcal{O}(N \log(N/ε)^d)$ in space and $\mathcal{O}(N \log(N/ε)^{2d})$ in time. To the best of our knowledge, this is the best asymptotic complexity for this class of problems. Furthermore, our method is embarrassingly parallel, automatically exploits low-dimensional structure in the data, and can perform Gaussian-process regression in linear (in $N$) space complexity. Motivated by the optimality properties of our method, we propose applying it to the joint covariance of training and prediction points in Gaussian-process regression, greatly improving stability and computational cost. Finally, we show how to apply our method to the important setting of Gaussian processes with additive noise, sacrificing neither accuracy nor computational complexity.
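The closed-form solution referenced in the abstract can be stated per column: for a sparsity set $s_i \subset \{i, \dots, N\}$ containing $i$, the nonzero entries of the $i$-th column of $L$ are $\Theta_{s_i, s_i}^{-1} e_1 / \sqrt{e_1^\top \Theta_{s_i, s_i}^{-1} e_1}$, where $e_1$ is the first standard basis vector after sorting $s_i$ so that $i$ comes first. Below is a minimal NumPy sketch of this formula; the function name and the list-of-index-sets representation of the sparsity pattern are illustrative choices, not the authors' implementation, and the factor is assembled densely only for clarity.

```python
import numpy as np

def kl_optimal_factor(Theta, sparsity):
    """KL-optimal sparse inverse Cholesky factor, column by column.

    Theta    : (N, N) symmetric positive-definite covariance matrix.
    sparsity : list where sparsity[i] is the sorted index set s_i,
               with s_i[0] == i and s_i a subset of {i, ..., N-1}.
    Returns a lower-triangular L such that L @ L.T approximates
    inv(Theta), optimally in KL divergence on the given pattern.
    """
    N = Theta.shape[0]
    L = np.zeros((N, N))
    for i, s in enumerate(sparsity):
        s = np.asarray(s)
        e1 = np.zeros(len(s))
        e1[0] = 1.0
        v = np.linalg.solve(Theta[np.ix_(s, s)], e1)  # Theta_{s,s}^{-1} e_1
        L[s, i] = v / np.sqrt(v[0])                    # closed-form column
    return L

# Sanity check: with the full lower-triangular pattern the KL divergence
# is zero, so L @ L.T recovers inv(Theta) up to round-off.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Theta = A @ A.T + 5 * np.eye(5)
L = kl_optimal_factor(Theta, [list(range(i, 5)) for i in range(5)])
assert np.allclose(L @ L.T, np.linalg.inv(Theta))
```

Note that each column depends only on a small principal submatrix of $\Theta$, which is why the method is embarrassingly parallel across columns.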
