Paper Title

Fourier Sparse Leverage Scores and Approximate Kernel Learning

Authors

Tamás Erdélyi, Cameron Musco, Christopher Musco

Abstract

We prove new explicit upper bounds on the leverage scores of Fourier sparse functions under both the Gaussian and Laplace measures. In particular, we study $s$-sparse functions of the form $f(x) = \sum_{j=1}^s a_j e^{i \lambda_j x}$ for coefficients $a_j \in \mathbb{C}$ and frequencies $\lambda_j \in \mathbb{R}$. Bounding Fourier sparse leverage scores under various measures is of pure mathematical interest in approximation theory, and our work extends existing results for the uniform measure [Erd17, CP19a]. Practically, our bounds are motivated by two important applications in machine learning:

1. Kernel Approximation. They yield a new random Fourier features algorithm for approximating Gaussian and Cauchy (rational quadratic) kernel matrices. For low-dimensional data, our method uses a near-optimal number of features, and its runtime is polynomial in the *statistical dimension* of the approximated kernel matrix. It is the first "oblivious sketching method" with this property for any kernel besides the polynomial kernel, resolving an open question of [AKM+17, AKK+20b].

2. Active Learning. They can be used as non-uniform sampling distributions for robust active learning when data follows a Gaussian or Laplace distribution. Using the framework of [AKM+19], we provide essentially optimal results for bandlimited and multiband interpolation, and Gaussian process regression. These results generalize existing work that only applies to uniformly distributed data.
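For context, the random Fourier features method that the paper improves upon is the classic construction of Rahimi and Recht: sample frequencies from the kernel's Fourier transform and map each point through random cosines, so that inner products of the features approximate kernel evaluations. The sketch below is that baseline with uniform (not leverage-score-based) frequency sampling; the function name, the bandwidth parameter `sigma`, and the feature count are illustrative choices, not the paper's algorithm.

```python
import numpy as np

def gaussian_rff(X, n_features, sigma=1.0, rng=None):
    """Classic random Fourier features for the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).

    Frequencies W are drawn from the kernel's Fourier transform
    (a Gaussian density); this is plain i.i.d. sampling, not the
    leverage-score-based sampling developed in the paper.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # One random frequency vector per feature, plus a random phase shift.
    W = rng.normal(scale=1.0 / sigma, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # Feature map: Z Z^T is an unbiased estimate of the kernel matrix K.
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: with enough features, Z @ Z.T approximates the exact kernel matrix.
X = np.random.default_rng(0).normal(size=(100, 2))
Z = gaussian_rff(X, n_features=2000, sigma=1.0, rng=1)
K_approx = Z @ Z.T
```

The paper's contribution, roughly, is to replace the uniform feature sampling above with a non-uniform distribution guided by the leverage-score bounds, which reduces the number of features needed for a given approximation quality.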
