Paper title
Concentration of Random Feature Matrices in High-Dimensions
Paper authors
Paper abstract
The spectra of random feature matrices provide essential information on the conditioning of the linear system used in random feature regression problems and are thus connected to the consistency and generalization of random feature models. Random feature matrices are asymmetric, rectangular, nonlinear matrices that depend on two input variables, the data and the weights, which can make their characterization challenging. We consider two settings for the two input variables: either both are random variables, or one is a random variable and the other is well-separated, i.e., there is a minimum distance between points. Under conditions on the dimension, the complexity ratio, and the sampling variance, we show that the singular values of these matrices concentrate near their full expectation and near one with high probability. In particular, since the dimension depends only on the logarithm of the number of random weights or the number of data points, our complexity bounds can be achieved even in moderate dimensions for many practical settings. The theoretical results are verified with numerical experiments.
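The concentration claim can be illustrated with a small numerical sketch. This is not the paper's experiment: the abstract does not specify the feature map or the sampling distributions, so the code below assumes complex exponential (Fourier) features exp(i<x_j, w_k>), Gaussian data and Gaussian weights, and a 1/sqrt(N) normalization. The function name and all parameter values (m, N, d, gamma, sigma) are hypothetical choices made only for illustration.

```python
import numpy as np


def random_feature_singular_values(m=50, N=5000, d=20, gamma=1.0, sigma=1.0, seed=0):
    """Singular values of a normalized random Fourier feature matrix.

    Assumptions (not taken from the abstract): data x_j ~ N(0, gamma^2 I_d),
    weights w_k ~ N(0, sigma^2 I_d), features exp(i <x_j, w_k>).
    """
    rng = np.random.default_rng(seed)
    X = gamma * rng.standard_normal((m, d))   # m data points in dimension d
    W = sigma * rng.standard_normal((N, d))   # N random weights in dimension d
    # m x N feature matrix, scaled so that each row has unit Euclidean norm.
    A = np.exp(1j * (X @ W.T)) / np.sqrt(N)
    return np.linalg.svd(A, compute_uv=False)


s = random_feature_singular_values()
print(f"min singular value: {s.min():.3f}, max singular value: {s.max():.3f}")
```

With these settings the number of weights is much larger than the number of data points and the dimension is moderate, so the printed extreme singular values should both lie close to one. Shrinking d or N relative to m is a simple way to see the concentration degrade.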