Paper Title
Understanding Implicit Regularization in Over-Parameterized Single Index Model
Paper Authors
Paper Abstract
In this paper, we leverage over-parameterization to design regularization-free algorithms for the high-dimensional single index model and provide theoretical guarantees for the induced implicit regularization phenomenon. Specifically, we study both vector and matrix single index models, where the link function is nonlinear and unknown, the signal parameter is either a sparse vector or a low-rank symmetric matrix, and the response variable can be heavy-tailed. To better understand the role played by implicit regularization without excessive technicality, we assume that the distribution of the covariates is known a priori. For both the vector and matrix settings, we construct an over-parameterized least-squares loss function by employing the score function transform and a robust truncation step designed specifically for heavy-tailed data. We propose to estimate the true parameter by applying regularization-free gradient descent to this loss function. When the initialization is close to the origin and the stepsize is sufficiently small, we prove that the obtained solution achieves minimax optimal statistical rates of convergence in both the vector and matrix cases. In addition, our experimental results support our theoretical findings and demonstrate that our methods empirically outperform classical methods with explicit regularization in terms of both the $\ell_2$ statistical rate and variable selection consistency.
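To make the procedure concrete in the sparse-vector case, below is a minimal sketch of the regularization-free gradient descent described in the abstract. It assumes standard Gaussian covariates (so the first-order score function transform is simply $S(x) = x$) and the Hadamard-product over-parameterization $\theta = w \odot w - v \odot v$, which is known to induce implicit $\ell_1$-type regularization under small initialization and stepsize. The truncation level `tau`, initialization scale `alpha0`, stepsize `eta`, and iteration count `T` are illustrative placeholders, not values from the paper.

```python
import numpy as np

def truncate(y, tau):
    # Robust truncation step for heavy-tailed responses:
    # clip |y| at level tau while keeping the sign.
    return np.sign(y) * np.minimum(np.abs(y), tau)

def overparam_gd(X, y, tau=5.0, alpha0=1e-6, eta=1e-2, T=5000):
    """Sketch of regularization-free gradient descent on an
    over-parameterized least-squares loss for the sparse vector
    single index model, assuming standard Gaussian covariates
    (first-order score S(x) = x).
    """
    n, d = X.shape
    y_t = truncate(y, tau)
    # Moment estimate via Stein's identity: for Gaussian covariates,
    # E[y * S(x)] is proportional to the true sparse parameter.
    m = X.T @ y_t / n
    # Initialize near the origin, as the theory requires.
    w = np.full(d, alpha0)
    v = np.full(d, alpha0)
    for _ in range(T):
        theta = w * w - v * v
        # Gradient of L(w, v) = ||theta||^2 - 2 <m, theta>
        # with respect to theta, then chained through w and v.
        g = 2.0 * (theta - m)
        w -= eta * g * 2.0 * w
        v += eta * g * 2.0 * v
    return w * w - v * v
```

The low-rank matrix setting follows the same recipe, with the parameter instead factored as a difference of Gram matrices (e.g., $UU^\top - VV^\top$); the sketch above covers only the vector case.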