Paper Title

Generalization Error of Generalized Linear Models in High Dimensions

Paper Authors

Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Paper Abstract

At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our understanding of their generalization capabilities is incomplete. This task is made harder by the non-convexity of the underlying learning problems. We provide a general framework to characterize the asymptotic generalization error for single-layer neural networks (i.e., generalized linear models) with arbitrary non-linearities, making it applicable to regression as well as classification problems. This framework enables analyzing the effect of (i) over-parameterization and non-linearity during modeling; and (ii) choices of loss function, initialization, and regularizer during learning. Our model also captures mismatch between training and test distributions. As examples, we analyze a few special cases, namely linear regression and logistic regression. We are also able to rigorously and analytically explain the double descent phenomenon in generalized linear models.
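The double descent phenomenon mentioned in the abstract can be observed empirically even in plain linear regression. The sketch below is a minimal NumPy illustration, not the paper's analytical framework: it fits a minimum-norm least-squares solution on a growing subset of features and shows test error peaking near the interpolation threshold (number of features equal to number of training samples) before decreasing again. All parameter values and names here are illustrative assumptions.

```python
# Minimal empirical sketch of double descent in linear regression.
# We fit the minimum-norm least-squares solution using the first d of
# d_true features; test MSE spikes near d = n_train and falls again
# in the over-parameterized regime d > n_train.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d_true, noise = 100, 1000, 200, 0.5
w_star = rng.normal(size=d_true) / np.sqrt(d_true)  # ground-truth weights

def test_error(d, n_trials=20):
    """Average test MSE of the minimum-norm fit using the first d features."""
    errs = []
    for _ in range(n_trials):
        X = rng.normal(size=(n_train, d_true))
        y = X @ w_star + noise * rng.normal(size=n_train)
        Xt = rng.normal(size=(n_test, d_true))
        yt = Xt @ w_star + noise * rng.normal(size=n_test)
        w_hat = np.linalg.pinv(X[:, :d]) @ y  # minimum-norm least squares
        errs.append(np.mean((Xt[:, :d] @ w_hat - yt) ** 2))
    return np.mean(errs)

for d in [20, 50, 80, 95, 100, 105, 120, 150, 200]:
    print(f"d = {d:3d} (d/n = {d / n_train:.2f}): test MSE = {test_error(d):.3f}")
```

Running this, the test error rises sharply as d approaches n_train = 100 and then descends again for d > 100, the qualitative curve the paper characterizes analytically for general losses and non-linearities.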
