Paper Title
Regularization Trade-offs with Fake Features
Authors
Abstract
Recent successes of massively overparameterized models have inspired a new line of work investigating the underlying conditions that enable overparameterized models to generalize well. This paper considers a framework where the possibly overparameterized model includes fake features, i.e., features that are present in the model but not in the data. We present a non-asymptotic high-probability bound on the generalization error of the ridge regression problem under the model misspecification of having fake features. Our high-probability results provide insights into the interplay between the implicit regularization provided by the fake features and the explicit regularization provided by the ridge parameter. Numerical results illustrate the trade-off between the number of fake features and the ridge parameter, and show how the optimal ridge parameter may depend heavily on the number of fake features.
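The setting described in the abstract can be illustrated with a small numerical sketch. It is not the paper's experiment: the Gaussian feature distribution, the dimensions, the noise level, and the names `ridge_fit` and `test_error_with_fake_features` are all assumptions chosen for illustration. The output depends only on the true features, while the model is fitted on the true features plus `p_fake` extra columns that are independent of the output.

```python
import numpy as np


def ridge_fit(A, y, lam):
    """Ridge estimate: argmin_w ||y - A w||^2 + lam * ||w||^2 (lam > 0)."""
    p = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ y)


def test_error_with_fake_features(n_train=40, n_test=2000, p_true=20,
                                  p_fake=0, lam=1.0, noise_std=0.5, seed=0):
    """Mean squared test error of ridge regression with p_fake fake features.

    Assumption: all features are i.i.d. standard Gaussian, and the fake
    features carry no information about the output.
    """
    rng = np.random.default_rng(seed)
    x_true = rng.standard_normal(p_true) / np.sqrt(p_true)

    def sample(n):
        A_true = rng.standard_normal((n, p_true))   # features present in the data
        A_fake = rng.standard_normal((n, p_fake))   # features only in the model
        y = A_true @ x_true + noise_std * rng.standard_normal(n)
        return np.hstack([A_true, A_fake]), y

    A_tr, y_tr = sample(n_train)
    A_te, y_te = sample(n_test)
    w = ridge_fit(A_tr, y_tr, lam)
    return float(np.mean((y_te - A_te @ w) ** 2))


if __name__ == "__main__":
    # Sweep the number of fake features and the ridge parameter.
    for p_fake in (0, 20, 80):
        for lam in (0.01, 1.0, 100.0):
            err = test_error_with_fake_features(p_fake=p_fake, lam=lam)
            print(f"p_fake={p_fake:3d}  lam={lam:7.2f}  test MSE={err:.3f}")
```

Running the sweep prints one test error per pair of fake-feature count and ridge parameter, which makes it possible to see how the ridge parameter with the lowest error changes as `p_fake` grows. Averaging over several seeds would give a steadier picture than the single draw used here.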