Title

Explicit Regularization in Overparametrized Models via Noise Injection

Authors

Antonio Orvieto, Anant Raj, Hans Kersting, Francis Bach

Abstract

Injecting noise within gradient descent has several desirable features, such as smoothing and regularizing properties. In this paper, we investigate the effects of injecting noise before computing a gradient step. We demonstrate that small perturbations can induce explicit regularization for simple models based on the L1-norm, group L1-norms, or nuclear norms. However, when applied to overparametrized neural networks with large widths, we show that the same perturbations can cause variance explosion. To overcome this, we propose using independent layer-wise perturbations, which provably allow for explicit regularization without variance explosion. Our empirical results show that these small perturbations lead to improved generalization performance compared to vanilla gradient descent.
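
A minimal sketch of the two ideas in the abstract: (i) plain noise injection, where the parameters are perturbed before the gradient is computed and the step is applied to the unperturbed iterate, and (ii) one possible reading of independent layer-wise perturbations for a toy two-layer linear model, where each layer's gradient is evaluated with only that layer perturbed by its own fresh noise. The least-squares loss, the function names, and the hyperparameters lr and sigma are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# (i) Plain noise injection: perturb first, then take the gradient step.

def loss_grad(w, X, y):
    # Gradient of the least-squares loss 0.5 * ||X @ w - y||^2.
    return X.T @ (X @ w - y)

def perturbed_gd_step(w, X, y, lr=0.01, sigma=0.01):
    # Evaluate the gradient at the perturbed point w + eps, but apply
    # the update to the unperturbed iterate w.
    eps = sigma * rng.standard_normal(w.shape)
    return w - lr * loss_grad(w + eps, X, y)

# (ii) Layer-wise variant for a two-layer linear model X @ W1 @ W2.
# Hedged reading: each layer's gradient uses fresh, independent noise,
# and only that layer is perturbed when its gradient is computed.

def layerwise_perturbed_step(W1, W2, X, y, lr=0.01, sigma=0.01):
    E1 = sigma * rng.standard_normal(W1.shape)  # noise for layer 1
    E2 = sigma * rng.standard_normal(W2.shape)  # noise for layer 2
    r1 = X @ (W1 + E1) @ W2 - y   # residual with only layer 1 perturbed
    r2 = X @ W1 @ (W2 + E2) - y   # residual with only layer 2 perturbed
    g1 = X.T @ r1 @ W2.T          # grad wrt W1 at (W1 + E1, W2)
    g2 = (X @ W1).T @ r2          # grad wrt W2 at (W1, W2 + E2)
    return W1 - lr * g1, W2 - lr * g2

# Toy run on random data.
X = rng.standard_normal((20, 5))
y = rng.standard_normal((20, 1))
w = np.zeros((5, 1))
W1 = 0.1 * rng.standard_normal((5, 4))
W2 = 0.1 * rng.standard_normal((4, 1))
for _ in range(100):
    w = perturbed_gd_step(w, X, y)
    W1, W2 = layerwise_perturbed_step(W1, W2, X, y)
```

Per the abstract, perturbing all layers jointly is what causes the gradient variance to explode in wide networks; the independent per-layer noise above illustrates the alternative that avoids it.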
