Paper Title
Explicit Regularization in Overparametrized Models via Noise Injection
Paper Authors
Paper Abstract
Injecting noise within gradient descent has several desirable features, such as smoothing and regularizing properties. In this paper, we investigate the effects of injecting noise before computing a gradient step. We demonstrate that small perturbations can induce explicit regularization for simple models based on the L1-norm, group L1-norms, or nuclear norms. However, when applied to overparametrized neural networks with large widths, we show that the same perturbations can cause variance explosion. To overcome this, we propose using independent layer-wise perturbations, which provably allow for explicit regularization without variance explosion. Our empirical results show that these small perturbations lead to improved generalization performance compared to vanilla gradient descent.
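To make the two update rules described in the abstract concrete, below is a minimal PyTorch sketch (not the authors' code): `perturbed_gd_step` injects noise into all parameters before the gradient is computed, while `layerwise_perturbed_gd_step` perturbs each layer independently when computing that layer's gradient. The toy two-layer model, data, and hyperparameters (`sigma`, `lr`) are illustrative assumptions, and the exact layer-wise scheme in the paper may differ in detail.

```python
# Minimal sketch of noise injection before the gradient step (illustrative only).
import torch

torch.manual_seed(0)

# Toy regression data and a small two-layer linear model (illustrative choices).
X, y = torch.randn(128, 10), torch.randn(128, 1)

def init_params():
    w1 = 0.1 * torch.randn(10, 64)
    w2 = 0.1 * torch.randn(64, 1)
    return [w1.requires_grad_(), w2.requires_grad_()]

def loss(params, x, targets):
    w1, w2 = params
    return ((x @ w1 @ w2 - targets) ** 2).mean()

def perturbed_gd_step(params, lr=0.01, sigma=0.01):
    """Joint perturbation: add noise to all parameters, then take the gradient
    step evaluated at the perturbed point (this is the variant whose variance
    can blow up for very wide networks)."""
    perturbed = [(p + sigma * torch.randn_like(p)).detach().requires_grad_()
                 for p in params]
    grads = torch.autograd.grad(loss(perturbed, X, y), perturbed)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p -= lr * g

def layerwise_perturbed_gd_step(params, lr=0.01, sigma=0.01):
    """Layer-wise variant: each layer's gradient is computed with independent
    noise added to that layer only, keeping the other layers unperturbed."""
    grads = []
    for k in range(len(params)):
        perturbed = [p.detach().clone() for p in params]
        perturbed[k] = (perturbed[k]
                        + sigma * torch.randn_like(perturbed[k])).requires_grad_()
        grads.append(torch.autograd.grad(loss(perturbed, X, y), perturbed[k])[0])
    with torch.no_grad():
        for p, g in zip(params, grads):
            p -= lr * g

params = init_params()
for step in range(200):
    layerwise_perturbed_gd_step(params)
print("final training loss:", loss(params, X, y).item())
```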