Paper Title

Kernel-Based Smoothness Analysis of Residual Networks

Paper Authors

Tom Tirer, Joan Bruna, Raja Giryes

Paper Abstract

A major factor in the success of deep neural networks is the use of sophisticated architectures rather than the classical multilayer perceptron (MLP). Residual networks (ResNets) stand out among these powerful modern architectures. Previous works focused on the optimization advantages of deep ResNets over deep MLPs. In this paper, we show another distinction between the two models, namely, a tendency of ResNets to promote smoother interpolations than MLPs. We analyze this phenomenon via the neural tangent kernel (NTK) approach. First, we compute the NTK for a considered ResNet model and prove its stability during gradient descent training. Then, we show by various evaluation methodologies that for ReLU activations the NTK of ResNet, and its kernel regression results, are smoother than the ones of MLP. The better smoothness observed in our analysis may explain the better generalization ability of ResNets and the practice of moderately attenuating the residual blocks.
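
As a rough illustration of the NTK methodology the abstract describes (this is not the paper's implementation), the sketch below computes an empirical NTK for a toy single-hidden-layer ReLU model and a residual variant of it, then uses the resulting kernel matrix for kernel regression. The architectures, widths, initialization scaling, and the jitter term are illustrative assumptions.

```python
# Minimal sketch, assuming toy one-hidden-layer models (not the paper's exact ResNet).
import jax
import jax.numpy as jnp

def mlp_forward(params, x):
    # plain ReLU layer: f(x) = w2 . relu(W1 x)
    W1, w2 = params
    return jnp.dot(w2, jax.nn.relu(W1 @ x))

def resnet_forward(params, x):
    # residual variant: f(x) = w2 . (x + relu(W1 x)); W1 assumed square for the skip
    W1, w2 = params
    return jnp.dot(w2, x + jax.nn.relu(W1 @ x))

def empirical_ntk(forward, params, X):
    # Theta(x_i, x_j) = <d f(x_i)/d params, d f(x_j)/d params>
    def flat_grad(x):
        grads = jax.grad(forward)(params, x)  # gradient w.r.t. parameters
        return jnp.concatenate([g.ravel() for g in jax.tree_util.tree_leaves(grads)])
    G = jax.vmap(flat_grad)(X)                # (n, num_params)
    return G @ G.T                            # (n, n) kernel matrix

def ntk_regression(K_train, K_test_train, y_train, jitter=1e-6):
    # kernel regression with the NTK: f_hat(x*) = Theta(x*, X) Theta(X, X)^{-1} y
    alpha = jnp.linalg.solve(K_train + jitter * jnp.eye(K_train.shape[0]), y_train)
    return K_test_train @ alpha

# toy usage
d, n = 8, 16
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params = (jax.random.normal(k1, (d, d)) / jnp.sqrt(d),   # W1 (square for the skip)
          jax.random.normal(k2, (d,)) / jnp.sqrt(d))      # w2
X = jax.random.normal(k3, (n, d))
y = jnp.sin(X[:, 0])

K_res = empirical_ntk(resnet_forward, params, X)
print(ntk_regression(K_res, K_res, y)[:3])  # in-sample predictions with the residual NTK
```

Comparing the regression fits produced by the MLP kernel and the residual kernel on held-out points is one simple way to probe the smoothness difference the abstract refers to; the paper itself uses the infinite-width NTK and several evaluation methodologies rather than this finite-width approximation.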
