Title
A Note on the Global Convergence of Multilayer Neural Networks in the Mean Field Regime
Authors
Abstract
In a recent work, we introduced a rigorous framework to describe the mean field limit of the gradient-based learning dynamics of multilayer neural networks, based on the idea of a neuronal embedding. There we also proved a global convergence guarantee for three-layer (as well as two-layer) networks using this framework. In this companion note, we point out that the insights in our previous work can be readily extended to prove a global convergence guarantee for multilayer networks of any depth. Unlike our previous three-layer global convergence guarantee, which assumes i.i.d. initialization, our present result applies to a type of correlated initialization. This initialization allows a certain universal approximation property to be propagated through the depth of the neural network at any finite training time. To achieve this effect, we introduce a bidirectional diversity condition.
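For orientation, the mean field limit referenced in the abstract can be illustrated in the simplest two-layer case. The following is a standard formulation from the mean field literature, not necessarily the note's exact notation: as the width $n$ grows, the network output becomes an integral against a measure $\rho$ over neuron parameters $\theta = (a, w)$, and gradient-based training corresponds to a distributional (Wasserstein gradient flow) dynamics on $\rho$.
\[
  \hat f_n(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} a_i\,\sigma(\langle w_i, x\rangle)
  \;\xrightarrow[\;n\to\infty\;]{}\;
  \hat f(x;\rho) \;=\; \int a\,\sigma(\langle w, x\rangle)\,\rho(\mathrm{d}a,\mathrm{d}w),
\]
\[
  \partial_t \rho_t \;=\; \nabla_\theta \cdot \big( \rho_t \, \nabla_\theta \Psi(\theta;\rho_t) \big),
  \qquad \Psi(\theta;\rho) \;=\; \frac{\delta L}{\delta \rho}(\theta),
\]
where $L(\rho)$ is the population loss and $\Psi$ its first variation. In the multilayer setting treated by the note, a single measure $\rho$ no longer suffices; the authors' framework instead describes the limit via a neuronal embedding that couples the layers.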