Paper Title
On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics
Paper Authors
Paper Abstract
We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width. The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under bounds on the natural path-norm. Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable generalization properties. Functions in these spaces can be approximated by multi-layer neural networks with dimension-independent convergence rates. The key to this work is a new way of representing functions as expectations of a certain form, motivated by the structure of multi-layer neural networks. This representation allows us to define a new class of continuous models for machine learning. We show that the gradient flow defined this way is the natural continuous analog of the gradient descent dynamics for the associated multi-layer neural networks, and that the path-norm increases at most polynomially under this continuous gradient flow dynamics.
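For concreteness, here is a minimal sketch of the path-norm in its standard form, stated for a finite fully connected $L$-layer ReLU network $f(x) = W^L \sigma(W^{L-1} \cdots \sigma(W^1 x))$ without bias terms; the exact norm used in the paper may differ in such details:
\[
\|f\|_{\mathrm{path}} = \sum_{i_L, \dots, i_0} \bigl|W^L_{i_L i_{L-1}}\bigr| \, \bigl|W^{L-1}_{i_{L-1} i_{L-2}}\bigr| \cdots \bigl|W^1_{i_1 i_0}\bigr|,
\]
i.e. the sum over all input-to-output paths of the products of the absolute values of the weights along each path.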