Paper Title

The Role of Linear Layers in Nonlinear Interpolating Networks

Authors

Greg Ongie, Rebecca Willett

Abstract

This paper explores the implicit bias of overparameterized neural networks of depth greater than two layers. Our framework considers a family of networks of varying depth that all have the same capacity but different implicitly defined representation costs. The representation cost of a function induced by a neural network architecture is the minimum sum of squared weights needed for the network to represent the function; it reflects the function space bias associated with the architecture. Our results show that adding linear layers to a ReLU network yields a representation cost that reflects a complex interplay between the alignment and sparsity of ReLU units. Specifically, using a neural network to fit training data with minimum representation cost yields an interpolating function that is constant in directions perpendicular to a low-dimensional subspace on which a parsimonious interpolant exists.
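
As a rough formalization of the quantity described in the abstract (a sketch following its wording, with notation introduced here rather than taken from the paper itself), the representation cost of a function $f$ under a given architecture with parameters $\theta$ can be written as

$$
R(f) \;=\; \min_{\theta}\; \|\theta\|_2^2 \quad \text{subject to} \quad f_\theta = f,
$$

where $f_\theta$ denotes the function computed by the network with weights $\theta$, and the minimum is taken over all parameter settings for which the network exactly represents $f$.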
