Paper Title

Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time

Authors

Tolga Ergen, Mert Pilanci

Abstract

We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with a polynomial complexity with respect to the number of data samples, the number of neurons, and data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an $\ell_2$ norm regularized convex program. We then show that multi-layer circular CNN training problems with a single ReLU layer are equivalent to an $\ell_1$ regularized convex program that encourages sparsity in the spectral domain. We also extend these results to three-layer CNNs with two ReLU layers. Furthermore, we present extensions of our approach to different pooling methods, which elucidates the implicit architectural bias as convex regularizers.
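For orientation, the sketch below gives the flavor of the equivalence the abstract claims, in illustrative notation of our own (filters $u_j$, output weights $\alpha_j$, patch matrices $X_k$, weight decay $\beta$); the paper's exact problem statements and constraint sets differ in detail.

```latex
% Schematic sketch only; the notation is illustrative, not the paper's exact statement.
% Weight-decay-regularized training of a two-layer ReLU CNN with m filters u_j,
% output weights alpha_j, and patch matrices X_k extracted from the data:
\[
  \min_{\{u_j,\,\alpha_j\}_{j=1}^{m}}
    \frac{1}{2}\Big\| \sum_{j=1}^{m} \sum_{k} \big( X_k u_j \big)_+ \,\alpha_j - y \Big\|_2^2
    + \frac{\beta}{2} \sum_{j=1}^{m} \big( \|u_j\|_2^2 + \alpha_j^2 \big)
\]
% Although this objective is non-convex in (u_j, alpha_j), the paper shows that
% problems of this form admit equivalent convex programs of polynomial size:
% an l2-norm regularized program for two-layer CNNs, and, for circular multi-layer
% CNNs with a single ReLU layer, an l1-regularized program whose solutions are
% sparse in the spectral (Fourier) domain.
```

The regularizer that emerges on the convex side thus depends on the architecture, which is what the abstract means by implicit architectural bias acting as a convex regularizer.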
