Paper Title

Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time

Authors

Tolga Ergen, Mert Pilanci

Abstract

We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with a polynomial complexity with respect to the number of data samples, the number of neurons, and data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an $\ell_2$ norm regularized convex program. We then show that multi-layer circular CNN training problems with a single ReLU layer are equivalent to an $\ell_1$ regularized convex program that encourages sparsity in the spectral domain. We also extend these results to three-layer CNNs with two ReLU layers. Furthermore, we present extensions of our approach to different pooling methods, which elucidates the implicit architectural bias as convex regularizers.
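For orientation, the sketch below gives the flavor of the equivalence the abstract claims, in illustrative notation of our own (filters $u_j$, output weights $\alpha_j$, patch matrices $X_k$, weight decay $\beta$); the paper's exact problem statements and constraint sets differ in detail.

```latex
% Schematic sketch only; the notation is illustrative, not the paper's exact statement.
% Weight-decay-regularized training of a two-layer ReLU CNN with m filters u_j,
% output weights alpha_j, and patch matrices X_k extracted from the data:
\[
  \min_{\{u_j,\,\alpha_j\}_{j=1}^{m}}
    \frac{1}{2}\Big\| \sum_{j=1}^{m} \sum_{k} \big( X_k u_j \big)_+ \,\alpha_j - y \Big\|_2^2
    + \frac{\beta}{2} \sum_{j=1}^{m} \big( \|u_j\|_2^2 + \alpha_j^2 \big)
\]
% Although this objective is non-convex in (u_j, alpha_j), the paper shows that
% problems of this form admit equivalent convex programs of polynomial size:
% an l2-norm regularized program for two-layer CNNs, and, for circular multi-layer
% CNNs with a single ReLU layer, an l1-regularized program whose solutions are
% sparse in the spectral (Fourier) domain.
```

The regularizer that emerges on the convex side thus depends on the architecture, which is what the abstract means by implicit architectural bias acting as a convex regularizer.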
