Paper Title

Canonical convolutional neural networks

Paper Authors

Veeramacheneni, Lokesh; Wolter, Moritz; Klein, Reinhard; Garcke, Jochen

Abstract

We introduce canonical weight normalization for convolutional neural networks. Inspired by the canonical tensor decomposition, we express the weight tensors in so-called canonical networks as scaled sums of outer vector products. In particular, we train network weights in the decomposed form, where scale weights are optimized separately for each mode. Additionally, similar to weight normalization, we include a global scaling parameter. We study the initialization of the canonical form by running the power method and by drawing randomly from Gaussian or uniform distributions. Our results indicate that we can replace the power method with cheaper initializations drawn from standard distributions. The canonical re-parametrization leads to competitive normalization performance on the MNIST, CIFAR10, and SVHN data sets. Moreover, the formulation simplifies network compression. Once training has converged, the canonical form allows convenient model compression by truncating the parameter sums.
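To make the re-parametrization described in the abstract concrete, below is a minimal PyTorch sketch (not the authors' code) of a convolution layer whose kernel is stored in canonical (CP) form as a scaled sum of outer vector products with a global scale. The class name CanonicalConv2d, the rank argument, and the per-term scale vector lam are illustrative assumptions; in particular, the paper's per-mode scale weights are only approximated here by a single scale vector plus the global scale g.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CanonicalConv2d(nn.Module):
    """Conv layer whose kernel is kept in canonical (CP) form.

    The dense kernel W[o, i, h, w] is never stored; it is rebuilt on the fly as
        W = g * sum_r lam[r] * a[:, r] (x) b[:, r] (x) c[:, r] (x) d[:, r],
    where (x) denotes the outer product. Parameter names are illustrative.
    """
    def __init__(self, in_ch, out_ch, kernel_size, rank, padding=0):
        super().__init__()
        kh = kw = kernel_size
        self.padding = padding
        # One factor matrix per tensor mode, drawn from a standard Gaussian
        # (the cheap random initialization the abstract compares to the power method).
        self.a = nn.Parameter(torch.randn(out_ch, rank))
        self.b = nn.Parameter(torch.randn(in_ch, rank))
        self.c = nn.Parameter(torch.randn(kh, rank))
        self.d = nn.Parameter(torch.randn(kw, rank))
        self.lam = nn.Parameter(torch.ones(rank))   # per-term scale (assumption)
        self.g = nn.Parameter(torch.ones(()))       # global scaling parameter

    def weight(self):
        # Contract the shared rank index of all factor matrices into a dense kernel.
        w = torch.einsum('r,or,ir,hr,wr->oihw',
                         self.lam, self.a, self.b, self.c, self.d)
        return self.g * w

    def forward(self, x):
        return F.conv2d(x, self.weight(), padding=self.padding)

Under this reading, the compression step mentioned at the end of the abstract would amount to truncating the sum after training, e.g. keeping only the columns of a, b, c, d whose corresponding entries of lam have the largest magnitude.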
