Paper Title

Channel-Directed Gradients for Optimization of Convolutional Neural Networks

Authors

Dong Lao, Peihao Zhu, Peter Wonka, Ganesh Sundaramoorthi

Abstract

We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error. The method requires only simple processing of existing stochastic gradients, can be used in conjunction with any optimizer, and has only a linear overhead (in the number of parameters) compared to computation of the stochastic gradient. The method works by computing the gradient of the loss function with respect to output-channel directed re-weighted L2 or Sobolev metrics, which has the effect of smoothing components of the gradient across a certain direction of the parameter tensor. We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental. We present the continuum theory of such gradients, its discretization, and application to deep networks. Experiments on benchmark datasets, several networks and baseline optimizers show that optimizers can be improved in generalization error by simply computing the stochastic gradient with respect to output-channel directed metrics.
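
Below is a minimal, hypothetical numpy sketch of the general idea described in the abstract: re-weighting the stochastic gradient of a convolutional layer so that its components are smoothed along the output-channel axis, here via a discrete Sobolev (H1-style) operator. The function name `sobolev_smooth_along_output_channels`, the parameter `lam`, and the Neumann-boundary Laplacian are illustrative assumptions, not the authors' implementation; the paper's actual metric, discretization, and efficient solver are defined in the full text.

```python
import numpy as np

def sobolev_smooth_along_output_channels(grad, lam=1.0):
    # Hypothetical helper: smooth a conv-weight gradient of shape
    # (C_out, C_in, kH, kW) along its output-channel axis (axis 0)
    # by solving (I + lam * L) g_s = g, where L is a 1-D discrete
    # Laplacian with Neumann boundaries. This approximates taking
    # the gradient with respect to a Sobolev-style metric along that
    # axis; lam = 0 recovers the ordinary stochastic gradient.
    c_out = grad.shape[0]
    L = 2.0 * np.eye(c_out)
    L -= np.eye(c_out, k=1) + np.eye(c_out, k=-1)
    L[0, 0] = L[-1, -1] = 1.0  # zero-flux (Neumann) boundary rows
    A = np.eye(c_out) + lam * L
    flat = grad.reshape(c_out, -1)        # (C_out, C_in*kH*kW)
    smoothed = np.linalg.solve(A, flat)   # one small solve per layer
    return smoothed.reshape(grad.shape)

# Toy usage: a random "gradient" for a layer with 64 output channels,
# 32 input channels, and 3x3 kernels; the result can be fed to any optimizer.
g = np.random.randn(64, 32, 3, 3)
g_smoothed = sobolev_smooth_along_output_channels(g, lam=0.5)
print(g.shape, g_smoothed.shape)
```

The dense solve above is only for readability; since the system is tridiagonal, a tridiagonal (or FFT-based) solver would keep the per-step overhead linear in the number of parameters, consistent with the cost claim in the abstract.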
