Paper Title

Constraint-Based Regularization of Neural Networks

Paper Authors

Leimkuhler, Benedict, Pouchon, Timothée, Vlaar, Tiffany, Storkey, Amos

Paper Abstract

We propose a method for efficiently incorporating constraints into a stochastic gradient Langevin framework for the training of deep neural networks. Constraints allow direct control of the parameter space of the model. Appropriately designed, they reduce the vanishing/exploding gradient problem, control weight magnitudes and stabilize deep neural networks and thus improve the robustness of training algorithms and the generalization capabilities of the trained neural network. We present examples of constrained training methods motivated by orthogonality preservation for weight matrices and explicit weight normalizations. We describe the methods in the overdamped formulation of Langevin dynamics and the underdamped form, in which momenta help to improve sampling efficiency. The methods are explored in test examples in image classification and natural language processing.
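The abstract describes combining stochastic gradient Langevin dynamics with constraints on the parameter space (e.g., weight normalization). A minimal toy sketch of this idea, in its simplest overdamped form, is a Langevin step followed by a projection back onto the constraint manifold. This is an illustrative simplification, not the authors' actual constrained integrator; the function name, the unit-row-norm constraint, and the step-size/temperature values are all hypothetical choices for the example.

```python
import numpy as np

def sgld_step_with_norm_constraint(W, grad, step_size=1e-2, beta=1e4, rng=None):
    """One overdamped Langevin step on a weight matrix W, then a projection
    enforcing a unit-row-norm constraint ||w_i|| = 1 (a simple stand-in for
    the explicit weight normalizations mentioned in the abstract)."""
    rng = np.random.default_rng() if rng is None else rng
    # Overdamped Langevin update: gradient step plus scaled Gaussian noise.
    noise = rng.standard_normal(W.shape)
    W_new = W - step_size * grad + np.sqrt(2.0 * step_size / beta) * noise
    # Projection: rescale each row so the norm constraint holds exactly.
    norms = np.linalg.norm(W_new, axis=1, keepdims=True)
    return W_new / norms

# Usage with random weights and a dummy gradient.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
grad = rng.standard_normal((4, 3))
W = sgld_step_with_norm_constraint(W, grad, rng=rng)
print(np.linalg.norm(W, axis=1))  # each row has unit norm after projection
```

The underdamped variant described in the abstract would additionally carry momentum variables and require a constraint-consistent momentum update; the projection shown here only illustrates the overdamped case.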
